LoRaWAN: Retransmission back-off correction for re-join #9217
@hasnainvirk, thank you for your changes.
Previously, we initialized our time base in LoRaMac::initialize(), and unless the device was power-cycled, that initial time was always used to calculate the elapsed time. This introduced the bug reported in issue ARMmbed#8921: a device re-joining after a week or so used the wrong back-off, because we pass the elapsed time to the calculate_backoff() API. If we do not set the initial time when trying to connect, the elapsed time also includes the previous session, so the device thinks it has spent that much time joining and applies a harsher duty-cycle back-off.
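The effect described above can be illustrated with a minimal sketch. It is not the mbed-os implementation: `JoinBackoff`, `reset_time_base()`, `divisor_at()` and `backoff_divisor()` are hypothetical names, and the duty-cycle tiers (1% in the first hour, 0.1% for the next 10 hours, 0.01% afterwards) are assumed from the LoRaWAN 1.0.2 join back-off rules.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is a sketch, not the mbed-os LoRaMac code.
constexpr uint32_t kHourMs = 3600u * 1000u;

// Duty-cycle divisor for a given time elapsed since the time base.
// Tiers are an assumption based on the LoRaWAN 1.0.2 join back-off rules.
constexpr uint32_t backoff_divisor(uint64_t elapsed_ms)
{
    return elapsed_ms < kHourMs ? 100u            // first hour: 1%
         : elapsed_ms < 11ull * kHourMs ? 1000u   // next 10 hours: 0.1%
         : 10000u;                                // afterwards: 0.01%
}

struct JoinBackoff {
    uint64_t time_base_ms = 0;

    // If this is only called once at initialization, the elapsed time keeps
    // growing across sessions. Calling it again when a connect starts
    // restarts the elapsed-time measurement.
    void reset_time_base(uint64_t now_ms) { time_base_ms = now_ms; }

    uint32_t divisor_at(uint64_t now_ms) const
    {
        assert(now_ms >= time_base_ms);
        return backoff_divisor(now_ms - time_base_ms);
    }
};
```

With the time base set only once, a re-join one week later falls into the harshest tier. Resetting the time base at the start of the connect puts the same re-join back into the first-hour tier.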
Hi @hasnainvirk, Happy New Year! :-) AFAICT, what you've done will start the timer every time a join sequence is started. But LoRaWAN spec V1.0.2 says:
(My emphasis.) It was my assumption that

Are you sure your change is consistent with the spec authors' intentions? I think what you've done means there's the possibility that a number of end-devices will all start using the minimum back-off for join requests after some system-wide event. My interpretation of the spec is that a reduced back-off was required for the retransmissions based on the time since power-on or reset.

IMHO, what you've done is fine. I think there are pros and cons with both approaches, and any given approach is likely to work in some situations and not others. Hence analysing it too much is a waste of time but, on the other hand, we want to be compliant with the spec!
Nice catch!
@hasnainvirk Will start CI once @mattbrown015's comment is resolved.
@mattbrown015 You are right once again :) 🥇
It seems we're stuck trying to interpret exactly what the original spec authors meant. I definitely don't know; I'm relatively new to LoRaWAN and I am on my own here! Like I said before, I think we can probably come up with scenarios where either approach works best. I can understand your point that we should be careful not to take 'retransmission' too literally.

I've no experience of large deployments and catastrophic events. I do know that in a small deployment one device can occasionally, possibly erroneously, decide to re-join, in which case applying the long back-off is a real pain. Perhaps a temporary object gets in the RF path and a few link checks get missed by 1 device.

Allowing 1 transmission with normal back-off could work in some situations. Any device deciding to re-join gets 1 go with the normal back-off. In a large deployment there may be a large number of initial transmissions, but then the extended back-off will get applied and prevent an RF storm. If it is only one device applying the extended back-off it is a pain, but at least it got one go relatively quickly.

One thing that worries me with the extended back-off: it's important that the back-off is randomised. If the 1000 devices all apply the extended back-off and then transmit together, we've waited all that time and achieved nothing!
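The randomisation concern can be sketched as follows. This is only an illustration under stated assumptions: `jittered_backoff_ms()` and its `window_ms` parameter are hypothetical and are not part of the mbed-os stack. The idea is to add a uniformly distributed delay on top of the computed back-off so that devices sharing the same back-off do not all transmit at the same instant.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>

// Hypothetical helper: returns a delay in [base_ms, base_ms + window_ms].
// Each device draws its own jitter, which spreads retries over the window.
template <typename Rng>
uint32_t jittered_backoff_ms(uint32_t base_ms, uint32_t window_ms, Rng &rng)
{
    // Guard against overflow of the 32-bit sum.
    assert(base_ms <= std::numeric_limits<uint32_t>::max() - window_ms);
    std::uniform_int_distribution<uint32_t> jitter(0, window_ms);
    return base_ms + jitter(rng);
}
```

On a real device the generator would need a per-device seed (for example from a hardware entropy source or radio noise); if every device seeds identically, the jitter sequences coincide and the retries still collide.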
@mattbrown015 Yeah, the back-off is randomized at least for the Ack Timeout, but not for the joining process. I have taken a note of this and we will do something about it ASAP. By the way, in v1.1 rejoin is a proper feature and there are 3 types of it. There can be network-initiated forced rejoin requests and there can be periodic rejoins as well. In v1.0.2 there is no concept of rejoin as such. However, you could have logic based on LINK_CHECK_REQUEST as you described.
Closing this PR in favour of #9251
Reviewers
@AnttiKauppila
@kjbracey-arm