-
Notifications
You must be signed in to change notification settings - Fork 3k
LWIP system mailbox overflow fix #11976
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@pstolarz, thank you for your changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, @kjbracey-arm please review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I'm suggesting a slight simplification on the maths.
(Looking at the surrounding EventFlags
code, I do wonder why is this doing it the hard way. Isn't this the canonical case of when a Semaphore
would actually be a good idea? I normally loathe semaphores, cos I hardly ever want to count things, but here we do.)
As a side note (improvement).
Since the current implementation has a proper code, which guards the mailbox against fetching from an empty mailbox and posting to the full one, basing on the above flags, the only change required to implement this improvement is to update the post/trypost anti-overflow check to: if (mbox->post_idx == mbox->fetch_idx)
osEventFlagsClear(mbox->id, SYS_MBOX_POST_EVENT); So now, post/trypost check is fully symmetric to fetch/tryfetch check against empty mailbox condition. That's all. EDIT: |
In a circular buffer scenario, yes, you can add a bunch of logic to ensure that you can cope with a totally-full buffer. But the logic is generally going to be bigger than just increasing the buffer by one entry. Unless your buffer entries are unusually large, at least. I'd consider adding logic if these were 1.5K Ethernet frame buffers, but I don't think it's worth worrying about it for 4-byte Although, as you note, we already have a bunch of event flags complexity there so we've possibly already paid the cost. If you want to submit it as a separate PR, we'll consider it, but I'd need to stare at it quite a while. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I note that astyle isn't caring about the formatting here - we'd normally want more braces. I guess this is being treated as "lwIP" code, so using lwIP style is fine.
It was just proposal - do whatever you want with it.
I don't agree. It's not a matter 1 extra But it was just side note. I found more fundamental issue related to LWIP mailbox sizes. I'll raise issue for it soon |
Hmm, yes, despite our docs saying nothing, looking at the lwIP headers, they do clearly say that the (But then with either of those changes I'd probably reduce the config settings down to 7 to compensate, as we don't actually need to store an extra entry, afaik. If it works fine as it is...) |
CI started |
Test run: FAILEDSummary: 1 of 11 test jobs failed Failed test jobs:
|
jenkins-ci/mbed-os-ci_dynamic-memory-usage restarted |
LWIP system mailbox overflow fix (cherry picked from commit 9248169)
Description
This is a bug fix for the LWIP mbed-os system mailbox overflow. The bug reveals during heavy network loads when a device is not able to handle all incoming network traffic in time, therefore the internal network buffers (including LWIP mailboxes and memory pools) work at their maximum capacities.
The bug was detected on K64F platform working as an UDP server flooded by large number of UDP packets for a long period of time. Description of encountered problems may be found here #11681.
Summary of change
Proposed patch constrains LWIP mbed-os system mailbox's post/fetch indexes value range from 0 to
MB_SIZE-1
(that is modMB_SIZE
integers). In the current implementation indexes are not constrained and wraps up at their maximum integer type value (255 foruint8_t
) causing mailbox corruption.Bug root cause
sys_mbox_trypost()
andsys_mbox_post()
defined infeatures/lwipstack/lwip-sys/arch/lwip_sys_arch.c
have a common code as follows:Both
post_idx
andfetch_idx
areuint8_t
defined insys_mbox_t
(filefeatures/lwipstack/lwip-sys/arch/sys_arch.h
). Conditional check in line 7 protects the mailbox against overflow (anti-overflow check).Issue 1
Let
MB_SIZE == 8
(default mailbox size value),post_idx == 255
,fetch_idx == 249
, thenpost_idx - fetch_idx == 6
. So there is possible to post only 1 more message before entirely filling up the mailbox. But, after posting the message we havepost_idx == 0
(uint8_t
wraps), sopost_idx - fetch_idx == -249 (0xffffff07)
(promotion fromuint8_t
toint
in the anti-overflow check). This causes the anti-overflow check to returnfalse
and the mailbox will be filled with newly posted messages causing mailbox corruption (and memory leaks on the underlying memory pool where the messages are stored).Issue 2
It will not happen for mailboxes where
255 mod MB_SIZE == MB_SIZE-1
, so will not reveal ifMB_SIZE
is 8, 16, 32, 64.... Therefore the default mailbox (size 8) is safe regarding this issue.Let
MB_SIZE == 10
,post_idx == 255
,fetch_idx == 250
. The mailbox looks like:After posting a new message,
post_idx == 0
(uint8_t
wraps):Next messages posted on the mailbox will overwrite the ones already stored in the mailbox (slots 0-5) causing mailbox corruption.
Pull request type
Test results