-
Notifications
You must be signed in to change notification settings - Fork 3k
RTOS: Fix semaphore #4579
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RTOS: Fix semaphore #4579
Conversation
rtos/Semaphore.cpp
Outdated
@@ -27,7 +27,7 @@ | |||
namespace rtos { | |||
|
|||
Semaphore::Semaphore(int32_t count) { | |||
constructor(count, 1024); | |||
constructor(count, UINT16_MAX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot your include of stdint.h
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good catch 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, but I just translated the CI failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually including stdint.h didn't work. UINT16_MAX isn't defined until c++11/c99. I've just updated to a hex literal.
Before rtx 5, the max count on semaphores was UINT16_MAX, aftewards it was decreased to 1024 with an assert on overflow. This is especially problematic for semaphores used for signaling, since there is no replacement currently available in C++.
da09aec
to
6b02cea
Compare
@geky Please capitalize "RTOS" and change "Unbreak" to "Fix" or "Improve" in your subject line. Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the submission that will help a lot to have the previous behavior back.
Once done, please start morph CI |
@geky: With this change, i see issue in future with current socket implementation. Semaphores are used for signalling and with every release we allow count to go high till max, but in case of connectivity error (or some bug) receive won't wait for any release signal and continue processing till current count > 0. This means we are pushing bug/connectivity issue to MAX count. You are anyways removing assertion from overflow, what is the need to increase overflow count till max? |
@deepikabhavnani, that's a good thought, and I'm still heavily in favor of #4580. But even with these slow microcontrollers software-only oprations are very fast. The time it takes to decrement the semaphore all the way down from max count is trivial compare to how long a network operation is. You're right we haven't seen a reason to increase the semaphore back up to 65536, but I haven't seen a reason we're at the current value of 1024, and 65536 is at least backwards compatible. |
@geky: requirement for queue and sockets is to have binary semaphore, shouldn't we set max count argument as 1 for them (not default). Other alternative would be to check the count before performing release and make sure count does not go beyond 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with this - for the reasons you've explained we've lots of existing semaphore users who could tend to saturate, and we've not had time to migrate them to anything better.
At some point, allowing it to assert again might be appropriate, once we've got C++ EventFlags (and maybe ConditionVariable?) for these sorts of users migrate to.
@tommikas Can you please restart CI , there are at least two jobs from Friday with no update. I cancelled some, please check if I was able to or restart them please. This one seems to be started 30 mins ago, after the weekend. Did you restart CI ? Should we just wait or it wont update ? Thanks |
@0xc0170 There's been some trouble with Jenkins, requiring at least one restart on friday. That caused a build to hang, blocking all the builds over the weekend. The blockage has been cleared and builds are running now but there's quite a bit of a queue built up. The ones you restarted seem to be running too, it'll just take Jenkins some time to get to them. |
/morph test-nightly |
Fixes #4584 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Result: SUCCESSYour command has finished executing! Here's what you wrote!
OutputAll builds and test passed! |
Before RTX 5, the semaphore count limit was UINT16_MAX as a side effect of using a uint16_t (here). After #4294 was merged, the count limit was dropped to the arbitrary value 1024 (here). Additionally, #4389 added an assert that would halt if the semaphore count exceeded this limit (here and here).
This is especially problematic for applications that are using the semaphore for signaling. In RTX 4, the semaphore was the only library-friendly mechanism available for signaling, and in RTX 5 it is still the only option for the C++ layer.
This has already broken several higher-level apis, and we haven't even made the 5.5 release yet:
If we're already having breakages internally, I'm not really looking forward to next week when the release is out.
On top of the breakage, we don't really have a path for users. In the C++ api, a semaphore is the only library friendly signaling mechanism. RTX has thread signals, but these are limited to a single bit-field that is shared with users. @YarivCol is working on a wrapper for the osEventFlags #4517, but it's not ready to merge yet.
Lets look at how other libraries handle semaphore overflow:
By changing the semaphores behaviour and ruling out the use of semaphores for signaling, we're challenging the user's expectations rather than providing a tool they are familiar with.
This patch removes the assertion on semaphore overflow and instead relies on the semaphore saturation introduced in RTX 5. This should match how a user expects a semaphore to behave and returns the ability to use semaphores for signaling.
related issues #4580, #4575, #4571, #4560, #4517, #4500, #4492, #4449
cc @sg-, @c1728p9, @deepikabhavnani, @pan-, @kjbracey-arm