-
Notifications
You must be signed in to change notification settings - Fork 3k
[rtos] Fix irq-safe rtx calls when inside critical sections #2391
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Looks good to me |
Hold off on the bot for just a bit, making a couple tweaks right now. I'll start it when I'm done |
/morph test |
Result: FAILUREYour command has finished executing! Here's what you wrote!
Outputmbed Build Number: 612 Build failed! |
9ad767b
to
08ca38a
Compare
/morph test |
Result: FAILUREYour command has finished executing! Here's what you wrote!
Outputmbed Build Number: 613 Test failed! |
This is the limitation if RTX (not using any RTX API with interrupts disabled) - reported here - #1841 (there's a comment about unprivileged mode, something to consider?) Taken from the RTX sources:
The example you provided looks fine as it does not have any functional change, but lets replace semaphore by mutex. With this fix, the unlock would return ISR error status even in the thread mode. I am not fully certain this is something we could fix, however having hard fault for use cases like this is not user friendly. @RobertRostohar - please have a look at this change-set, what do you think? |
@0xc0170, this pr isn't attempting to change the rtos behaviour in interrupt contexts (maybe I should rename this pr), but rather changes the rtos to treat critical sections where interrupts are disabled as though they are interrupt contexts. Mutexes should still error in critical sections, since it's a context where the thread can't yield, and the processor would deadlock. However irq-safe functions, such as the semaphore release, are perfectly valid. |
08ca38a
to
9fec1c4
Compare
/morph test |
Result: SUCCESSYour command has finished executing! Here's what you wrote!
Outputmbed Build Number: 621 All builds and test passed! |
@RobertRostohar @ReinhardKeil Any concerns for this changeset ? Shall this be also sent upstream to the RTX kernel? |
I'm not thrilled about this solution and would not go this way. The proposed changes make function calls from threads with disabled interrupts to be treated by RTX as if they were called from ISR. Therefore the functions will either return an error indicating that they cannot be executed from ISR which is not compliant with the API. Or they will be executed as if they were called from ISR. This will bypass the problematic SVC but can have also other unwanted side effects. Some of the functions perform slightly different operations when called from thread or ISR. Some of the functions also assume that they are executed in Privileged mode (achieved by SVC) which might lead to other issues. Because of all this I'm not in favour of this approach, especially not since it is applied to all functions. I would not consider this to go into upstream RTX or RTX5 and would discourage using it in mbed also. What is the problem that you are actually trying to solve? The initial problem description was that printf cannot be used inside critical sections. That was due to interrupts being disabled when entering a critical section and RTX functions (implicitly called by printf) are not allowed to be called when interrupts are disabled. Critical section is typically implemented by disabling interrupts (global or partially). However it is also common that during critical sections RTOS functionality is limited. RTX currently does not provide directly support for critical sections but can co-exist with above mentioned implementation of critical sections as long as RTOS functions are not called inside critical sections. Even FreeRTOS which explicitly includes support for critical sections (enter/exit functions) states that no other RTOS function is allowed to be called inside critical section. Is it only printf (or other Clib functions) that needs to be called from critical sections? In that case only the Clib mutli-threading interface functions that RTX implements need to be changed. |
Consider an irq-friendly data structure that signals users on state-changes. A flexible implementation may take a callback with context to call on state-changes, leaving it up to the user how to defer from interrupt context if needed. The state associated with the user-provided callback must be synchronized from both foreground and interrupt contexts, leaving a critical section as the obvious solution. This results in the user-provided callback being called from a critical-section, but given that the callback may already by called from an interrupt context, I would expect no problems (and there are none with this patch). However, rtx does not understand the critical section and hardfaults even in irq-safe functions. This is especially problematic since it prohibits signalling a semaphore, which is one of the most common methods of deferring execution from interrupt contexts. The specific case where this issue became problematic would be in the mbed-events library, where hardware/software timers need to be updated in foreground/irq contexts. This is also a problem for callback APIs with multiple implementations, such as the network-socket API and the mbed hal. Alternatively, I did manage to stitch together a workaround using an RCU lock, however that requires irq-safe dynamic memory or finite simultaneous updates, which is a whole other issue.
This is not true. It is poorly documented on the FreeRTOS site, but the Another example is the Linux kernel, which allows semaphore signalling in any context, including critical sections with disabled preemption: http://lxr.free-electrons.com/source/kernel/locking/semaphore.c#L171
Although critical-sections are poorly defined in the cmsis specification, I would argue returning osErrorISR is better than non-descriptively hard-faulting. You make a good point about privileged mode. If an rtx operation requires hard-mode, this patch retains the original behaviour, which is to hard-fault. |
I believe @RobertRostohar meant that within critical section there should be no FreeRTOS API call, "FreeRTOS API functions must not be called from within a critical section. ". taken from: http://www.freertos.org/taskENTER_CRITICAL_taskEXIT_CRITICAL.html. I assume this is a general rule, and some
As there was a valid objections to the return code, shall we consider a new error status for not allowed with interrupts disabled ?
@RobertRostohar Any other proposals for this use case? What functions are missing? What side affects ? How we can address them? Hard fault is not ideal, thus I would go with a new error code, fix comments (ISR or privilged , add there + interrupt disabled). If there's anything else we have missed, let us know. |
Any updates here? |
@sg- This would require a bit of work if we wanted to adopt the |
Despite that we have not really heard about user complaints not being able to use RTX functions while interrupts are disabled, I do recognize that there are use cases (also mentioned in this thread) where such functionality is needed. Therefore we should make this possible but only for functions where this makes sense (avoiding overhead for all functions and potential side effects). We should identify those functions which will probably be only those that can be called from IRQ: osSignalSet/Clear, osSemaphoreRelease, osMessagePut/Get. I have analyzed the impact of privileged/unprivileged code on those functions and the initially proposed changes should not be an issue from that perspective (see below). We do plan to add functions for Critical sections to CMSIS RTOS API2 and RTX5 at some point and also address privileged/unprivileged threads. Currently we will be focusing on ARMv8-M and TrustZone which introduces even more complexity. At this point Critical sections can be externally implemented by disabling all interrupts (PRIMASK) or partially (BASEPRI). However this can be only done from code executing in privileged mode. Therefore it is safe to execute RTX isr* functions which require privileged access and blocked interrupts used by RTX. Therefore the proposed changes (slightly modified to cover BASEPRI) should be ok for the limited set of functions. There is also no need for new error codes in this case. Example:
|
9fec1c4
to
e41b129
Compare
Hi @RobertRostohar, will be looking forward to critical sections in RTX5. Thanks for looking a bit into the impact of critical sections on the rtos. I have updated this patch to also check the BASEPRI register as you suggested. Let us know if you see any concerns with this implementation. |
As mentioned in my previous comment I would not add this for all functions but rather just for a few relevant ones. Otherwise I don’t have any concerns anymore (even if it is applied to all functions). Anyway we plan to address this in RTX5 in the future together with critical sections (also document which functions may be called when in critical section). |
@geky were you going to update for specific functions? |
Sorry, @RobertRostohar, the only changes to the other functions should be to consistently return an error if called in a critical section (correct me if I missed something). Do you want the erroring checks to be removed to avoid possible regressions? I can replace the errors with asserts or just remove them completely. Let me know which you prefer. |
@mbed-bot: TEST HOST_OSES=ALL |
/morph test |
Result: FAILUREYour command has finished executing! Here's what you wrote!
Outputmbed Build Number: 809 Build failed! |
[Build 890] |
@geky Have a look at CI results. There are a few error such as this. Looks like all M0 targets are affected
|
@GKEY, I was trying to avoid additional overhead to all functions just to report errors when called from critical section (and the error code is anyway not consistent as discussed earlier). Let’s keep the proposed solution for the time being as it is in the PR. In the long term with CMSIS RTOS API2 and RTX5 we should:
|
e41b129
to
4336283
Compare
I've updated the pr to remove BASEPRI for now, since the register is absent on M0 devices and unused in the mbed OS codebase. Let us know if anyone sees an immediately important reason to accommodate the BASEPRI register. |
Before, rtx calls would hard fault in critical sections when an svc instruction was attempted with interrupts disabled. Required changes: - Added check for CPSR I bit in cortex A rtx - Added check for PRIMASK in cortex M rtx - Modified critical sections in cortex M rtx to be recursive (already recursive in cortex A)
4336283
to
3e7b5ed
Compare
/morph test |
Result: SUCCESSYour command has finished executing! Here's what you wrote!
Outputmbed Build Number: 931 All builds and test passed! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Before, rtx calls would hard fault in critical sections when an svc instruction was attempted with interrupts disabled.
Code exhibiting the issue:
Required changes:
(already recursive in Cortex-A)
cc @c1728p9, @0xc0170