-
Notifications
You must be signed in to change notification settings - Fork 3k
fix nordic critical section #5595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix nordic critical section #5595
Conversation
2f12d7e
to
0bfaaa8
Compare
@pan- @anangl @anangl Can you review this? |
|
||
static uint8_t _sd_state = 0; | ||
static volatile uint32_t _entry_count = 0; | ||
uint8_t nordic_cr_nested = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this for the local scope only - static
?
signals are lost. It was observed at last for tests-api-spi ci-test-shield's test. This patch introduce usage of sdk5 origin implementation in which sd_nvic_critical_region_enter/exit is calling each time critical region is enter/exit. This fixes the issue.
0bfaaa8
to
b084812
Compare
@nvlsianpu Could you explain what was the root cause ? |
Good point, I would like to suggest to have this part of the commit - why and how the commit fixes the bug. Please note that first line should be <=50 characters |
@nvlsianpu with this change running against https://github.com/ARMmbed/mbed-os-example-spiflash-driver it does fix the problem where the board would get stuck in an infinite wait/sleep state accessing the SPI storage using the NRF52840. However, it also comes across a new assertion that I hadn't seen before. Tested with this PR and #5606 rebased with master. spif test |
@dlfryar I remember that I was observing similar assertion while I was debugging: when I had stooped the execution and resumed. My explanation is that: with the tickles enabled, nRF52840 uses RTC as clock source so clock is running even if the core stal in breakpoint - then expected tick might be deferent from those fetched after resume. Is that what you observed? |
@nvlsianpu this same assertion has been seen in other testing apart from the example. In both cases they were simple applications executing with no debugger attached. Is there a known fix to resolve the assertion? |
|
||
void core_util_critical_section_enter() | ||
{ | ||
// if a critical section has already been entered, just update the counter | ||
if (_entry_count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The one of possibles scenerio for 'previous code 'is:
In routine A _enter_count check could had been positive, then execution of routine can been interrupted by routine B which would call core_util_critical_section_exit. In result after routine A executed core_util_critical_section_enter the critical section is not enabled - and the _entry_count value is equal to 1.
So _entry_count check-modify must be a atomic operation in order to mitigate this and similar scenarios.
This Is exactly how app_util_critical_region_ente call work.
@pan- , @0xc0170
This is answer to:
@nvlsianpu Could you explain what was the root cause ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining the bug it completely make sense.
/morph build |
Build : SUCCESSBuild number : 756 Triggering tests/morph test |
@0xc0170 hold on this, I discover serious problem with this PR, will fix it soon. |
Exporter Build : SUCCESSBuild number : 409 |
looks like all mbed-os tests pass so fare. |
Unfortunatelly DUT-DK stoped reponding to mbedls (It hapens sometime :( ) I will put tests results after NY. |
Test : FAILUREBuild number : 587 |
mbed-os tests results - all passed:
|
@0xc0170 #5595 (comment) : Nordic target were not included as DUT. |
/morph build |
Build : SUCCESSBuild number : 781 Triggering tests/morph test |
Exporter Build : SUCCESSBuild number : 434 |
Test : SUCCESSBuild number : 616 |
nordic_cr_nested--; | ||
|
||
if (nordic_cr_nested == 0) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you fix the code style in this function - {
here on the same line as if
,
#ifdef on line 96 at the line beginning (also 69)
line 100 , same as this line
line 78 also
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed by push --force
Thanks for sharing the tests results. Apart from style changes request, LGTM |
which can be called from diferent contexts. Orginal nordic critical primitives must been called in pairs from exacly the same function. As mbed hal call it in separate methods, so they are not suitable here.
6f07fe8
to
ae41c7d
Compare
/morph build |
Build : SUCCESSBuild number : 787 Triggering tests/morph test |
Exporter Build : FAILUREBuild number : 441 |
Node failure, restarting /morph export-build |
Exporter Build : SUCCESSBuild number : 448 |
Test : SUCCESSBuild number : 622 |
@scartmell-arm This fix gets will need to be propagated to the critical section update |
Reference: #5346 (this PR will require an update) |
current critical section implementation makes possible that interrupt
signals are lost.
This patch introduce usage of sdk5 origin implementation in which
sd_nvic_critical_region_enter/exit is calling each time critical region
is enter/exit. This fixes the issue.