Skip to content

Update cmsis/rtx to version 5.3 #6273

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
May 24, 2018
Merged

Conversation

bulislaw
Copy link
Member

@bulislaw bulislaw commented Mar 5, 2018

Description

Update CMSIS and RTX to version 5.3.

Pull request type

  • Fix
  • Refactor
  • New target
  • Feature
  • Breaking change

@deepikabhavnani could you have a look at the v8 integration.

@@ -97,8 +91,7 @@ static void default_idle_hook(void)
{
uint32_t elapsed_ticks = 0;

core_util_critical_section_enter();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@c1728p9 we need to review this critical section. RTX5.3 marks the svcRtxKernelSuspend/Resume as static so we can't use it here and I'd rather not modify RTX again. Can you remember why are we entering the critical section before suspending the kernel?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general problem is that the check that it is safe to enter sleep and the act of entering sleep need to be done atomically. If this is not done an interrupt could change state by waking a thread before WFI or WFE has been called.

This problem is visible this Tick-less Low-Power Operation code:

void osRtxIdleThread (void) {
    for (;;) {
        tc_wakeup = osKernelSuspend();
        /* Is there some time to sleep? */                                            
        if (tc_wakeup > 0) {
        tc = 0;
        /* Enter the low power state */
        MSP432_LP_Entry();
                                                //<- If an interrupt signals a thread to wake up at this point will the call to WFE() won't wake because of it. There is a thread that should be running now but won’t be able to until tc_wakeup ticks have passed.
        __WFE();
        }
        /* Adjust the kernel ticks with the amount of ticks slept */
        osKernelResume (tc);
    }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JonatanAntoni I don't think it's currently possible to protect against this race using RTX APIs, do you guys have a way of dealing with it?

Copy link
Member

@JonatanAntoni JonatanAntoni Mar 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__WFE should not have a race here, only __WFI would be problematic to my knowledge.
To my understanding each interrupt that is enabled to set the event register should do so. __WFE returns immediately if the event register is already set. Otherwise it puts the device to sleep until the event flag is set by any active event source. __WFE clears the event register on return.

For instance refer to http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0553a/CHDEAAHJ.html

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good to me, I had a look at the Architecture Reference and couple of manuals for specific designs from partners and it seems that no one does anything weird. Anyone against replacing WFI with WFE? @c1728p9 @0xc0170

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That solution sounds good to me as well as long as we don't have any devices with the errata mentioned here. I was under the impression that __WFE required any interrupt that could wake the system to call __SEV, which is not the case. This should be set by hardware automatically and its just errata that caused it not to be set for some devices.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets change it then and resolve this internal API usage

@bulislaw
Copy link
Member Author

Updated to use WFE

@bulislaw
Copy link
Member Author

@c1728p9 @0xc0170 I've replaced the WFI with WFE now.

Copy link
Contributor

@0xc0170 0xc0170 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine

I checked the wfe changes, found that few drivers are used in our codebase (I identified silicon labs, freescale, maxim, nuvoton - they invoke their driver sleep functions that use WFI by default, I could not locate WFE counterparts, thus have to change their driver code or implement sleep API without driver invocation).

@@ -7,11 +7,11 @@
* $Rev: 0.1 $
* $Date: 01-21-2016 $
******************************************************************************
* Copyright 2016 Semiconductor Components Industries LLC (d/b/a �ON Semiconductor�).
* Copyright 2016 Semiconductor Components Industries LLC (d/b/a �ON Semiconductor�).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these changed? github does not show the characters properly now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, must be some encoding issue with the file or they are not " but some weird unicode and my editor went crazy seeing that. I'll revert that.

@kjbracey
Copy link
Contributor

kjbracey commented Mar 21, 2018

I recently posted a bit of a "WFI/WFE for dummies" here:

https://os.mbed.com/forum/mbed/topic/29547/?page=1#comment-55419

The normal thing for an idle/sleeping OS to do is disable interrupts, shut down hardware, WFI, restart hardware, enable interrupts. And that's a privileged operation. You must disable interrupts around WFI. It's an OS instruction, and the interrupt disable makes the entire thing atomic, including whatever special hardware shutdown you're doing.

WFE is an unprivileged instruction for applications, who can't disable interrupts and just need to wait for something like a spinlock to be released, but don't want to yield back to the OS. I don't think it's what you should be doing here.

Copy link
Contributor

@kjbracey kjbracey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced the WFE change is conceptually right.

@kjbracey
Copy link
Contributor

As I've been discussing this sort of thing with @c1728p9 - the logic of WFI is similar to that of a condition variable. The condition variable wait assumes you must be holding a mutex to be inspecting the condition you're monitoring without a race, so incorporates that mutex to make it atomic.

Similarly WFI assumes you must be in a critical section to be doing hardware shutdown without a race, and thus is designed to wake for an interrupt source even though you're in that critical section.

It's not clear to me how removing the critical section you have is safe.

@kjbracey
Copy link
Contributor

This change also effectively changes the unspecified API of sleep() by no longer calling it from a critical section.

Stared at the RTX implementation a bit more - the suspend call to RTX is notionally atomic, and designed to work when called from normal thread context. After exit, the OS is suspended, with its timers stopped.

But interrupts are still enabled so device drivers could be queuing stuff for RTX post-processing - this could potentially quite rapidly hit ISR queue overflow, as the post-processing is suspended.

And this post-processing is a race - if an interrupt decides to release a semaphore after osKernelSuspend, osRtxPostProcess will set osRtxInfo.kernel.pendSV. We need to make sure we don't sleep if that happens. Now, using WFE could maybe avoid that - the interrupt setting pendSV would set the event flag. But the catch is that WFE must be used with interrupts enabled (always on ARMv7A, and unless a special config flag is set on ARMv7M) - a masked interrupt doesn't cause an event. If a hardware shutdown is (as seems likely) wants to shut off interrupts as part of the hardware shutdown, they can't portably use WFE.

To make this atomic and not change sleep semantics, I would suggest:

osKernelSuspend();              // outside critical section only because we're forced to be
critical_section_enter();       // to stop any more device activity
if (!osRtxInfo.kernel.pendSV) { // check devices didn't run any interrupt handlers kicking the OS between suspend and critical section
    sleep();                    // uses WFI - will wake as soon as any pending interrupts
}
critical_section_exit();        // lets pending interrupt handlers run (adding to ISR queue again!)
osKernelResume();               // off we go - finally empties the ISR queue

@c1728p9
Copy link
Contributor

c1728p9 commented Apr 3, 2018

It may be safest to stick with WFI for now, as it doesn't require changing all the existing ports. I have a patch which updates the idle loop as @kjbracey-arm suggested in #6534.

@cmonr
Copy link
Contributor

cmonr commented Apr 16, 2018

@bulislaw Are there any updates on your side with this PR?

@bulislaw
Copy link
Member Author

Yes, once i find some time ;)

@bulislaw bulislaw changed the title [WIP] Update cmsis/rtx to version 5.3 Update cmsis/rtx to version 5.3 Apr 24, 2018
@bulislaw
Copy link
Member Author

Updates:

  • Rebased on master
  • WFE removed and the idle loop updated as suggested by @kjbracey-arm

@c1728p9 @0xc0170 @kjbracey-arm please review

kjbracey
kjbracey previously approved these changes Apr 24, 2018
core_util_critical_section_exit();

// Ensure interrupts get a chance to fire
__ISB();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this ISB is overkill - we have SVC calls to osKernelResume and osKernelSuspend before our next disable, and SVC calls will allow interrupts to trigger according to the ARM ARM. (But then again, is it defined behaviour that these are SVC calls?)

core_util_critical_section_enter();
uint32_t ticks_to_sleep = svcRtxKernelSuspend();
if (ticks_to_sleep) {
if (!osRtxInfo.kernel.pendSV && ticks_to_sleep) {
os_timer->schedule_tick(ticks_to_sleep);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SysTimer (os_timer) was serialized to a critical section, but no longer is with this change. You'll either need to update SysTimer to add the additional synchronization or wait for #6534 to get merged.

@bulislaw
Copy link
Member Author

Rebased on top of #6534

@@ -79,7 +87,7 @@ SVC_Context
CBZ R1,SVC_ContextSwitch ; Branch if running thread is deleted

SVC_ContextSave
IF __DOMAIN_NS = 1
IF DOMAIN_NS = 1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flag name is changed by CMSIS, will have to check Mbed OS and change the flag name everywhere.

Copy link
Member

@JonatanAntoni JonatanAntoni Apr 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for that inconvenience. The double underscore prefix is "reserved" for compiler intrinsic. Hence we removed this prefix to clarify that DOMAIN_NS is a user define and not a compiler one.

There are two "versions" of the IRQ assembly file for Armv8-M: irq_armv8mbl.s and irq_armv8mbl_ns.s. The former one contains the actual implementation. The second one just sets this define and includes the former one. There should be no need to know about this define anywhere else than locally within those two files.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locally within this file (at the top) we have added pre-preprocessor directives, as Mbed OS does not support multiple assembly files. Those are not updated in this PR.
Also build tools enable this flag when building for NS core.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Will change that.

@bulislaw bulislaw force-pushed the update_cmsis_5.3 branch 2 times, most recently from b1c9e4f to 4315e11 Compare April 26, 2018 14:35
@deepikabhavnani
Copy link

@bulislaw - New PR which might have small clash here is in "ready to merge" stage. https://github.com/ARMmbed/mbed-os/pull/6747/files

This PR might need a rebase.

@bulislaw
Copy link
Member Author

/morph build

@orenc17
Copy link
Contributor

orenc17 commented May 22, 2018

/morph uvisor-test

@mbed-ci
Copy link

mbed-ci commented May 22, 2018

Build : ABORTED

Build number : 2105
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/6273/

@cmonr
Copy link
Contributor

cmonr commented May 22, 2018

Fyi, we're sorting through some things with the CI. As soon as it's resolved, we'll restart the builds.

@cmonr
Copy link
Contributor

cmonr commented May 22, 2018

/morph build

@mbed-ci
Copy link

mbed-ci commented May 22, 2018

Build : SUCCESS

Build number : 2109
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/6273/

Triggering tests

/morph test
/morph uvisor-test
/morph export-build
/morph mbed2-build

@mbed-ci
Copy link

mbed-ci commented May 23, 2018

@mbed-ci
Copy link

mbed-ci commented May 23, 2018

Copy link
Contributor

@0xc0170 0xc0170 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find this question but recall seeing it earlier, why are theres commits on this branch - looking at them on master, at least that import v0.30.0 does not match - the file added here is not there (46f9d46).

@Patater
@bulislaw
RTX5: uVisor: Add OsEventObserver  …
32d04a0
 @Patater
@bulislaw
RTX5: uVisor: Extend thread control block with context  …
86b91be
 @Patater
@bulislaw
RTX5: uVisor: Defer to uVisor for SVCall priority  …
2f7a841
 @Patater
@bulislaw
RTX5: uVisor: Switch threads very carefully  …
73a9579
 @Patater
@bulislaw
CMSIS/RTX: uVisor: Import v0.30.0
cb2b91c

@0xc0170
Copy link
Contributor

0xc0170 commented May 23, 2018

This PR should be now ready for integration, please last round of review and get this in for 5.9!

@cmonr
Copy link
Contributor

cmonr commented May 23, 2018

@JonatanAntoni @c1728p9 @kjbracey-arm @orenc17 @deepikabhavnani Final re-review?

@cmonr
Copy link
Contributor

cmonr commented May 23, 2018

@bulislaw Mind answering @0xc0170's question?

@bulislaw
Copy link
Member Author

I'm guessing rest of the files were already in the sources so rebase picked up only the file that had to be added.

@0xc0170 0xc0170 merged commit d8cb72a into ARMmbed:master May 24, 2018
TomoYamanaka added a commit to TomoYamanaka/mbed-os that referenced this pull request May 28, 2018
I fixed redeclaration of type name "bool_t" for target Renesas because this typedef has been defined in rtx_core_ca.h by ARMmbed#6273.
adbridge pushed a commit that referenced this pull request Jun 4, 2018
I fixed redeclaration of type name "bool_t" for target Renesas because this typedef has been defined in rtx_core_ca.h by #6273.
adbridge pushed a commit that referenced this pull request Jun 4, 2018
I fixed redeclaration of type name "bool_t" for target Renesas because this typedef has been defined in rtx_core_ca.h by #6273.
adbridge pushed a commit that referenced this pull request Jun 15, 2018
I fixed redeclaration of type name "bool_t" for target Renesas because this typedef has been defined in rtx_core_ca.h by #6273.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.