-
Notifications
You must be signed in to change notification settings - Fork 3k
Add absolute millisecond tick count to RTOS classes #5419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is a straw man to help kick #3648 along - I really want that to have This would be a minimal addition to the C++ API to make But do we really need every CMSIS-RTOS API in the C++ API? Is the C++ API supposed to be a complete abstraction layer providing all facilities of the CMSIS-RTOS abstraction layer? Is there anything wrong with expecting application to use |
(As an aside, I'm was surprised to see the CMSIS-RTOS 2 tick count being 64-bit. The API documentation all says it's 32-bit). |
User space facing C++ API always, providing us also keeping it backward compatible (cmsis API is evolving, changing). Is the proper placement for these 2 new functions (get tick count and freq) as part of Thread ? However as RTOSTimer is deprecated, I do not see any other class that would fit better apart from having a new one. Just those 2 attributes are not related to a Thread. New |
Sounds like CMSIS-RTOS is failing in its purpose then :(
I don't disagree - my first thought was A lot of the static Thread functions are debatable. The existing I guess if we think we'll want more |
I've found it weird for a while that we already have some global timing functions (wait_ms, sleep) but no function for access to some sense of global time. RTOS aside, why don't we just provide a high-level tick function (hooked to rtos when possible)? float mbed_tick();
mbed_tick_t mbed_tick_us();
mbed_tick_t mbed_tick_ms(); It's a common API, I find it strange it's missing in mbed OS. |
All of the user facing APIs use seconds/ms/us as a timebase and this is unlikely to change. We should seriously consider fixing the rtos tickrate to 1ms align with this. This won't add substantially to power consumption since with tickless RTX a tick will only occur when threads are already running. When all the threads are suspend the OS will stop ticking entirely. If this is done then the addition of the wait_until APIs will be greatly simplified and easier for our users. As for the global timing I agree that we definitely need these functions. Some kind of system wide timer which is available for both mbed 2 (backed by ticker or lpticker) and mbed 5 (backed by os) would be nice. Some thoughts about possible C/C++ API for this:
|
I agree with Russ that just defining "RTOS ticks == milliseconds" would avoid a lot of grief. Tick frequency really doesn't matter much, so doing that gains simplicity at basically no cost. Nothing wrong with choosing a value and sticking with it. Yes, would be useful to have the same thing available for mbed OS 2. Presumably there's an overhead issue there, so you wouldn't initialise it until first read? In effect it would be an on-demand global ticker, a bit like the on-demand global event queue? For mbed OS 5, it needs to be defined that this API is representing the RTOS ticks, thus is phase aligned with all the milliseconds in all the RTOS timeouts. @geky - I don't see how I think the "mbed_time_api.h" or whatever, with RTOS + non-RTOS versions can follow - figuring out exactly what the implementation is for non-RTOS is more complex. If we assume it calls the C++ RTOS API for the RTOS case, then we can pin that RTOS API down first. |
Great to see some alignment here. Having milliseconds , time API available for both mbed 2 and mbed OS.
👍 Users just used HAL us ticker getter functions to get time, this would finally replace this and have something "steady" :-) How it should look like? Just a note: long time ago we considered that once we can add chrono-like API. We are not yet c++11, I'll follow the idea here below, we can adust that for consistency to our codebase. What I would consider is something similar to above
Here some thoughts (pseudocode warning): This would be C++ API used, new class introduction:
what types do we have here? high resolution clock - timestamp_t that is 64bit wide. steady_clock - using ms resolution, 64bit also possibly? This can be extended for instance:
I might like if high resolution clock is on demand, thus needs created on demand and destroyed if not used (ctor/dtor would take care of it) - this is not what I have seen in chrono lib, it is always accessible. stead_clock or realtime_clock are always running, thus do not need this? Usage:
Looking at the usage above, we can see that this is actually Timer class, thus high_resolution clock could be eliminated. What we care are the other two (having finally RTC C++ API and System time C++ API under one), don't we ? This would be for mbed OS. The only diff above would be for steady_clock(), as RTOS is not present there, they would as noted above by @c1728p9 map to us ticker for instance. I used
cc @pan- - might have some thoughts |
0f6e432
to
2d6a33d
Compare
Quick rework, into having a By re-orienting into being a Kernel API, I'm kind of trying to side-step the bigger discussion about what time APIs and timebases would be needed independent of RTOS, and lock down access to the specific things that the RTOS already provides - in particular, reading the tick count that matches what we're passing to all the semaphore waits etc. A wider chrono-style API, or any other sort of RTOS-independent time API, could layer one of its clocks on top of this in the RTOS case. As it stands, this would unblock |
2d6a33d
to
aa56026
Compare
rtos/Kernel.h
Outdated
* | ||
* The Kernel object cannot be instantiated - all functions are static. | ||
*/ | ||
class Kernel { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a class instead of a namespace ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No good answer to that. I don't have a preference either way.
My aim was merely that these methods appear much the same as all the static Thread
methods to casual users. A namespace might be a nicer way of doing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice it doesn't change much; namespace or class with only static members are not very different however semantically a namespace indicates that kernel
is not an object and therefore cannot be instantiated as one; using a namespace clarifies the intent.
It also avoid to reference Kernel as a data structure in the documentation generated by doxygen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about keeping things simple and adding these two extra timing related functions just to the Thread class?
uint32_t get_ms_count();
osStatus delay_until(uint64_t millisec);
The function delay_until is another form of delay so it make sense to reside alongside it and the function get_ms_count is coupled with delay_until, so it makes sense to go there as well.
rtos/Kernel.cpp
Outdated
namespace rtos { | ||
|
||
uint64_t Kernel::get_ms_count() { | ||
return osKernelGetTickCount(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was changed back to uint32_t in more recent versions of RTX:
ARM-software/CMSIS_5@9470b0b#diff-818bf32b54ffc78cfbdfe02740cf27f2R343
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gah. Okay, that's going to cause grief. Are we moving to 2.1.1 in the immediate future?
I want this API to be stable - I guess we'd make it 32-bit now, and make sure it's written to cope with both 32-bit and 64-bit underlying?
Also, only 2.1.1 says it's okay to call from ISR, 2.1.0 says it isn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Brother. Looking deeper, RTX 5.1.1 does actually permit call from interrupt, so could let that pass.
But looking at RTX 5.2.0, I think the implementation (and possibly the CMSIS 2.1.1 spec) is flakey. Wrapping isn't handled sensibly. If the current time is 0x8000 and you say delay_until(0x7ffe)
, it assumes you want to wait until 2^32-2 ticks in the future, not that you've overrun and are 2 ticks late. That's not a sensible assumption - it would be more conventional to assume half the number space is past and half is future.
2.1.0 is solid, because it returns immediately in that case - it does 64-bit maths, computes "target-now" as 2^64-2, and decides that's not a sensible wait because it is >=2^32-1, and returns osErrorParameter immediately. (Arguably it maybe could have said OK, but it does return immediately, which is the main thing).
I would be very wary about actually using this API with 2.1.1 behaviour. Get CPU starved a bit too long and you're sleeping for 49 days.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gah. Okay, that's going to cause grief. Are we moving to 2.1.1 in the immediate future?
I would assume we would. There are couple of changes incoming (cortex-a, cortex m33) that might require some updates (some bugs were found and are still being in discussion).
@JonatanAntoni are our findings correct that are described above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we decided to go back to uint32_t as tick type with CMSIS 5.1 release. osDelayUntil will wait exactly until the given tick time is reached (again). So if you call osDelayUntil with a value slightly lower than osKernelGetTickCount it'll wait 49 days, sure. I think we don't have a chance to predict if you really want to wait 49 days or not. Please let us know if you have reasonable concerns about the current implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just raised an issue over here, going in to detail on my implementation concern: ARM-software/CMSIS_5#277
If I fix this API at 32-bit, and we need to support 2.1.0, I'm going to have to muck around anyway with 32-bit/64-bit adaptation, so it's not going to be straight mapped to osDelayUntil, so maybe I could also muck around in 2.1.1 to adjust the past handling to "31-bit" to suit my taste...
As it happens, I'm not intending to use osDelayUntil at this point anyway, just have an analogue to it in ConditionVariable::wait_until() - its timeout will be based on Semaphore::Wait, so it is free to have its own absolute timestamp interpretation, but behaviour really should be aligned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the 32-bit/64-bit difference - how about making the timestamp be a typedef, so it matches and adapts to the underlying system? Would be less problematic than trying to handle 32<->64 bit conversions.
Normal users shouldn't care whether it's just 32 or 64 bit, just that the "get ticks" and "until" APIs use the same numbers.
Although I'm not sure how I'd detect it, without grobbling into RTX headers - it's a compile-time difference and there doesn't seem to be a compile-time version macro in "cmsis_os2.h". And we're not using C++11, so no decltype(osKernelGetTickCount())
.
rtos/Kernel.cpp
Outdated
return osDelayUntil(millisec); | ||
} | ||
|
||
int32_t Kernel::lock() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any use cases for lock and unlock in mind? I would be inclined to leave this private unless it is needed for a certain reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No immediate use, but it seemed like if adding a Kernel class, it was the most obvious API from osKernel that you people might want. Otherwise it will just be the tick count, if I remove the delays, as I'm now minded to.
rtos/Thread.h
Outdated
@param millisec absolute time in millisec | ||
@return status code that indicates the execution status of the function. | ||
@note not callable from interrupt | ||
@note if the time in the past, or >= 2^32-1 ms in the future, this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need to be updated if the tick API changes to 32 bit in RTX (as mentioned above).
rtos/Thread.h
Outdated
returns with an immediate error | ||
@note this call is equivalent to Kernel::delay_until() | ||
*/ | ||
static osStatus wait_until(uint64_t millisec); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you thoughts on dropping the Kernel class and adding get_ms_count() to Thread
since the other timing functions are already there? The Kernel class could always be added later if kernel specific functionality is needed. This would also keep the API expansion to a minimum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was what my first draft looked like, but changed to this after @0xc0170's comments, which I tended to agree with.
I'm struggling a bit to find logic in CMSIS-RTOS's placement of stuff in osThread versus osKernel versus osDelay - why not "osThreadLock" and "osKernelDelay" and "osYield"?
Although, thinking harder I guess delay is specifically current thread only, so actually delay probably /shouldn't/ be in Kernel - delaying one thread doesn't affect anything else in the system, unlike the lock. I'm now inclined to take those out of Kernel.
It make sense to have |
Well, this got complex fast. Added code to cope with a 32-bit counter from the RTOS and convert it to 64 bits, so we retain a lovely non-wrapping 64-bit count in case we move to API v 2.1.1, and avoid its "wait for 49 days" problem. Also removed the delay functions from Kernel, as they don't act on the kernel as a whole. |
rtos/Thread.cpp
Outdated
// by looking at the return type of osKernelGetTickCount. We assume | ||
// our header at least matches the implementation, so we don't try looking | ||
// at the run-time version report. (There's no compile-time version report) | ||
if (sizeof(osKernelGetTickCount() == sizeof(uint64_t))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this taking the size of a comparison? If so wouldn't that always be the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's a compile-time test. There's no macro to test, so this is the only way to distinguish the 2.1.0 vs 2.1.1 APIs. I can imagine it may produce "constant controlling expression" warnings - may need to add suppression pragmas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be
sizeof(osKernelGetTickCount()) == sizeof(uint64_t)
rather than
sizeof(osKernelGetTickCount() == sizeof(uint64_t))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops! Good spot. Totally failed to grasp what you were saying earlier...
rtos/Thread.cpp
Outdated
do { | ||
// RTX internals seem to avoid 0xFFFFFFFF delays, so do the same, even | ||
// though nothing in documentation says osDelay(0xFFFFFFFFF) shouldn't work. | ||
const uint32_t max_delay = 0xFFFFFFFE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not matter here, but for all the other wait_until variants (based on Semaphore, Mutex, event, and stuff) the 0xFFFFFFFF maps to osWaitForever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't part of the API, it's just caution on the loop - "how big a chunk will we break this into for the underlying osDelay call?" The number is effectively arbitrary. I'll scrub the loop anyway.
@@ -334,6 +334,43 @@ osStatus Thread::wait(uint32_t millisec) { | |||
return osDelay(millisec); | |||
} | |||
|
|||
osStatus Thread::wait_until(uint64_t millisec) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possible alternative, is rather than converting up to 64 bits, you could instead convert down to 32 bits. You could still handle the case there the time is slightly in the past inside wait_until. The code to perform the delay would also be simplified substantially.
This would make the behavior for things like Semaphores, Mutexes and Events implementing wait_until more consistent, since it would just be a single call, rather than a loop which will break any ordering RTX tries to keep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try that approach, but I don't have a good feeling about it. There's complication in both cases. This one has the notional advantage that it's either
- use 64-bit API directly (osKernelGetTickCount and osDelayUntil), or
- build on top of 32-bit APIs
If it's the other way around, we know the 32-bit API is broken in 2.1.1, so there's no possibility of just calling osDelayUntil directly - all the calls will be doing emulation on top of osDelay.
The first approach has at least 1 "native" straight implementation where we're just calling the RTOS, which makes me happier. And users get a non-wrapping timebase out of it.
And I'm hoping CMSIS will see the error of their ways and put the API back to 64-bit, so we can scrub all the 32-bit stuff as coping with a 1-off total API breakage mistake in 2.1.1.
I suppose for the other calls where this is (currently) no underlying "until" in CMSIS-RTOS, so you're emulating anyway, having it be 32-bit means not having a loop. But you don't necessarily need to bother with loops for 64-bit - the options are:
- 64-bit absolute timebase - Semaphore::wait_until loops multiple osSemaphoreWaits to allow unlimited wait
- 64-bit absolute timebase - Semaphore::wait_until makes only 1 call, with wait limited to 49 days (checked on entry with error, or just would time out after 49 days if not signalled?)
- 32-bit absolute timebase - Semaphore::wait_until makes 1 call, wait limited to 24 days
If you're happy with option 2, the 64-bit approach can be just as simple as the 32-bit one. And that is what RTX 5.1.0 actually does for its osDelayUntil.
And if there ever was a native osSemaphoreWaitUntil in 2.2.0 with a 64-bit timebase for us to call, it's likely it would be limited in the same way osDelayUntil, so maybe we shouldn't be making undue effort to do better with loops.
Okay, I'm convinced. If we're doing 64-bit on top of 32-bit, there will be a 49-day limitation. And that applies whether it's the mbed C++ class doing that or the CMSIS-RTOS osXXXUntil
doing it on top of a 32-bit underlying OS. Longer than 49 days requires "64-bit all the way down", which will not be guaranteed by the mbed API.
The ordering affect is a bit subtle - I guess you're thinking something like
- On day 0, T1 schedules wake up for day 50 - means first delay to day 49
- On day 2, T2 schedules wake up for day 50
- On day 49, T1 wakes up after 49-day delay and sets final delay to day 50
- On day 50, when both delays expire simultaneously T2 might wake first when should have been T1
Does RTX or CMSIS-RTOS even guarantee wakeup ordering like that?
rtos/Thread.h
Outdated
@note if millisec indicates the current tick count it will return immediately, | ||
either with "osOK" or an error. | ||
@note depending on implementation, the maximum wait time may be limited | ||
to 0xfffffffe milliseconds, which means specifying a time further |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
osWaitForever would match existing defaults and be sensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't an intended API, it's a warning about implementation limitations due to being underlying 32 bits, which is a numerical problem not directly related to osWaitForever.
If osWaitForever didn't exist, you'd still have a problem with delays >0xffffffff. osWaitForever's existence just nudges that down to 0xfffffffe.
As "osWaitForever" here shouldn't mean forever, using it to express the limit would just be confusing. It would be like saying "the maximum wait time may be limited to infinity-1". (eh?)
Maybe this should be made a bit woolier, to get away from confusing it with osWaitForever:
the underlying implementation may have a limit to the maximum wait time,
due to internal 32-bit computations, but this is guaranteed to work for any
delay <= 0x7fffffff milliseconds (~24 days)
rtos/Kernel.h
Outdated
@return negative osStatus error code on error | ||
@note not callable from interrupt | ||
*/ | ||
int32_t lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a bad idea to expose as an API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, maybe. Was just trying to justify the existence of the class/namespace by giving it more than one method. I'll drop it.
rtos/Kernel.h
Outdated
@return negative osStatus error code on error | ||
@note not callable from interrupt | ||
*/ | ||
int32_t unlock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a bad idea to expose as an API.
9544046
to
41c73c3
Compare
41c73c3
to
e909ca1
Compare
Rebased now that ConditionVariable exists, and go through adding "until" calls to it and other classes. All "wait_untils" have the simplifying assumption that permits them to return early if the delay is excessive (specified as >24 days, actually >49 days in current implementation). |
|
||
bool Mutex::trylock_for(uint32_t millisec) { | ||
osStatus status = lock(millisec); | ||
if (status == osOK) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose _count should be incremented in case of success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lock(millisec)
already does that - this is just munging the return value (and asserting there's no programming error).
This PR streamlines so everything ultimately goes via Mutex::lock
, so the __count++
is in one place.
(Do feel a bit sad that plain lock()
isn't void
return with a big assert on anything other than osOK though.)
Exporter Build : SUCCESSBuild number : 578 |
Test : FAILUREBuild number : 744 |
/morph build |
Build : SUCCESSBuild number : 959 Triggering tests/morph test |
Exporter Build : SUCCESSBuild number : 644 |
Test : FAILUREBuild number : 786 |
Give C++ access to the RTOS's absolute timebase, reducing the need to run private Timers and similar. Allows wait_until functionality, and makes it easier to avoid time drift. Place it in a new header and namespace in case we want more kernel functions in future. Try to cover over the breaking API change potentially upcoming in CMSIS-RTOS 2.1.1, when it reduces the tick count from 64-bit to 32-bit. (See ARM-software/CMSIS_5#277) Explicitly state that ticks are milliseconds in mbed OS, despite CMSIS RTOS 2 permitting different tick rates. See also ARMmbed#3648 (wait_until for condition variables) and ARMmbed#5378 (EventQueue should use RTOS tick count).
API is somewhat loose to cope with potential shift in the underlying RTOS APIs.
No ConditionVariable methods are useable from ISR, due to the built-in mutex and the requirement that the caller of notify holds the mutex.
Given the 64-bit timebase, add wait_until to ConditionVariable. Move the timeout example to wait_until(), and give wait_for() an alternative example, as it's no longer the best option for a timeout. Tidy up - remove the redundant RESUME_SIGNAL definition.
Given the 64-bit timebase, add trylock_until to Mutex. Naming is based on a combination of Mutex::trylock, Thread::wait_until, and C++11 timed_mutex::try_lock_until. pthreads and C11 use "timedlock", but that's not a good fit against our existing trylock() and lock(timeout) - they have only absolute-time waits, not relative. To increase the similarity to C++11, add trylock_for - same parameters as lock, but with the bool return value of trylock and trylock_until. Add an assertion when convering status codes to booleans to check that there are no non-timeout errors.
Given the 64-bit timebase, add wait_until to Semaphore. Naming is based on Thread::wait_until. pthreads uses "timedwait", but that's not a good fit against our existing wait() - pthreads only has an absolute-time wait, not relative.
83084b8
to
2b77caa
Compare
Rebased as I hear it may solve LP ticker test issues. |
/morph build |
@sg- Are you alright with the updated commits? |
Build : SUCCESSBuild number : 992 Triggering tests/morph test |
Exporter Build : SUCCESSBuild number : 686 |
Test : SUCCESSBuild number : 816 |
Waiting on @sg-. If no response, will merge EoD Monday (CST). |
Give C++ access to the RTOS's absolute timebase, reducing the need to run private Timers and similar. Allows wait_until functionality, and makes it easier to avoid time drift.
Explicitly state that ticks are milliseconds in mbed OS, despite CMSIS-RTOS 2 permitting different tick rates.
Try to cover over the breaking API change potentially upcoming in CMSIS-RTOS 2.1.1,
when it reduces the tick count from 64-bit to 32-bit. (See ARM-software/CMSIS_5#277)
Add "wait_until" methods to Thread, ConditionVariable, Semaphore and Mutex.
See also #3648 (wait_until for condition variables) and #5378 (EventQueue should use RTOS tick count).