-
Notifications
You must be signed in to change notification settings - Fork 3k
Optimise HAL_GetTick API #12915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimise HAL_GetTick API #12915
Conversation
a22fc21
to
2bde715
Compare
elapsed_time += (new_time - prev_time) & 0xFFFF; // Only use the lower 16 bits | ||
prev_time = new_time; | ||
return (elapsed_time / 1000); | ||
elapsed_ticks = elapsed_ticks / 1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should be elapsed_time / 1000
.
You would have got a warning about using an unset variable if you hadn't set elapsed_ticks
to 0 at the start - good example of why bogus initialisation can cause errors. Personally I would do
prev_time = new_time
uint32_t elapsed_ticks;
if (TIM_MST_BIT_WIDTH <= 16 || elapsed_time < 65536) {
elapsed_ticks = 0;
while (elapsed_time >= 1000) {
to maximally localise declaration and minimise unnecessary sets, but others' taste no doubt differs. The current form is fine once corrected, and probably compiles to the same code.
// Variables also reset in us_ticker_init() | ||
uint32_t total_ticks = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If retaining these variables as global, please give them some sort of prefix to minimise chance of namespace collisions.
But you can make them static. As I said, I don't think they need manual init, but even then, this current code makes no sense. Why is init_16bit_timer
setting them to zero, rather than HAL_InitTick
itself? Do it there, and they can be static.
Ah comments suggest that us_ticker_init
may have previously been initialising them (why?) But that's no longer true, it seems. Some start-up previous ordering problems have apparently been resolved. Clean up any such comments.
It's worth noting in the description that this is also a small optimisation for other C libraries - you had it going from 10us to 5us for ARM_STD, was it? |
2bde715
to
54d00df
Compare
@kjbracey-arm Updated the PR description to show that information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine. Only suggestion is just maybe a comment on the if
about it being a speed optimisation for small time intervals, avoiding a potentially-slow C library divide,
(Otherwise the casual reader might be trying to figure out what the functional difference is between the if
and the else
)
54d00df
to
45bebd1
Compare
…ptimized HAL_GetTick API to improve performance.
45bebd1
to
4ab794b
Compare
CI started |
Test run: SUCCESSSummary: 6 of 6 test jobs passed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great improvement by calling us_ticker_read
directly!
I think in this case the direct call to It probably amounts to a few microseconds of the drop (as seen in the 10us->5us change). Still worthwhile. And we bothered to do it in #10609, as it really matters on platforms where it would have needed a division, and we end up avoiding any runtime division (eg wait for an 8MHz timer to increment by 8000, rather than waiting for "timer divided by 8" to increment by 1000). Here the bulk of the gain is avoiding division, particularly the 64-bit division. But do check out #12897, which tries to make the general code smaller, and will make it faster in some cases as a secondary benefit. (Smaller when compiled at least - all the ifdefs make the source bigger :( ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ST CI OK
(except timing drift tests, see
#12920 (comment))
Will be merged as soon as we have 6.0.0 beta1 released |
Summary of changes
HAL_GetTick API has taken about 51us of one-time execution on bare metal profile with small c library (Microlib) on NUCLEO_F303RE target with ARMC6 toolchain and causes failures in lp ticker speed test case from common tickers, so optimized that API to improve overall performance to achieve optimal execution time and this also improves in other profiles. Refer PR#12880 for the more details on this issue.
Impact of changes
with these changes, HAL_GetTick API performance has improved in all profiles, and the below table shows improvement results
(HAL_GetTick without optimization)
(HAL_GetTick without optimization)
(HAL_GetTick with optimization)
Migration actions required
None.
Documentation
None.
Pull request type
Test results
Reviewers
@evedon @kjbracey-arm