Update XMOS xcore.ai port to be compatible with v11.x #1096

Merged: 6 commits, Jun 27, 2024

@ACascarino ACascarino commented Jun 26, 2024

Update XMOS xcore.ai port

Description

The merge of the smp branch into main in version 11 broke an assumption that the xcore.ai port relied upon. To preserve backwards compatibility, tasks.c conditionally defines either pxCurrentTCB or pxCurrentTCBs depending solely on the value of configNUMBER_OF_CORES, which means that the pxCurrentTCBs symbol does not always exist. The port assumed that this symbol would always exist and that, for FreeRTOS instances with one core in use, only its 0th element would be populated. This change means that ports to SMP platforms instead need to be aware of the number of cores defined and change behaviour accordingly. This went unnoticed at XMOS at the time of release of v11, but we've now noticed the issue and I've (hopefully) created a fix.

This PR attempts to make the XMOS xcore.ai port agnostic to whether it is running on a single-core or SMP instance of FreeRTOS by simply introducing an additional layer of indirection to pxCurrentTCB(s) accesses. When the scheduler is started (which is the first time in the application that the port layer needs to interact with the TCB pointer(s)), we populate a global symbol with the address of the TCB pointer. It is this symbol, rather than pxCurrentTCB(s), which is then used in scheduler initialisation and on context switches.
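
The indirection described above can be sketched in plain C. This is a minimal illustration and not the port's actual code: the real port does this in xcore assembly, and the names pxPortCurrentTCBBase, vPortCaptureTCBBase and pxPortGetCurrentTCB, along with the cut-down TCB_t, are hypothetical stand-ins introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for the value normally set in FreeRTOSConfig.h. */
#ifndef configNUMBER_OF_CORES
    #define configNUMBER_OF_CORES    1
#endif

/* Cut-down stand-in for the kernel's task control block. */
typedef struct TCB
{
    void * pxTopOfStack;
} TCB_t;

/* Mirrors the conditional definition described for tasks.c:
 * exactly one of these two symbols exists in any given build. */
#if ( configNUMBER_OF_CORES == 1 )
    TCB_t * volatile pxCurrentTCB = NULL;
#else
    TCB_t * volatile pxCurrentTCBs[ configNUMBER_OF_CORES ] = { NULL };
#endif

/* Hypothetical port-level symbol holding the address of the first
 * current-TCB pointer, whichever kernel symbol provides it. */
static TCB_t * volatile * pxPortCurrentTCBBase = NULL;

/* Hypothetical hook, called once when the scheduler is started. */
void vPortCaptureTCBBase( void )
{
#if ( configNUMBER_OF_CORES == 1 )
    pxPortCurrentTCBBase = &pxCurrentTCB;
#else
    pxPortCurrentTCBBase = pxCurrentTCBs;
#endif
}

/* Core-agnostic accessor: later code indexes through the captured base
 * and never names pxCurrentTCB or pxCurrentTCBs directly. */
TCB_t * pxPortGetCurrentTCB( size_t xCoreID )
{
    assert( pxPortCurrentTCBBase != NULL );
    assert( xCoreID < ( size_t ) configNUMBER_OF_CORES );
    return pxPortCurrentTCBBase[ xCoreID ];
}
```

In this sketch the preprocessor branch appears only in vPortCaptureTCBBase; everything that runs afterwards (the analogue of the context-switch path) uses the same access pattern for single-core and SMP builds.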

This PR adds one instruction (plus compiler-defined register spill/restoration to protect the r5 clobber) to scheduler initialisation, and zero instructions to RTOS interrupt processing and context switching.

This PR also includes a fix to make exception behaviour entirely predictable; the previous implementation had assumed that the symbol _TrapHandler would always exist at 0x80080. This is not the case; for example, using the --first option in our toolchain to place arbitrary data at the top of memory shifts all other symbols, breaking this assumption and causing wildly undefined behaviour on exception. _DoException is a link-time visible symbol that presents the appropriate entry-point to the platform's exception handling routines, and its use here is preferable in all contexts.

Test Steps

As this is a community-supported port, there are no automated tests to highlight the issue, nor any to prove lack of regression. However, internal testing of FreeRTOS-based products has shown no regression in their functionality due to this change.

As a query - partner-maintained ports seem to have a series of automated tests that they use to show lack of regression, but these, from a cursory look, seem entirely single-core focussed. Are there any partner-maintained SMP ports for which the test suite would exercise the additional SMP-related functionality?

Checklist:

  • I have tested my changes. No regression in existing tests. (There are no existing tests, but implementation testing shows no regression in selected FreeRTOS-based products.)
  • I have modified and/or added unit-tests to cover the code changes in this Pull Request. (No unit tests exist.)

Related Issue

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@ACascarino ACascarino requested a review from a team as a code owner June 26, 2024 18:34
@AniruddhaKanhere
Member

Hello @ACascarino,

Thank you for taking the time to report and fix the issue. At first glance, it looks good to me.
I'll pass it along to the experts on our team for review.

Thanks,
Aniruddha

n9wxu
n9wxu previously approved these changes Jun 26, 2024
- ldc r9, 0x0080
- or r11, r11, r9
- bau r11 //_TrapHandler is at 0x00080080. TODO: Is it always? Why can't I access the symbol _TrapHandler?
+ bu _DoException
Member

@AniruddhaKanhere AniruddhaKanhere Jun 26, 2024

I am not sure what this change is doing - can you please clarify this for me?

Contributor Author

@ACascarino ACascarino Jun 27, 2024

Hi @AniruddhaKanhere - this is a symbol that is generated by our toolchain on compilation. There are a couple of these which are automatically built into every xcore.ai executable to bring up the processor, set execution modes on each core, and provide routines for graceful exception handling (such as _DoException, which disables all events and interrupts and leaves the processor in a state in which an external debugger may connect and query the processor state at time of exception). Therefore, while it would not be expected to find this symbol defined in any application code, it will always be present in the final executable. _TrapHandler, referenced in commentary in the previous implementation, is also one such symbol - it is usually the initial symbol used as the kernel exception pointer and sets up the necessary processor state to then branch to _DoException. Due to the alignment of kexcept, this state has already been achieved by the time kexcept gets here, and so there is no need to go via the _TrapHandler symbol.

The previous implementation assumed that _TrapHandler always existed at 0x80080 - it is not visible to the assembler, and it is therefore required that the address be hard-coded if trying to jump to it. This assumption was not true in certain cases. _DoException, which is visible to the assembler, is therefore an appropriate symbol to jump to and a better choice in all circumstances.

Contributor Author

I've also added commentary to clarify this inline! :)

aggarg
aggarg previously approved these changes Jun 27, 2024
@ACascarino ACascarino dismissed stale reviews from aggarg, AniruddhaKanhere, and n9wxu via a0cca63 June 27, 2024 11:48
@ACascarino
Contributor Author

Apologies - it turns out adding a new commit dismissed stale reviews. Sorry to ask you to review again!

codecov bot commented Jun 27, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.31%. Comparing base (0c79e74) to head (a0cca63).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1096   +/-   ##
=======================================
  Coverage   92.31%   92.31%           
=======================================
  Files           6        6           
  Lines        3226     3226           
  Branches      885      885           
=======================================
  Hits         2978     2978           
  Misses        132      132           
  Partials      116      116           
Flag Coverage Δ
unittests 92.31% <ø> (ø)

@n9wxu n9wxu merged commit 17dfd0f into FreeRTOS:main Jun 27, 2024
17 checks passed
@chinglee-iot
Member

@ACascarino

As a query - partner-maintained ports seem to have a series of automated tests that they use to show lack of regression, but these, from a cursory look, seem entirely single-core focussed. Are there any partner-maintained SMP ports for which the test suite would exercise the additional SMP-related functionality?

We have on-target test cases that can exercise the SMP-related functionality. We are still in the process of publishing all the test cases. You can refer to this link for more information.

@ACascarino
Contributor Author

Ah great thank you - I managed to completely miss these!
