Skip to content

Fix greentea test failure on Renesas Cortex-A9 targets #7139

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 14, 2018

Conversation

toyowata
Copy link
Contributor

@toyowata toyowata commented Jun 6, 2018

Description

This fix is to prevent optimization code removal (only ARM build) for Renesas Cortex-A9 targets (GR-PEACH and GR-LYCHEE).
The problem was accidentally introduced potentially existed and was surfaced with this PR #6273

d8cb72a#diff-bf05db1901365d4d793784baf718227eR953

The ARM compiler removed __set_CP inlined functions (__set_DCISW, __set_DCCSW and __set_DCCISW) by optimization and target startup in the SystemInit() does not work properly. i.e. just hangs at startup
This fix prevents unnecessary compiler optimization and startup code initialize cache memory as expected.

You can see codegen difference by armcc here.
codegen_diff.txt

It should fix #7065 and ARMmbed/mbed-os-example-blinky#124

Pull request type

[X] Fix
[ ] Refactor
[ ] New target
[ ] Feature
[ ] Breaking change

Test result

mbed-test_gr_lychee_arm.txt
mbed-test_rz_a1h_arm.txt

@toyowata
Copy link
Contributor Author

toyowata commented Jun 6, 2018

cc @MarceloSalazar @ashok-rao

@MarceloSalazar
Copy link

cc @adbridge

@0xc0170
Copy link
Contributor

0xc0170 commented Jun 6, 2018

@toyowata I dont fully understand what is this fixing - the commit msg does not specify why dummy needs volatile and how it fails currently. Please add more details for the bugfix

This fix is to prevent optimization code removal (only ARM build) for Renesas Cortex-A9 targets (GR-PEACH and GR-LYCHEE).
The was accidentally introduced by this PR #6273

This would be good to have it in there

@0xc0170 0xc0170 requested a review from a team June 6, 2018 10:22
@ithinuel
Copy link
Member

ithinuel commented Jun 6, 2018

__set_CP is not volatile :

#define __set_CP(cp, op1, Rt, CRn, CRm, op2) do { register uint32_t tmp __ASM("cp" # cp ":" # op1 ":c" # CRn ":c" # CRm ":" # op2); tmp = (Rt); } while(0)

So the following block gets optimized out :

for(int32_t set = num_sets-1; set >= 0; set--)
{
Dummy = (level << 1U) | (((uint32_t)set) << log2_linesize) | (((uint32_t)way) << shift_way);
switch (maint)
{
case 0U: __set_DCISW(Dummy); break;
case 1U: __set_DCCSW(Dummy); break;
default: __set_DCCISW(Dummy); break;
}
}

All (most ?) functions from cmsis_cp15.h expect the uses of __set_CP to expand as asm volatile.

// __ASM volatile("MCR p15, 2, %0, c0, c0, 0" : : "r"(value) : "memory");
__set_CP(15, 2, value, 0, 0, 0);

writing the following also fixes the issue and probably fixes all other uses of CP.

#define __set_CP(cp, op1, Rt, CRn, CRm, op2) do { register volatile uint32_t tmp __ASM("cp" # cp ":" # op1 ":c" # CRn ":c" # CRm ":" # op2); tmp = (Rt); } while(0)

(note the volatile added after register).

Copy link
Member

@ithinuel ithinuel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

functions in cmsis_cp15.h expect __get_CP and __set_CP to expand as __ASM volatile.
it should be added to these functions instead of the Dummy variable.

@donatieng
Copy link
Contributor

@JonatanAntoni this one needs your attention :)

@toyowata
Copy link
Contributor Author

toyowata commented Jun 7, 2018

@ithinuel Thank you very much for your investigation and suggestion. I have updated my commits.

@0xc0170 I added more information in the description of this PR.

Add volatile modifier to prevent ARM compiler to remove inline function calls for __set_CP and __get_CP.
@toyowata
Copy link
Contributor Author

toyowata commented Jun 7, 2018

Squashed the commits.

@JonatanAntoni
Copy link
Member

We have the __ASM volatile for all other compilers as well, so it looks correct to me at the first glance. May I ask you to upstream this change to CMSIS, please? Note that we have the volatile modifier after the __ASM, typically. Just to clearly state its the assembly instruction and not the variable in this case.

@ithinuel
Copy link
Member

ithinuel commented Jun 7, 2018

As discussed with @JonatanAntoni, the syntax in c37875c is right one for armcc.
The syntax used by these macro is called "named register variable" : http://www.keil.com/support/man/docs/armcc/armcc_chr1359125006491.htm

@toyowata
Copy link
Contributor Author

toyowata commented Jun 8, 2018

@JonatanAntoni @ithinuel
Thank you very much for the information. I will also make PR to CMSIS_5 repo for this change.

@ithinuel Can you review this and accept the commit, because this is a blocker for Renesas Cortex-A platforms?

@ithinuel
Copy link
Member

ithinuel commented Jun 8, 2018

@toyowata ARM-software/CMSIS_5#376

@toyowata
Copy link
Contributor Author

toyowata commented Jun 8, 2018

@0xc0170 bump

@kjbracey
Copy link
Contributor

kjbracey commented Jun 8, 2018

Note that we have the volatile modifier after the __ASM, typically. Just to clearly state its the assembly instruction and not the variable in this case.

It is actually the variable being marked volatile, and that's giving you the semantics. It's not eliminating the read because the variable you're reading is volatile.

Copy link
Contributor

@kjbracey kjbracey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it's logical that these register variables mapped to coprocessor controls should be declared volatile, the ARMCC documentation examples show similar use without volatile, implying it shouldn't be necessary - as if they were implicitly volatile. Is the documentation wrong/misleading?

@JonatanAntoni
Copy link
Member

@kjbracey-arm that's exactly what I have in mind as well. The compiler already knows that this variable is intended to access a register which is volatile by nature. I try to get a confirmation from the compiler team.

@kjbracey
Copy link
Contributor

kjbracey commented Jun 8, 2018

Not all hard registers necessarily have side-effects. The same mechanism can be used to access many different types of register.

So I can perfectly understand the logic of leaving volatile or not up to the C coder. I actually kind of prefer that, and feels like a better mapping to the language.

But the examples suggest that's not what they were intending:

Example 5-14 Named register variable for coprocessor register to enable MMU

register unsigned int cp15_control __asm("cp15:0:c1:c0:0");
cp15_control |= 0x1;

Anyway, volatile can't hurt, so this is fine.

@JonatanAntoni
Copy link
Member

JonatanAntoni commented Jun 8, 2018

Not all hard registers necessarily have side-effects.

Sure, but typically they have. That's what they are normally used for. And that's the big difference to ordinary memory. But of course adding volatile does not hurt. I propose we add volatile to both, get and set.

@ithinuel
Copy link
Member

ithinuel commented Jun 8, 2018

@JonatanAntoni @kjbracey-arm I was reading gcc's doc on named register variable and it does not seem to assume they are volatile.
You can specify which register to use for a certain variable, but if you only write to it, nothing prevents the compiler to optimize it away.

In the PR made against CMSIS_5, I only added the volatile to the write instruction because we want to tell the compiler not to optimise write to this.
OTOH I left the read non volatile as the compiler is free to remove this read if nothing actually use them.
It's also waiting for confirmation from the compiler team that reading to CP15 does not have side effects.

@kjbracey
Copy link
Contributor

kjbracey commented Jun 8, 2018

I think having default volatile in the CP macro for both set/get is safest. Avoids potential mistakes. And I can't see any CP15 use being in performance critical places where volatile would hurt.

I see the compiler writer's point of view about not volatile by default, but a CP15 helper macro should include the volatiles. I'm not aware of any CP15 registers that are read-sensitive, but maybe sequencing of the read relative to other accesses might be significant.

@JonatanAntoni
Copy link
Member

Can you please let me know at which level of optimization you observe the issues? We have named registers without volatile in many other places. I expect issues all around. Surprising that nobody raised issues so far.

@ithinuel
Copy link
Member

ithinuel commented Jun 8, 2018

-O0 to -O2, -Ospace and -Otime are not affected.
The issue is only present in -O3.

Checked with :

gdb-multiarch -batch -ex 'file ./BUILD/GR_LYCHEE/ARM/mbed-os-example-blinky.elf' -ex 'disassemble SystemInit'

looking for the missing inner loop.

@JonatanAntoni
Copy link
Member

Thx. I confirm it's reproducible with CMSIS CoreValidation tests. Unfortunately we currently run all the tests with -O0, only.

@cmonr
Copy link
Contributor

cmonr commented Jun 11, 2018

@ithinuel @kjbracey-arm What do we run in which we compile to -O3?

@ithinuel
Copy link
Member

ithinuel commented Jun 11, 2018

@cmonr we compile everything with -O3 on ARMCC when using the develop & release profiles.

IAR ARMCC ARMC6
release -Ohz -Ospace -O3 -Os
develop -Oh -Otime -O3 -Oz
debug -On -Otime -O0 -O0

@0xc0170
Copy link
Contributor

0xc0170 commented Jun 12, 2018

/morph build

@mbed-ci
Copy link

mbed-ci commented Jun 12, 2018

Build : SUCCESS

Build number : 2333
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/7139/

Triggering tests

/morph test
/morph uvisor-test
/morph export-build
/morph mbed2-build

@mbed-ci
Copy link

mbed-ci commented Jun 12, 2018

@mbed-ci
Copy link

mbed-ci commented Jun 13, 2018

@cmonr cmonr merged commit 3280ebb into ARMmbed:master Jun 14, 2018
@ithinuel
Copy link
Member

ARM-software/CMSIS_5#376 has been merged upstream too.

@toyowata toyowata deleted the fix_rtx53_ca9 branch December 17, 2018 05:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fail only ARMCC of Cortex-A on Mbed OS that incorporated RTX5.3
9 participants