Skip to content

NANO130: Fix optimization error with NVIC_SetVector/NVIC_GetVector on ARMC6 #10534

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ccli8
Copy link
Contributor

@ccli8 ccli8 commented May 6, 2019

Description

On ARMC6 with optimization level -Os, the two functions NVIC_SetVector/NVIC_GetVector are optimized to trapping instruction due to access to address 0. This PR fixes it by defining NVIC_FLASH_VECTOR_ADDRESS as a symbol instead of a direct 0 to avoid such unwanted optimization.

Pull request type

[X] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

@ciarmcom ciarmcom requested review from Ronny-Liu and a team May 6, 2019 07:00
@ciarmcom
Copy link
Member

ciarmcom commented May 6, 2019

@ccli8, thank you for your changes.
@Ronny-Liu @ARMmbed/mbed-os-maintainers please review.

Copy link
Contributor

@kjbracey kjbracey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've only modified "Get" here, which I don't understand.

A volatile is not needed here, not is it sufficient. There's no reason at all Gets can't be optimised out - those aren't read sensitive, and Sets being optimised out isn't necessarily a problem either.

But there is a need to have a DSB after the SetVector - particularly on M7 - and I've raised an issue here pointing out that it needs one:

ARM-software/CMSIS_5#576

A DSB should solve any issues you're having - my immediate suggestion is to put a __DSB(); after the SetVector call if you're seeing a problem, and I'll continue to follow up in CMSIS.

Although this PR suggests to me that maybe the limited "DSB for later core only" form I did on CMSIS still leaves a remaining issue - on earlier cores you may not need a DSB, but you will need something for the compiler to keep things ordered. They should probably get a compiler barrier like ARM-software/CMSIS_5#493.

Could you also explain what exact problem you were seeing? What exactly do you mean by "optimised out"?

And I've just realised this is a dummy implementation, in which case what's the problem? All that can get optimised out is the error? The error isn't being generated when it should be?

(I'm on a mission to strip out unnecessary volatiles from the code base, and interrupt vector tables are an example of where they're not the correct mechanism. The table is not memory-mapped I/O; Sets and Gets do not need to be done exactly as writte;, something like memcpying the ROM table to the RAM table with any size of access is perfectly legal. You just need barriers at the correct points.)

@ccli8
Copy link
Contributor Author

ccli8 commented May 7, 2019

@kjbracey-arm The implementation of NVIC_SetVector for NANO130 degenerates to just guard code, so barrier is not the point here. Actually, this PR is more likely to get around bug of ARMC6 compiler.

void NVIC_SetVector(IRQn_Type IRQn, uint32_t vector)
{
    // NOTE: On NANO130, relocating vector table is not supported due to just 16KB small SRAM.
    //       Add guard code to prevent from unsupported relocating.
    uint32_t vector_static = NVIC_GetVector(IRQn);
    if (vector_static != vector) {
        error("No support for relocating vector table");
    }
}

ARMC6 translated code of NVIC_SetVector before adding volatile into NVIC_GetVector:
The code size and code content of NVIC_SetVector are both abnormal. And hard fault is met on running into NVIC_SetVector immediately.

Symbol Name                              Value     Ov Type        Size
I2C_ClearTimeoutFlag                     0x00002bb5   Thumb Code    10

NVIC_SetVector                           0x00002c67   Thumb Code     2

OS_Tick_AcknowledgeIRQ                   0x00002c69   Thumb Code    12
0x00002C66 DEFE      DCW      0x0000

ARMC6 translated code of NVIC_SetVector after adding volatile into NVIC_GetVector:
The translated code of NVIC_SetVector is much more normal than above and its behavior is correct. Due to optimization level -Os, source code and disassembly view are not so matched.

Symbol Name                              Value     Ov Type        Size
I2C_ClearTimeoutFlag                     0x00002bb5   Thumb Code    10  

NVIC_SetVector                           0x00002c69   Thumb Code    60

OS_Tick_AcknowledgeIRQ                   0x00002ca5   Thumb Code    12
    36: uint32_t NVIC_GetVector(IRQn_Type IRQn) 
    37: { 
    38:     volatile uint32_t *vectors = (volatile uint32_t *) NVIC_FLASH_VECTOR_ADDRESS; 
    39:  
    40:     // Return the vector 
0x00002C68 B580      PUSH     {r7,lr}
    41:     return vectors[IRQn + 16]; 
0x00002C6A 0080      LSLS     r0,r0,#2
0x00002C6C 6C00      LDR      r0,[r0,#0x40]
    31:     if (vector_static != vector) { 
    32:         error("No support for relocating vector table"); 
    33:     } 
0x00002C6E 4288      CMP      r0,r1
0x00002C70 D100      BNE      0x00002C74
    34: } 
0x00002C72 BD80      POP      {r7,pc}
    32:         error("No support for relocating vector table"); 
0x00002C74 A001      ADR      r0,{pc}+0x08  ; @0x00002C7C
0x00002C76 F001FF03  BL.W     error (0x00004A80)
0x00002C7A 46C0      MOV      r8,r8

@kjbracey
Copy link
Contributor

kjbracey commented May 7, 2019

Okay, the compiler is seeing that NVIC_FLASH_VECTOR_ADDRESS is 0, ie NULL, and is deliberately optimising NVIC_GetVector to an undefined instruction - trapping because you're accessing an array at NULL.

That optimisation can be suppressed by -fno-delete-null-pointer-check.

We currently have that globally on for GCC, but it's always irked me. I'd rather not turn it on globally for clang - I'd like to take the opportunity to restrict that flag to places where it's needed - basically this.

Unlike GCC, clang doesn't let you control that via pragma - otherwise I would have suggested sticking that into cmsis_nvic.c.

What I suggest instead is defining NVIC_FLASH_VECTOR_ADDRESS in the same way NVIC_RAM_VECTOR_ADDRESS is defined. Make it address a proper symbol in the linker map, rather than a zero (null) constant.

Then no compiler will be able to do any funny optimisations.

Or another possibility would be to avoid NULL by defining it as 0x00000040, incorporating the +16 offset, and doing vectors[IRQn]. That would produce optimal code.

@kjbracey
Copy link
Contributor

kjbracey commented May 7, 2019

I'm still not liking the volatile - even if it avoids the issue, it's not a properly targetted fix. There's no fundamental reason volatile should suppress that particular optimisation, so the problem could just come back. We need something that specifically deals with the null pointer constant handling.

@kjbracey
Copy link
Contributor

kjbracey commented May 7, 2019

And I'm a little narked at clang here. If the compiler is going to go to all the trouble of spotting that I'm deliberately dereferencing a NULL pointer, and starts optimising code to be an undefined instruction on that basis - optimising my undefined behaviour, it would be nice if it emitted just one ickle warning?

…n ARMC6

On ARMC6 with optimization level "-Os", the two functions NVIC_SetVector/NVIC_GetVector
will be translated to illegal instruction for trapping due to NVIC_FLASH_VECTOR_ADDRESS
defined as direct 0. Fixed by defining NVIC_FLASH_VECTOR_ADDRESS as a symbol instead to
avoid such optimization error.
@ccli8 ccli8 force-pushed the nuvoton_nano130_fix-vectab-virtual branch from cb1c4a5 to fcab482 Compare May 8, 2019 03:14
@ccli8
Copy link
Contributor Author

ccli8 commented May 8, 2019

@kjbracey-arm Many thanks for your detailed explanation. Finally I adopt your suggestion by defining NVIC_FLASH_VECTOR_ADDRESS as a symbol instead of a direct 0 to avoid such unwanted optimization.

Copy link
Contributor

@kjbracey kjbracey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but I'm not an expert on linker map syntax. Bonus marks for simplifying the ARM Compiler check.

Copy link
Contributor

@0xc0170 0xc0170 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!

@0xc0170
Copy link
Contributor

0xc0170 commented May 10, 2019

CI started

@mbed-ci
Copy link

mbed-ci commented May 10, 2019

Test run: SUCCESS

Summary: 11 of 11 test jobs passed
Build number : 1
Build artifacts

@0xc0170 0xc0170 merged commit 22d78b4 into ARMmbed:master May 13, 2019
@ccli8 ccli8 deleted the nuvoton_nano130_fix-vectab-virtual branch May 14, 2019 05:11
kjbracey added a commit to kjbracey/mbed-os that referenced this pull request Dec 4, 2019
For GCC we're being cautious by passing the
`-fno-delete-null-pointer-checks`. This option prevents some
optimisation opportunities, so removing it can reduce code size.

One particular optimisation loss occurs in `Callback` where a test
similar to this occurs:

    extern void myfunc();

    inline void foo(void (*fnptr)())
    {
        if (fnptr) {
            do A;
        } else {
            do B;
        }
    };

    foo(myfunc);

With `-fno-delete-null-pointer-checks`, the compiler does not assume
that `&myfunc` is non-null, and inserts the "null check" - seeing if the
address is 0. But performing that test of the address is incorrect
anyway - if myfunc actually could be at address 0, we'd still want to do
A.

Anyway, we do not have an equivalent option enabled for either Clang or
IAR, and we have performed clean-ups avoiding issues with
apparently-null vector tables in Clang already, for example ARMmbed#10534.

Therefore it should(TM) be safe to remove the option for GCC. We do not
have general data or code at address 0, only vectors are likely to be
there, so it does not make sense to be globally restricting code
generation for that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants