Skip to content

Nuvoton: Fix mbedtls crypto ECC AC management failed #8985

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

ccli8
Copy link
Contributor

@ccli8 ccli8 commented Dec 6, 2018

Description

In mbed-os 5.11, Mbed TLS doesn't guarantee mbedtls_internal_ecp_init()/mbedtls_internal_ecp_free()
are paired. When they are not paired, system would hang in Nuvoton's ECC AC management. This PR majorly tries to release the limit by narrwoing ECC AC open period to just real ECC AC operation. With this modification, pairing mbedtls_internal_ecp_init()/mbedtls_internal_ecp_free() is not required, for example, multiple mbedtls_internal_ecp_init() and finally one mbedtls_internal_ecp_free() is allowed.

Related target

  • NUMAKER_PFM_NUC472
  • NUMAKER_PFM_M487/NUMAKER_IOT_M487

Related issue

#8927

Pull request type

[X] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

ccli8 added 3 commits December 5, 2018 17:57
Mbed TLS doesn't guarantee mbedtls_internal_ecp_init()/mbedtls_internal_ecp_free()
are paired. To avoid multiple operations to the same ECC accelerator simultaneously,
we narrow ECC accelerator open period to just real ECC accelerator operation in
internal_run_eccop()/internal_run_modop().
@ciarmcom ciarmcom requested review from a team December 6, 2018 10:00
@ciarmcom
Copy link
Member

ciarmcom commented Dec 6, 2018

@ccli8, thank you for your changes.
@ARMmbed/mbed-os-crypto @ARMmbed/mbed-os-maintainers please review.

@0xc0170
Copy link
Contributor

0xc0170 commented Dec 6, 2018

@ccli8 Is this fixing the issue #8927 completely or there are still failures reported?

MBEDTLS_MPI_CHK(ecc_done ? 0 : -1);


MBEDTLS_MPI_CHK(ecc_done ? 0 : MBEDTLS_ERR_SSL_HW_ACCEL_FAILED);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MBEDTLS_ERR_SSL_HW_ACCEL_FAILED is for hardware accelerators that do TLS record processing, could we please return MBEDTLS_ERR_PLATFORM_HW_ACCEL_FAILED instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yanesca I fix it in 406d480.

MBEDTLS_MPI_CHK(ecc_done ? 0 : -1);


MBEDTLS_MPI_CHK(ecc_done ? 0 : MBEDTLS_ERR_SSL_HW_ACCEL_FAILED);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MBEDTLS_ERR_SSL_HW_ACCEL_FAILED is for hardware accelerators that do TLS record processing, could we please return MBEDTLS_ERR_PLATFORM_HW_ACCEL_FAILED instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yanesca As above

…ED when ECC H/W acceleratioin is failed

MBEDTLS_ERR_SSL_HW_ACCEL_FAILED is for hardware accelerators that do TLS record processing.
Replace it with MBEDTLS_ERR_PLATFORM_HW_ACCEL_FAILED when ECC H/W acceleratioin is failed.
@ccli8
Copy link
Contributor Author

ccli8 commented Dec 7, 2018

Is this fixing the issue #8927 completely or there are still failures reported?

@0xc0170 No. This PR just fixes tls connection failure (simple-mbed-cloud-client-tests-dev_mgmt-connect/Initialize Simple PDMC or simple-mbed-cloud-client-tests-dev_mgmt-update/Initialize Simple PDMC) on mbed-os 5.11 rc or master branch. Other PDMC failures are under checking.

@kjbracey
Copy link
Contributor

kjbracey commented Dec 7, 2018

In mbed-os 5.11, Mbed TLS doesn't guarantee mbedtls_internal_ecp_init()/mbedtls_internal_ecp_free()
are paired.

Is that by design, or an error? It to me like the intent is that they be paired, and by code inspection they seem to be for a single operation.

Is this a thread-safety issue? If it is, then taking a mutex in init and releasing it in free might be the way to go?

Looking at the patch, you are implementing your own locking on the internal op. That locking is unpleasant because it isn't using a mutex, just busy-waiting. That busy-waiting will fail the instant anyone uses different thread priorities. It would be better to use a mutex, but then that may be a bit heavyweight if it's in an inner loop. (Not sure if it is).

Given that contention on the ECP unit is likely to be low, I think you really would be best off just taking a mutex in init, and releasing it in free, minimising the overhead.

@yanesca
Copy link
Contributor

yanesca commented Dec 7, 2018

Is that by design, or an error?

It is by design. The calls are not necessarily paired. If there are more than one mbedtls_ecp_point context in the application (or even in Mbed TLS itself) their lifetime can overlap in any way. Therefore for example calling init on both before calling free on either is perfectly acceptable.

@kjbracey
Copy link
Contributor

kjbracey commented Dec 7, 2018

Looking at the code, the mbedtls_internal_ecp_init/free calls being discussed here are made as a pair within a single operation (mbedtls_ecp_mul_restartable or mbedtls_ecp_muladd_restartable).

That's distinct from the mbedtls_ecp_point_init/free which are indeed arbitary overlapping lifetime.

@kjbracey
Copy link
Contributor

kjbracey commented Dec 7, 2018

The config file description for MBEDTLS_ECP_INTERNAL_ALT says:

The functions mbedtls_internal_ecp_init and mbedtls_internal_ecp_deinit are called before and after each point operation and provide an opportunity to implement optimized set up and tear down instructions.

(Someone has apparently renamed deinit to free without updating the comment. Possibly using init/free is confusing, because it does make it look like an arbitrary lifetime thing like a point or context).

@yanesca
Copy link
Contributor

yanesca commented Dec 7, 2018

Indeed these are completely different functions than the ones that can overlap. These should be called in pairs and as far as I can tell they are.

@yanesca
Copy link
Contributor

yanesca commented Dec 7, 2018

I have found an issue that might be the reason for this. I'll put up a PR with the fix soon.

@kjbracey
Copy link
Contributor

kjbracey commented Dec 7, 2018

Rereading the patch more closely, I see that the busy-wait loop was already there, and does follow the locking pattern I was suggesting. If there is indeed an mbed TLS issue, then you may not need to move it now, but I'd still like to see it turned into a real mutex as soon as possible.

That loop is going to bite us at some point - I know some of the PDMC code likes using different thread priorities, and it's just asking for deadlock.

@yanesca
Copy link
Contributor

yanesca commented Dec 7, 2018

I have raised a PR with the fix: #9005

@ccli8 Can you please test if it resolves the issue you are having?

@ccli8
Copy link
Contributor Author

ccli8 commented Dec 10, 2018

@yanesca The ECC double initialization issue is produced in PDMC Greentea test. After applying #9005, the issue disappears.

@kjbracey-arm This PR will close because #9005 has fixed the issue. I will follow your suggestion and raise another PR which will replace busy-wait with mutex.

@ccli8
Copy link
Contributor Author

ccli8 commented Dec 11, 2018

@kjbracey-arm I am replacing busy-wait loop with mutex, but fail to pass PDMC Greentest. The error scenario is that for the same mbedtls_sha256_context, mbedtls_sha256_init() is called in one thread but mbedtls_sha256_free() is called in another thread, so I meet error in unlock mutex in mbedtls_sha256_free(). It seems reasonable that mbedtls_sha256_init()/mbedtls_sha256_free() for the same mbedtls_sha256_context can be called in different threads. Note Nuvoton's SHA accelerator doesn't support context save & restore, so I need to lock mutex for the whole lifetime of ctx and cannot for just partial SHA computing. Or I can just roll back to busy-wait loop?

void mbedtls_sha256_init(mbedtls_sha256_context *ctx)
{
    /* Try locking mutex. On success, go H/W SHA, else go S/W SHA. */
}

void mbedtls_sha256_free(mbedtls_sha256_context *ctx)
{
    /* If H/W SHA, unlock mutex */
}

@kjbracey
Copy link
Contributor

If you really can't context save and restore, rather than use a mutex, you could use a semaphore. However, if you wait for the semaphore, that opens you up to priority inversion problems - a little better than the busy wait loop, but can still deadlock.

Another alternative is a separate task that picks up SHA operations from a queue. Gets rid of deadlock. But...

Regardless of mechanism, if the hardware is exclusively held for the entire duration of a SHA context, that could cause significant scheduling problems. A long-running normal priority SHA operation could totally lock out some high-priority time-critical small SHA from a high-priority thread. That's conceptually the sort of thing that would kill WiSUN, say - unable to send a secure ack because someone is encrypting a big packet payload to send. (We hit that bug in Nanostack - failed because had only 1 context. Added second context to make it work).

So I think the solution may have to be that you use a semaphore, but don't wait, and if the semaphore is not available, use a software fallback.

But then if you never wait for the semaphore, then it doesn't have to be an actual RTOS semaphore - a simple flag like you have now would do.

@ccli8 ccli8 deleted the nuvoton_fix_crypto_ecc branch December 20, 2018 06:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants