SPE: Fix up atomic usage #9617


Merged 1 commit on Feb 19, 2019

@@ -47,35 +47,30 @@ psa_handle_t psa_hndl_mgr_handle_create(psa_handle_manager_t *handle_mgr, void *
// Get active partition id - Needed for requester identification
spm_partition_t *curr_part_ptr = get_active_partition();
int32_t current_pid = ((curr_part_ptr != NULL) ? curr_part_ptr->partition_id : PSA_NSPE_IDENTIFIER);
uint32_t expected = UINT16_MAX;

// Avoid passing UINT16_MAX. Start again from 0 if reached.
// The reason for this is that we use the 16 upper bits to store the handle's index in the handles pool (for performance reasons)
core_util_atomic_cas_u32((uint32_t *)(&(handle_mgr->handle_generator)),
&expected,
PSA_HANDLE_MGR_INVALID_HANDLE
);

// Generate a new handle identifier
uint32_t tmp_handle = core_util_atomic_incr_u32(&(handle_mgr->handle_generator), 1);
uint32_t new_handle = PSA_HANDLE_MGR_INVALID_HANDLE;
uint32_t pool_ix = 0;
uint32_t tmp_handle;
do {
tmp_handle = core_util_atomic_incr_u16(&(handle_mgr->handle_generator), 1);
} while (tmp_handle == PSA_HANDLE_MGR_INVALID_HANDLE);
alzix (Contributor) commented on Feb 6, 2019:

PSA_HANDLE_MGR_INVALID_HANDLE is defined as a uint16_t. I think it is better to compare against PSA_NULL_HANDLE instead;
PSA_HANDLE_MGR_INVALID_HANDLE is only used for the static initialisation of handle_generator.

Suggested change
} while (tmp_handle == PSA_HANDLE_MGR_INVALID_HANDLE);
} while (tmp_handle == PSA_NULL_HANDLE);

PR author (Contributor):

Yes, this one was debatable - all those constants are so coupled in their assumptions that it's a bit hard to tell which to use. I went for PSA_HANDLE_MGR_INVALID_HANDLE because it is the direct 16-bit value of the handle_generator, despite having just popped it into a uint32_t ready for composition.

I think in principle PSA_NULL_HANDLE doesn't have to be PSA_HANDLE_MGR_INVALID_HANDLE zero-extended - if it weren't, my form would be correct. (It might have been simpler if invalidity were represented by a too-big value in the slot indicator rather than a special generator count value, but having the null value be 0 is probably convenient.)
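
For readers following the bit-packing discussion, here is a minimal sketch of the layout in question, in C. The helper names and the local HANDLE_INDEX_POS constant are illustrative only (the real code uses PSA_HANDLE_MGR_HANDLE_INDEX_POS, as seen in the diff); this is not a copy of the implementation.

    #include <stdint.h>

    #define HANDLE_INDEX_POS 16u   // illustrative stand-in for PSA_HANDLE_MGR_HANDLE_INDEX_POS

    // The generator supplies the low 16 bits; the pool slot index occupies the high 16 bits.
    static inline int32_t compose_handle(uint32_t pool_ix, uint16_t generated)
    {
        return (int32_t)((pool_ix << HANDLE_INDEX_POS) | generated);
    }

    // The "performance reason": the owning pool slot is recovered straight from the
    // handle value, with no search through the handles pool.
    static inline uint32_t handle_to_pool_index(int32_t handle)
    {
        return (uint32_t)handle >> HANDLE_INDEX_POS;
    }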

Contributor:

PSA_NULL_HANDLE is part of the PSA FF spec. Its purpose is to simplify the programming model: for example, it allows us to call psa_close() regardless of psa_connect() status, which simplifies error-handling logic.
PSA_HANDLE_MGR_INVALID_HANDLE is zero because it statically initialises our handle_generator.
The connection between the two is rather coincidental.
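
A hedged illustration of that programming-model point in C, assuming (as the comment implies) that a failed psa_connect() leaves the caller holding PSA_NULL_HANDLE and that psa_close(PSA_NULL_HANDLE) is a no-op. The header path, SID and version constants below are placeholders, not values from this tree.

    #include "psa/client.h"          // assumed location of the PSA client API (psa_connect/psa_close)

    #define EXAMPLE_SERVICE_SID      0x00000100u   // hypothetical service ID
    #define EXAMPLE_SERVICE_VERSION  1u            // hypothetical minor version

    void example_client(void)
    {
        psa_handle_t handle = psa_connect(EXAMPLE_SERVICE_SID, EXAMPLE_SERVICE_VERSION);

        if (handle != PSA_NULL_HANDLE) {
            // ... psa_call(handle, ...) as required ...
        }

        // The simplification being described: no need to remember whether the
        // connect succeeded, because closing a null handle is expected to be harmless.
        psa_close(handle);
    }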

psa_handle_t new_handle = PSA_NULL_HANDLE;

// Look for a vacant space in handles pool for the generated handle
for (pool_ix = 0; pool_ix < handle_mgr->pool_size; pool_ix++) {
for (uint32_t pool_ix = 0; pool_ix < handle_mgr->pool_size; pool_ix++) {

expected = PSA_HANDLE_MGR_INVALID_HANDLE;
psa_handle_t expected = PSA_NULL_HANDLE;

// Write the handles pool index in the upper 16 bits of the handle
new_handle = ((pool_ix << PSA_HANDLE_MGR_HANDLE_INDEX_POS) | tmp_handle);
psa_handle_t desired_handle = ((pool_ix << PSA_HANDLE_MGR_HANDLE_INDEX_POS) | tmp_handle);

// Store the generated handle in the handles pool
if (core_util_atomic_cas_u32((uint32_t *)(&(handle_mgr->handles_pool[pool_ix].handle)),
if (core_util_atomic_cas_s32(&(handle_mgr->handles_pool[pool_ix].handle),
&expected,
new_handle
desired_handle
)) {

// Handle is successfully stored in handles pool
new_handle = desired_handle;

// Store the handle memory in the handles pool, "coupled" with the stored handle
handle_mgr->handles_pool[pool_ix].handle_mem = handle_mem;
@@ -90,7 +85,7 @@ psa_handle_t psa_hndl_mgr_handle_create(psa_handle_manager_t *handle_mgr, void *

// Handle creation should only occur after a successful memory allocation
// and is not expected to fail.
SPM_ASSERT(pool_ix != handle_mgr->pool_size);
SPM_ASSERT(new_handle != PSA_NULL_HANDLE);

return new_handle;
}
@@ -123,9 +118,9 @@ void psa_hndl_mgr_handle_destroy(psa_handle_manager_t *handle_mgr, psa_handle_t
SPM_PANIC("[ERROR] Request for destroy by non-owner or friend!\n");
}

handle_mgr->handles_pool[pool_ix].handle = PSA_NULL_HANDLE;
handle_mgr->handles_pool[pool_ix].handle_owner = PSA_HANDLE_MGR_INVALID_FRIEND_OWNER;
handle_mgr->handles_pool[pool_ix].handle_friend = PSA_HANDLE_MGR_INVALID_FRIEND_OWNER;
core_util_atomic_store_s32(&(handle_mgr->handles_pool[pool_ix].handle), PSA_NULL_HANDLE);
Contributor:

I do agree that .handle_owner should be set last, as that is what returns the handle to the pool, and your change actually fixes a bug in our implementation. Great catch!
But IMO the atomic store is redundant here: .handle is an aligned 32-bit value, so a single bus transaction is sufficient for storing it.
The code here is written on the assumption that a handle is owned as long as it is not set to PSA_NULL_HANDLE, so only the owner will be accessing it. If that assumption is wrong we need to rewrite the whole thing; otherwise a single store instruction is sufficient.

PR author (Contributor):

The handle must be written last, and only the atomic store will guarantee that - it has memory-ordering semantics. Without the atomic store, the compiler is free to reorder those 3 stores, so it might not perform the handle write last.
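
Concretely, the destroy path is expected to end up along these lines (a sketch of the intent inside psa_hndl_mgr_handle_destroy(), mirroring the diff above rather than quoting it verbatim):

    // Reset the slot's metadata first; these two plain stores may be reordered
    // among themselves, which is harmless.
    handle_mgr->handles_pool[pool_ix].handle_owner  = PSA_HANDLE_MGR_INVALID_FRIEND_OWNER;
    handle_mgr->handles_pool[pool_ix].handle_friend = PSA_HANDLE_MGR_INVALID_FRIEND_OWNER;

    // The atomic store carries ordering semantics, so the compiler cannot hoist it
    // above the metadata writes: the slot only becomes claimable again once its
    // owner/friend fields have been invalidated.
    core_util_atomic_store_s32(&(handle_mgr->handles_pool[pool_ix].handle), PSA_NULL_HANDLE);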

Contributor:

Thanks for explaining it. You are correct!

PR author (Contributor):

Aside from the memory ordering, which is the most likely practical issue, there's the formal issue that doing a non-atomic write to NULL at the same time as a cas reads it is undefined behaviour. In theory there could be tearing effects. (On a 16-bit CPU this 32-bit handle would tear; on ARM a 64-bit variable would.)

Plain reads and writes are only permitted if there is no possibility of a write and a read happening at the same time - either because they are ordered against other synchronised accesses, as handle_friend here is ordered by the atomic handle accesses, or by mutexes and the like.

}


@@ -55,7 +55,7 @@ extern "C" {

/* ------------------------------------ Definitions ---------------------------------- */

#define PSA_HANDLE_MGR_INVALID_HANDLE ((uint32_t)PSA_NULL_HANDLE)
#define PSA_HANDLE_MGR_INVALID_HANDLE ((uint16_t)PSA_NULL_HANDLE)

#define PSA_HANDLE_MGR_INVALID_FRIEND_OWNER 0 // Denoting invalid friend or invalid owner

@@ -80,8 +80,10 @@ typedef struct psa_handle_item_t {

typedef struct psa_handle_manager_t {

uint32_t handle_generator; /* A counter supplying handle numbers. */
uint32_t pool_size; /* The maximum number of handles that pool can contain. */
// Handle generator uses only 16 bits, and wraps.
// The reason for this is that we use the 16 upper bits to store the handle's index in the handles pool (for performance reasons)
uint16_t handle_generator; /* A counter supplying handle numbers. */
uint16_t pool_size; /* The maximum number of handles that pool can contain. */
Contributor:

I would really prefer to have uint32_t for a size.
It will waste 4 bytes of RAM, but I will sleep better the next time this code is modified or maintained.
Alternatively, please add a cast in the for-loop comparison.

PR author (Contributor):

There's a fundamental assumption throughout all the shifting and masking that these are both 16 bits, so making very sure that we never have more than 16 bits in the state variables seemed like a good plan in itself, aside from the memory saving.

A cast isn't necessary - there's no problem comparing types of different widths but the same signedness - but the loop could certainly be changed to use uint16_t.

PR author (Contributor):

Ah, no: pool_ix needs to stay 32-bit, or else be cast before the left shift. Bleurgh.
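
The promotion trap in a standalone sketch (make_handle() is illustrative, not from the file): a uint16_t operand is promoted to signed int before the shift, so shifting an index with bit 15 set left by 16 would overflow int, which is undefined behaviour. Widening first avoids it.

    #include <stdint.h>

    static int32_t make_handle(uint16_t pool_ix, uint16_t generated)
    {
        // Wrong:  (pool_ix << 16) shifts the promoted signed int, so an index with
        //         bit 15 set would shift into/past the sign bit - undefined behaviour.
        // Right:  widen to uint32_t before shifting, then compose as unsigned.
        return (int32_t)(((uint32_t)pool_ix << 16) | generated);
    }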

psa_handle_item_t *handles_pool; /* Holds couples of handles and their memory "blocks". */

} psa_handle_manager_t;
components/TARGET_PSA/TARGET_MBED_SPM/COMPONENT_SPE/spm_common.c (8 additions, 8 deletions)
@@ -16,12 +16,11 @@
*/

#include "cmsis_os2.h"
#include "mbed_critical.h"
#include "psa_defs.h"
#include "spm_internal.h"
#include "spm_panic.h"

bool core_util_atomic_cas_u8(volatile uint8_t *ptr, uint8_t *expectedCurrentValue, uint8_t desiredValue);

inline void validate_iovec(
const void *in_vec,
const uint32_t in_len,
@@ -42,18 +41,19 @@ inline void validate_iovec(

inline void channel_state_switch(uint8_t *current_state, uint8_t expected_state, uint8_t new_state)
{
uint8_t backup_expected = expected_state;
if (!core_util_atomic_cas_u8(current_state, &expected_state, new_state)) {
uint8_t actual_state = expected_state;
if (!core_util_atomic_cas_u8(current_state, &actual_state, new_state)) {
SPM_PANIC("channel in incorrect processing state: %d while %d is expected!\n",
expected_state, backup_expected);
actual_state, expected_state);
}
}

inline void channel_state_assert(uint8_t *current_state, uint8_t expected_state)
inline void channel_state_assert(const uint8_t *current_state, uint8_t expected_state)
{
if (*current_state != expected_state) {
uint8_t actual_state = core_util_atomic_load_u8(current_state);
Contributor:

I'm not convinced atomic load is necessary here.

PR author (Contributor):

It would only be unnecessary if you knew for certain that no other thread could be modifying it. If you knew you were the "owner" because of a previous cas, so that you were in a state no-one else would modify, you'd be fine.

But as this is an assert, a lot of the assertable cases would make a plain load undefined behaviour. Using the atomic load gets you to the assert in a defined way in the event of failure.

I don't actually know whether the atomics are needed at all here - I guess it's a sort of mechanism where exactly one thread can transition from IDLE to ACTIVE, and then it gets control?
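
For what it's worth, a sketch in C of the kind of hand-off being guessed at here; the state names and helper are hypothetical, not the real SPM channel states:

    #include <stdbool.h>
    #include <stdint.h>
    #include "mbed_critical.h"      // for core_util_atomic_cas_u8()

    // Hypothetical states - not the real SPM channel state values.
    #define EXAMPLE_CHANNEL_IDLE    ((uint8_t)1)
    #define EXAMPLE_CHANNEL_ACTIVE  ((uint8_t)2)

    // Returns true if this caller won the IDLE -> ACTIVE transition and now owns
    // the channel; every other concurrent caller sees the CAS fail and stays away.
    static bool example_try_claim(uint8_t *state)
    {
        uint8_t expected = EXAMPLE_CHANNEL_IDLE;
        return core_util_atomic_cas_u8(state, &expected, EXAMPLE_CHANNEL_ACTIVE);
    }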

PR author (Contributor):

Note that the atomic load isn't actually particularly expensive on current Mbed OS platforms - it will just be inlined as a normal LDRB, with the compiler being a bit stricter about ordering.

Contributor:

Atomic operations at handle allocation are needed for sure.
The state change is also required to be atomic, since it is changed in SPM context and asserted in secure-partition context.
As for the atomic write in the destroy-handle operation, you have convinced me it is needed to stop the compiler from reordering things.
The atomic read still looks redundant to me; perhaps it only matters when writes are buffered...
Better safe than sorry :)

PR author (Contributor):

> The state change is also required to be atomic, since it is changed in SPM context and asserted in secure-partition context.

So those are different threads, not just the other side of the security barrier in the same thread? If they are different threads, then the non-atomic read is undefined behaviour.

I think you can construct a convincing argument why the undefined behaviour won't cause a problem in this case, based on assumptions about the platform and how limited the compiler's hostility is, but compilers really are getting quite mean about exploiting undefined behaviour these days, so I really don't want to go anywhere near it. Making the code legal doesn't cost much, and serves as a documentation reminder. Plus atomics are hard enough to reason about as it is without trying to do "unsynchronised atomics".

Even if you never see a problem in practice on current hardware, one day some mean person is going to run you against a testing tool like this which would chew you straight out for that unsynchronised load.

Contributor:

ok

if (actual_state != expected_state) {
SPM_PANIC("channel in incorrect processing state: %d while %d is expected!\n",
*current_state, expected_state);
actual_state, expected_state);
}
}

@@ -242,7 +242,7 @@ void channel_state_switch(uint8_t *current_state, uint8_t expected_state, uint8_
* @param[in] current_state - current state
* @param[in] expected_state - expected state
*/
void channel_state_assert(uint8_t *current_state, uint8_t expected_state);
void channel_state_assert(const uint8_t *current_state, uint8_t expected_state);

#ifdef __cplusplus
}