Skip to content

Commit 05ac693

Browse files
committed
esp32: Fix hang in taskYIELD() on riscv CPUs when IRQs disabled.
Regression introduced in 337742f. The hang occurs because the esp32 port was calling "from ISR" port-layer functions to set/clear the interrupt mask. FreeRTOS kernel therefore doesn't know the CPU is in a critical section. In taskYIELD() the riscv port layer blocks after yielding until it knows the yield has happened, and would block indefinitely if IRQs are disabled (until INT WDT triggers). Moving to the "public" portENTER_CRITICAL/portEXIT_CRITICAL API means that FreeRTOS knows we're in a critical section and can react accordingly. Adds a regression test for this case (should be safe to run on all ports). On single core CPUs, this should result in almost exactly the same behaviour apart from fixing this case. On dual core CPUs, we now have cross-CPU mutual exclusion for atomic sections. This also shouldn't change anything, mostly because all the code which enters an atomic section runs on the same CPU. If it does change something, it will be to fix a thread safety bug. There is some risk that this change triggers a FreeRTOS crash where there is a call to a blocking FreeRTOS API with interrupts disabled. Previously this code might have worked, but was probably thread unsafe and would have hung in some circumstances. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <[email protected]>
1 parent 197becb commit 05ac693

File tree

4 files changed

+48
-6
lines changed

4 files changed

+48
-6
lines changed

ports/esp32/mphalport.c

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -57,6 +57,8 @@ TaskHandle_t mp_main_task_handle;
5757
static uint8_t stdin_ringbuf_array[260];
5858
ringbuf_t stdin_ringbuf = {stdin_ringbuf_array, sizeof(stdin_ringbuf_array), 0, 0};
5959

60+
portMUX_TYPE mp_atomic_mux = portMUX_INITIALIZER_UNLOCKED;
61+
6062
// Check the ESP-IDF error code and raise an OSError if it's not ESP_OK.
6163
#if MICROPY_ERROR_REPORTING <= MICROPY_ERROR_REPORTING_NORMAL
6264
void check_esp_err_(esp_err_t code)

ports/esp32/mphalport.h

Lines changed: 17 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -54,6 +54,8 @@ extern TaskHandle_t mp_main_task_handle;
5454

5555
extern ringbuf_t stdin_ringbuf;
5656

57+
extern portMUX_TYPE mp_atomic_mux;
58+
5759
// Check the ESP-IDF error code and raise an OSError if it's not ESP_OK.
5860
#if MICROPY_ERROR_REPORTING <= MICROPY_ERROR_REPORTING_NORMAL
5961
#define check_esp_err(code) check_esp_err_(code)
@@ -63,12 +65,21 @@ void check_esp_err_(esp_err_t code);
6365
void check_esp_err_(esp_err_t code, const char *func, const int line, const char *file);
6466
#endif
6567

66-
// Note: these "critical nested" macros do not ensure cross-CPU exclusion,
67-
// the only disable interrupts on the current CPU. To full manage exclusion
68-
// one should use portENTER_CRITICAL/portEXIT_CRITICAL instead.
69-
#include "freertos/FreeRTOS.h"
70-
#define MICROPY_BEGIN_ATOMIC_SECTION() portSET_INTERRUPT_MASK_FROM_ISR()
71-
#define MICROPY_END_ATOMIC_SECTION(state) portCLEAR_INTERRUPT_MASK_FROM_ISR(state)
68+
static inline mp_uint_t mp_begin_atomic_section(void) {
69+
portENTER_CRITICAL(&mp_atomic_mux);
70+
return 0;
71+
}
72+
73+
static inline void mp_end_atomic_section(mp_uint_t state) {
74+
(void)state;
75+
portEXIT_CRITICAL(&mp_atomic_mux);
76+
}
77+
78+
// Note: These atomic macros disable interrupts on the calling CPU, and on SMP
79+
// systems also protect against concurrent access to an atomic section on the
80+
// other CPU.
81+
#define MICROPY_BEGIN_ATOMIC_SECTION() mp_begin_atomic_section()
82+
#define MICROPY_END_ATOMIC_SECTION(state) mp_end_atomic_section(state)
7283

7384
uint32_t mp_hal_ticks_us(void);
7485
__attribute__((always_inline)) static inline uint32_t mp_hal_ticks_cpu(void) {

tests/extmod/machine_disable_irq.py

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,26 @@
1+
# Test executing Python code while IRQs disabled.
2+
try:
3+
from machine import disable_irq, enable_irq
4+
from time import ticks_us
5+
except ImportError:
6+
print("SKIP")
7+
raise SystemExit
8+
9+
10+
# Structured to also nest disable_irq() calls
11+
def f(n):
12+
st_irq = disable_irq()
13+
if n:
14+
f(n - 1)
15+
else:
16+
# busy-wait in a tight loop for 1ms, to simulate doing some work in a critical section
17+
# (can't wait for time_us() to increment as not all ports will "tick" with interrupts off.)
18+
for _ in range(100):
19+
ticks_us()
20+
enable_irq(st_irq)
21+
22+
23+
for nest in range(3):
24+
print(nest)
25+
for _ in range(5):
26+
f(nest)
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
0
2+
1
3+
2

0 commit comments

Comments
 (0)