Skip to content

Patch for an Uart issue in ISR #5446

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 33 additions & 2 deletions drivers/Serial.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
*/
#include "drivers/Serial.h"
#include "platform/mbed_wait_api.h"
#include "platform/mbed_critical.h"

#if DEVICE_SERIAL

Expand All @@ -36,12 +37,42 @@ int Serial::_putc(int c) {
return _base_putc(c);
}

int Serial::getc() {
// Mutex is already held
if(!core_util_is_isr_active())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this codeblock achieving - getting getc/putc to be accesible from interrupts? Can you provide a bit more details (a reason to change this) - how Stream::getc() helps here with invokation if it's interrupt active?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the risk of behavior inconsistency I mentioned in the PR description - I'm adding these because in the original code, when it is in ISR context, calling putc()/getc() in Stream class raises this error:
Error - reading from a file in an ISR or critical section**
So my change here is to call _base_putc/_base_getc in ISR context.

{
return Stream::getc();
}
else
{
return _base_getc();
}
}

int Serial::putc(int c) {
// Mutex is already held
if(!core_util_is_isr_active())
{
return Stream::putc(c);
}
else
{
return _base_putc(c);
}
}

void Serial::lock() {
_mutex.lock();
if(!core_util_is_isr_active())
{
_mutex.lock();
}
}

void Serial::unlock() {
_mutex.unlock();
if(!core_util_is_isr_active())
{
_mutex.unlock();
}
}

} // namespace mbed
Expand Down
2 changes: 2 additions & 0 deletions drivers/Serial.h
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,8 @@ class Serial : public SerialBase, public Stream, private NonCopyable<Serial> {
{
return SerialBase::writeable();
}
int getc();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we removing virtual for these methods? (Hint: use commit message to contain details for changes, in this case it would be a problem with having virtual getc/putc and how this change fixes the problem)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the getc/putc in Serial class inherits from Stream class, and in Stream class, putc/getc are not virtual functions, so that's why I removed the virtual here. in Serial class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@0xc0170 - he's not removing virtual from the existing code, he's changing his earlier commit, which added a getc() to Serial to override the non-virtual method it inherits from Stream.

But that lack of virtual in Stream is why this won't work properly. The resulting FileHandle/Stream will not use this getc() unless you're accessing it through a Serial pointer.

And on top of that, even if it is called, you've created a mutex-locked class that doesn't lock reliably - it has a mutex it will claim when called from thread, but that won't stop it being called from interrupt. It's effectively "claim the lock, but only in certain contexts" - not really a lock any more.

If you really want to create an "non-mutex, callable from interrupt" version of any of these HAL classes, I believe the standard pattern is to derive from them, and override their virtual lock/unlock methods to no-ops (or enter critical section, maybe). So derive an UnlockedSerial from Serial, if that's what you really want - I think that should work in your case.

In the case of Serial though, you can instead use RawSerial, which has null lock by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deriving an UnlockedSerial form Serial sounds just another version of switching to RawSerial and UARTSerial. @sg- Any suggestions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deriving an UnlockedSerial form Serial sounds just another version of switching to RawSerial

Well, yes. And actually it would be trickier than I was thinking, as it's not just the lock in the HAL class, like the other HAL classes, it's the fact that all the calls use the C library. Serial is designed as a C stream-based class.

Serial::putc is supposed to be "call fputc on my FileHandle", and as the C library is not callable from interrupt, Serial::putc cannot be used from interrupt. Similarly for Serial::scanf, etc.

If you want something that gives you simple getc and putc that are interrupt-safe, not using the C library, then that is RawSerial. As its header says:

/** A serial port (UART) for communication with other serial devices
 * This is a variation of the Serial class that doesn't use streams,
 * thus making it safe to use in interrupt handlers with the RTOS.
 *

Your patch is an incomplete attempt to make Serial act like RawSerial if called from interrupt - "don't claim the mutex, and don't use the C library", but it only covers the API that RawSerial has. If that's all you're using, you may as well just use RawSerial, rather than attempt to bend Serial.

An UnlockedSerial would sort of work - it would let you access the FileHandle API, which would be useable from interrupt. You'd just have to avoid the C library-based API.

class UnlockedSerial : public Serial {
      // override lock/unlock with empty functions
}

UnlockedSerial serial;
// get past Stream's bizarre protection of its FileHandle base -
// we can't call serial.read() directly
FileHandle &fh =serial;
n = fh.read(buffer, 1);  // only reading 1 is ever safe, if from IRQ - have to precheck that 1 with serial.readable().

In general FileHandles are not expected to be useable from interrupt, but it happens that Stream's implementation is, given an appropriate lock.

Not sure what that would really gain you though - it's just a very complicated way to keep using two horrible classes instead of the sensible RawSerial and UARTSerial though. Stream's broken FileHandle isn't very useful - eg the fact you have to read 1 above.

int putc(int c);

protected:
virtual int _getc();
Expand Down
6 changes: 4 additions & 2 deletions drivers/SerialBase.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -55,15 +55,17 @@ void SerialBase::format(int bits, Parity parity, int stop_bits) {

int SerialBase::readable() {
lock();
int ret = serial_readable(&_serial);
int ret = 0;
ret = serial_readable(&_serial);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. SerialBase.cpp: the change of separating initialization of 'ret' is to avoid the compiler optimizing it. It is observed that on ublox EVK NINA-B1, 'ret' is always returning false as it has been optimized.

I fail to see how this is different - how this can produce a different output by a compiler. Are there any optimization that you are seeing with debug/release here? I recall from your report that return value is set to false in the release build.

The output should be the same as in the current form. Isn't the problem in the below layer - HAL ? How serial_readable is implemented for your target that you are using? @

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the serial_readable() on this platform actually references a certain register, which may look 'constant' to the compiler in O3 optimization profile in my experience.
I've tried three ways: (all with release profile)

  1. original -> serial_readable() always returns false
  2. separating initialization (ret = 0; ret = serial_readable(), this is trying to tell the compiler that the value of this variable may change) -> works
  3. adding volatile to ret in the original code -> works
    I didn't disassembly the binary to see the actual assembly code, but the three binaries actually differs: the (2) and (3) are pretty close, but (2) and (3) are far different from (1)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which compiler is this? I wasn't aware we had any compiler profile in use that would do cross-compilation-unit optimisations - so as serial_readable itself isn't inline, if it had a problem with a missing volatile or something, I wouldn't have expected anything you do in a caller to affect such a problem. But apparently it is...?

And if there was a platform problem, the same bug would be affecting other code - this class isn't the only place serial_readable is used.

Looking at the platform code (Nina B1->NRF52->SDK11), I don't see an obvious flaw:

__STATIC_INLINE bool nrf_uart_event_check(NRF_UART_Type * p_reg, nrf_uart_event_t event)
{
    return (bool)*(volatile uint32_t *)((uint8_t *)p_reg + (uint32_t)event);
}

int serial_readable(serial_t *obj)
{
(void)obj;
#if DEVICE_SERIAL_ASYNCH
    if (UART_CB.rx_active) {
        return 0;
    }
#endif
    return (nrf_uart_event_check(UART_INSTANCE, NRF_UART_EVENT_RXDRDY));
}

NRF_UART_Type has is a struct with a bunch of volatile fields, it gets cast to non-volatile uint8_t *, but back to volatile uint32_t * before dereferencing. Should be okay.

Only thing I'd say is "hmm, only one serial port supported at once. Really?" Any chance you've somehow got some interference - two serial ports being accessed? Maybe stdin/stdout messing with it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I'm using GCC ARM 6-2017-q2 update, and I don't see other messages running. I doubted the same thing before I get the ublox board as you commented that there are a bunch of volatile fields in the low level implementation which looked okay, but when I get the board and gave it a try, adding volatile or touching the ret variable in the caller actually helps.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this really were to be a compiler bug, which I don't think we've concluded yet, I would want to see it fixed by a pragma lowering optimisation in serial_readable or maybe nrf_uart_event_check - or maybe even a global toolchain option change.

This sort of "spelling change that really shouldn't cause a difference" in the caller doesn't really cut it as a workaround - it's too inexplicable why it should work, and could easily be accidentally "unfixed" later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as syntax of the pragma to change the optimization level varies among compilers, is there a suggested way to do that?
btw, confirming if this is actually a compiler bug requires a look at the assembly code, is there a way to do that without hardware debugger?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soleilplanet -O3 is not used in any of our profiles.

I've isolated the readable member function and the code generated with GCC 6.3 seems fine.

  1. original -> serial_readable() always returns false
  2. separating initialization (ret = 0; ret = serial_readable(), this is trying to tell the compiler that the value of this variable may change) -> works
  3. adding volatile to ret in the original code -> works

(1) and (2) should not be different; ret and is address is not used while 0 is assigned to it therefore, the 0 assignment can be skipped. Of course (3) is different.

btw, confirming if this is actually a compiler bug requires a look at the assembly code, is there a way to do that without hardware debugger?

You can use Objdump to dump the code of the function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to precise gcc profiles since it is what @soleilplanet use apparently.

@soleilplanet Have you test with other compilers and reproduced the issue ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pan- I don't see this problem with GCC ARM 5.4.1 on ubuntu either, this happens on Windows environment.
For the disassembly, I just realized that there is a symbol map file generated, I'll check that with objdump.

unlock();
return ret;
}


int SerialBase::writeable() {
lock();
int ret = serial_writable(&_serial);
int ret = 0;
ret = serial_writable(&_serial);
unlock();
return ret;
}
Expand Down