Skip to content

Fix hardware flow control on NRF52 series #8292

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 26, 2018
Merged

Fix hardware flow control on NRF52 series #8292

merged 2 commits into from
Oct 26, 2018

Conversation

marcuschangarm
Copy link
Contributor

@marcuschangarm marcuschangarm commented Oct 2, 2018

Description

Due to buggy flow control logic in the UARTE, the stop signal is not being set as it is supposed to when the the module is not ready to receive data.

This commit signals the sender to halt transmitting when a DMA buffer is full and only continue again when the atomic FIFO buffer has been emptied. This allows platforms with hardware flow control to minimize all buffers and rely on flow control instead.

Pull request type

[x] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Breaking change

@marcuschangarm
Copy link
Contributor Author

@naveenkaje @TacoGrandeTX

@@ -331,6 +331,20 @@ static void nordic_nrf5_uart_timeout_handler(uint32_t instance)
/* No activity detected, no timeout set. */
nordic_nrf5_uart_state[instance].ticker_is_running = false;

/* Check if hardware flow control is set and signal sender to stop.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block of code is repeated 3 times. Would it make sense to wrap this up in it's own function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's being called twice. It interrupt handler is convoluted as it is, I don't think adding another function will make it easier to read.

@naveenkaje
Copy link
Contributor

Please provide reference to the bug in UARTE (such as errata from Nordic) / a Github issue? It will be useful.

@marcuschangarm
Copy link
Contributor Author

The bug is an artifact of the UARTE API, because there is no safe way to flush the DMA buffer and swap it for a new one. When you follow the documentation and flush the DMA you'll get a ~6 ms period where the UARTE is not ready to receive. During this period the hardware flow control is inactive because the hardware flow control is only ready when the UARTE is ready to receive.

This PR manually toggles the flow control pin during this timeout window.

@TacoGrandeTX
Copy link
Contributor

Sounds like a dicey problem! Do you have a test that showcases the problem and fix? DMA buffers are just 1 byte - right?

@marcuschangarm
Copy link
Contributor Author

DMA buffers are at least 5 bytes. You can connect a logic analyzer to any application. You'll notice the RTS line is never set even though it should be.

@deepikabhavnani
Copy link

Build requested by @marcuschangarm
Query : Is it for 5.10.1 or 5.101.2?

/morph build

@mbed-ci
Copy link

mbed-ci commented Oct 4, 2018

Build : SUCCESS

Build number : 3241
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/8292/

Triggering tests

/morph test
/morph mbed2-build

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 5, 2018

/morph export-build

@mbed-ci
Copy link

mbed-ci commented Oct 5, 2018

@mbed-ci
Copy link

mbed-ci commented Oct 5, 2018

@theotherjimmy
Copy link
Contributor

/morph test

NRF52_DK timed out.

@theotherjimmy
Copy link
Contributor

@marcuschangarm Is there a chance that the NRF52 timeout could have been caused by this change?

@mbed-ci
Copy link

mbed-ci commented Oct 6, 2018

@NirSonnenschein
Copy link
Contributor

failures in tests for NRF52_DK timeouts and sync issues. this doesn't seem incidental. @marcuschangarm please take a look

@marcuschangarm
Copy link
Contributor Author

/morph test

@mbed-ci
Copy link

mbed-ci commented Oct 8, 2018

@cmonr
Copy link
Contributor

cmonr commented Oct 9, 2018

/morph build

New commit > new SHA > new Build step needed.

@mbed-ci
Copy link

mbed-ci commented Oct 9, 2018

Build : SUCCESS

Build number : 3292
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/8292/

Triggering tests

/morph test
/morph export-build
/morph mbed2-build

@NirSonnenschein
Copy link
Contributor

@marcuschangarm failures in test look PR related, please take a look.

@marcuschangarm
Copy link
Contributor Author

@studavekar @NirSonnenschein
The failures occur in a test with heavy UART traffic. I suspect the CI infrastructure not following the flow control signal causes the Tx buffer to overflow.

@studavekar
Copy link
Contributor

studavekar commented Oct 16, 2018

@studavekar @NirSonnenschein
The failures occur in a test with heavy UART traffic. I suspect the CI infrastructure not following the flow control signal causes the Tx buffer to overflow.

@marcuschangarm not sure if Daplink supports flow control from host <-->daplink<---- hw fw ctrl ---> device.

We also need to make review the test, we should avoid heavy uart traffic in tests. If we want to stress uart it should part of a separate test.

Failure :

  1. [1539363423.79][CONN][RXD] :144::FAIL: Expected 'passed' Was 'pased' - passed --> pased

@cmonr
Copy link
Contributor

cmonr commented Oct 16, 2018

@c1728p9

not sure if Daplink supports flow control from host <-->daplink<---- hw fw ctrl ---> device.

Thoughts?

@c1728p9
Copy link
Contributor

c1728p9 commented Oct 16, 2018

not sure if Daplink supports flow control from host <-->daplink<---- hw fw ctrl ---> device.

I took a look at the DAPLink code and flow control is enabled for the NRF52 boards.

@marcuschangarm
Copy link
Contributor Author

@c1728p9
Does DAPLink support flow control on the host side as well?

@c1728p9
Copy link
Contributor

c1728p9 commented Oct 17, 2018

@marcuschangarm yep, it is built into the USB CDC protocol and can't be turned off.

@marcuschangarm
Copy link
Contributor Author

@c1728p9
So if the CI enables flow control then it should work all the way down?

@marcuschangarm
Copy link
Contributor Author

The failing test is for a component that is not supported by the NRF52_DK target.

Both the rtc_time and rtc_time_conv HAL tests are missing this header:

#if !DEVICE_RTC
#error [NOT_SUPPORTED] RTC API not supported for this target
#endif

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 24, 2018

@mprse Can you review ^^ ?

Due to buggy flow control logic in the UARTE, the stop signal
is not being set as it is supposed to when the the module is
not ready to receive data.

This commit signals the sender to halt transmitting when a DMA
buffer is full and only continue again when the atomic FIFO
buffer has been emptied. This allows platforms with hardware
flow control to minimize all buffers and rely on flow control
instead.
@c1728p9
Copy link
Contributor

c1728p9 commented Oct 24, 2018

So if the CI enables flow control then it should work all the way down?

Yep

@mprse
Copy link
Contributor

mprse commented Oct 25, 2018

Both the rtc_time and rtc_time_conv HAL tests are missing this header:

#if !DEVICE_RTC
#error [NOT_SUPPORTED] RTC API not supported for this target
#endif

These tests verifies time conversion functions from the following library which is available for all targets and does not require RTC. Also even when RTC is not supported on the specific platform it is then replaced by LPTICKER if available (which is NRF52_DK case I think):

#if DEVICE_RTC
static void (*_rtc_init)(void) = rtc_init;
static int (*_rtc_isenabled)(void) = rtc_isenabled;
static time_t (*_rtc_read)(void) = rtc_read;
static void (*_rtc_write)(time_t t) = rtc_write;
#elif DEVICE_LPTICKER
#include "drivers/LowPowerTimer.h"
#define US_PER_SEC 1000000
static SingletonPtr<mbed::LowPowerTimer> _rtc_lp_timer;
static uint64_t _rtc_lp_base;
static bool _rtc_enabled;
static void _rtc_lpticker_init(void)
{
_rtc_lp_timer->start();
_rtc_enabled = true;
}
static int _rtc_lpticker_isenabled(void)
{
return (_rtc_enabled == true);
}
static time_t _rtc_lpticker_read(void)
{
return _rtc_lp_timer->read_high_resolution_us() / US_PER_SEC + _rtc_lp_base;
}
static void _rtc_lpticker_write(time_t t)
{
_rtc_lp_base = t;
}
static void (*_rtc_init)(void) = _rtc_lpticker_init;
static int (*_rtc_isenabled)(void) = _rtc_lpticker_isenabled;
static time_t (*_rtc_read)(void) = _rtc_lpticker_read;
static void (*_rtc_write)(time_t t) = _rtc_lpticker_write;
#else /* DEVICE_LPTICKER */
static void (*_rtc_init)(void) = NULL;
static int (*_rtc_isenabled)(void) = NULL;
static time_t (*_rtc_read)(void) = NULL;
static void (*_rtc_write)(time_t t) = NULL;
#endif /* DEVICE_LPTICKER */

I can confirm that the failing test sends lots of data between host and device using uart. So this is a great test for this PR 😅 . Looking at the last results we have failure on IAR and ARM compilers.
For IAR there is a regular test timeout case, so we could consider increasing the test execution timeout.
For ARM we got some kind of communication error:
:144::FAIL: Expected 'passed' Was 'pased'
Looks like one byte has been lost during the transmission which is bad and may cause some unexpected issues in the future, so I think this needs to be investigated.

@marcuschangarm
Copy link
Contributor Author

So, what's the plan?

  • Characters are missing, but the CI doesn't support flow control.
  • Tests are failing for components we don't claim support for.

Currently flow control is not supported by the CI and enabling it
causes unwanted side effects.
@marcuschangarm
Copy link
Contributor Author

/morph build

@mbed-ci
Copy link

mbed-ci commented Oct 25, 2018

Build : SUCCESS

Build number : 3470
Build artifacts/logs : http://mbed-os.s3-website-eu-west-1.amazonaws.com/?prefix=builds/8292/

Triggering tests

/morph test
/morph export-build
/morph mbed2-build

@mbed-ci
Copy link

mbed-ci commented Oct 26, 2018

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 26, 2018

Eclipse error, but all good there

/morph export-build

@mbed-ci
Copy link

mbed-ci commented Oct 26, 2018

@mbed-ci
Copy link

mbed-ci commented Oct 26, 2018

@cmonr cmonr merged commit a5b7573 into ARMmbed:master Oct 26, 2018
@cmonr cmonr removed the needs: CI label Oct 26, 2018
@marcuschangarm marcuschangarm deleted the fix_flow branch November 14, 2018 22:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.