Skip to content

STM32F7: Do not generate redundant IN tokens #11103

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 9, 2019

Conversation

desowin
Copy link
Contributor

@desowin desowin commented Jul 24, 2019

Description

When STM32F746-DISCO board was being used in (unsupported) USBHost mode,
the communication was unreliable. Our investigation revealed that the
problem lied in redundant IN tokens that the host generated even though
it shouldn't. This could lead to endless high-frequency NAKs being
received from device, which caused watchdog reset as USBHost spent all
time in interrupt handlers.

In our application the clocks frequencies are:

  • HCLK = 48 MHz
  • APB1 = 6 MHz
  • APB2 = 12 MHz

We have captured the raw USB High-Speed traffic using OpenVizsla.
Without this change, when USB MSD device connected to the system
responded to IN with NAK, there were excessive IN tokens generated about
667 ns after the NAK. With this commit the IN tokens are generated no
sooner than 10 us after the NAK.

The high frequency of the IN/NAK pairs is not the biggest problem.
The biggest problem is that the USB Host did continue to send the IN token
after DATA and ACK packets were received from device - without any request
from upper layer (USB MSD).

The USB MSD devices won't have extra data available on Bulk IN endpoint
after the expected data was received by Host. In such case IN/NAK cycle
time is only houndreds of nanoseconds, the MCU has no time for anything else.

The problem manifested not only on Bulk endpoints, but also during
Control transfers. Example correct scenario (when this fix is applied):

  • SETUP stage
    • SETUP [host -> address 0 endpoint 0]
    • DATA0 [80 06 00 01 00 00 08 00] [CRC16: EB 94]
    • ACK
  • DATA stage
    • IN
    • NAK
      ... the IN/NAK repeated multiple time until device was ready
    • IN
    • DATA1 [12 01 10 02 00 00 00 40] [CRC16: 55 41]
    • ACK
  • STATUS stage
    • OUT
    • DATA1 ZLP
    • ACK

Without this commit, in DATA stage, after the ACK was received, the host
did send extra IN to which device responded with STALL. On bus it was:

  • DATA stage
    ...
    • IN
    • DATA1 [12 01 10 02 00 00 00 40] [CRC16: 55 41]
    • IN
    • STALL
    • IN
    • STALL
  • STATUS stage
    • OUT
    • DATA1 ZLP
    • STALL

In the fault case the next SETUP was sent only after 510 ms, which
indicates timeout in upper layer.

With this commit the next SETUP is sent 120 us after the STATUS stage ACK.

Pull request type

[X] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

Reviewers

Release Notes

When STM32F746-DISCO board was being used in (unsupported) USBHost mode,
the communication was unreliable. Our investigation revealed that the
problem lied in redundant IN tokens that the host generated even though
it shouldn't. This could lead to endless high-frequency NAKs being
received from device, which caused watchdog reset as USBHost spent all
time in interrupt handlers.

In our application the clocks frequencies are:
  * HCLK = 48 MHz
  * APB1 = 6 MHz
  * APB2 = 12 MHz

We have captured the raw USB High-Speed traffic using OpenVizsla.
Without this change, when USB MSD device connected to the system
responded to IN with NAK, there were excessive IN tokens generated about
667 ns after the NAK. With this commit the IN tokens are generated no
sooner than 10 us after the NAK.

The high frequency of the IN/NAK pairs is not the biggest problem.
The biggest problem is that the USB Host did continue to send the IN token
after DATA and ACK packets were received from device - *without* any request
from upper layer (USB MSD).

The USB MSD devices won't have extra data available on Bulk IN endpoint
after the expected data was received by Host. In such case IN/NAK cycle
time is only houndreds of nanoseconds, the MCU has no time for anything else.

The problem manifested not only on Bulk endpoints, but also during
Control transfers. Example correct scenario (when this fix is applied):
  * SETUP stage
    * SETUP [host -> address 0 endpoint 0]
    * DATA0 [80 06 00 01 00 00 08 00] [CRC16: EB 94]
    * ACK
  * DATA stage
    * IN
    * NAK
    ... the IN/NAK repeated multiple time until device was ready
    * IN
    * DATA1 [12 01 10 02 00 00 00 40] [CRC16: 55 41]
    * ACK
  * STATUS stage
    * OUT
    * DATA1 ZLP
    * ACK

Without this commit, in DATA stage, after the ACK was received, the host
did send extra IN to which device responded with STALL. On bus it was:
  * DATA stage
    ...
    * IN
    * DATA1 [12 01 10 02 00 00 00 40] [CRC16: 55 41]
    * IN
    * STALL
    * IN
    * STALL
  * STATUS stage
    * OUT
    * DATA1 ZLP
    * STALL

In the fault case the next SETUP was sent only after 510 ms, which
indicates timeout in upper layer.

With this commit the next SETUP is sent 120 us after the STATUS stage ACK.
@ciarmcom ciarmcom requested a review from a team July 24, 2019 11:00
@ciarmcom
Copy link
Member

@desowin, thank you for your changes.
@ARMmbed/mbed-os-maintainers please review.

@bulislaw
Copy link
Member

@ARMmbed/team-st-mcd please review.

@SeppoTakalo SeppoTakalo requested a review from a team July 25, 2019 10:00
@mbed-ci
Copy link

mbed-ci commented Jul 25, 2019

Test run: SUCCESS

Summary: 11 of 11 test jobs passed
Build number : 1
Build artifacts

@jeromecoutant
Copy link
Collaborator

ST_INTERNAL_REF 70627

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants