Skip to content

Fix KL25Z connect problem with some USB 3.0 hosts #414

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 5, 2014

Conversation

mjrgh
Copy link
Contributor

@mjrgh mjrgh commented Jul 25, 2014

This change fixes a problem I've encountered using the Freescale KL25Z as a USB client device when attaching to a USB 3.0 port on a host PC. I've personally observed the problem on two PCs, one a Dell running Windows 8.1 and the other a custom-built Windows 7 machine with a Gigabyte mobo. Several other people on the Web have reported what appears to be the same problem. It even appears to happen with some Macs - so it's apparently not Windows-specific or Windows-version-specific. But it does seem to be USB 3.0 specific, and one report speculated that the common element is the Intel Haswell chip set on the host side.

The problem can be worked around by using USB 2.1 ports (many PCs have a mix of 2.1 and 3.0 port hardware) or by connecting a USB 2.1 hub between the device and host. I have USB 2.1 ports on both of my testing machines, and this solution does work for me. But I wanted to fix it at the software level if possible, so that my project would work for people who might only have USB 3.0 ports available.

I tracked down the problem to a timing issue that seems to cause the device and host to get out of sync with respect to flow direction on the USB connection during the initial handshake. It seems that the host can issue a CONTROL OUT token at a point where the client is trying to send the response for a previous CONTROL IN token. I found a note in some USB 3.0 reference material that this situation can occur, and that the client can resolve it by sending a zero-length ACK, which tells the host that the client isn't ready and that the host request should be repeated.

My fix sends a zero-length reply in USBDevice::controlOut() when the control flow error is detected, and also explicitly sends the pending CONTROL_IN packet. This appears to fix it reliably on my machine. I'm not entirely confident about the fix, though, since I don't feel I understand the USB protocol or the mbed code in enough depth. I'm submitting the change here in the hope that someone with a deeper knowledge of the mbed code can determine if this is a good fix, and if not can hopefully use it as the starting point for a better solution.

Note that I submitted the same pull request via the mbed source control system, but I was advised by another mbed user that I should also submit it directly on github. Apologies for any confusion caused by the redundant submissions.

This change fixes a problem I've encountered using the Freescale KL25Z as a USB client device when attaching to a USB 3.0 port on a host PC.  I've personally observed the problem on two PCs, one a Dell running Windows 8.1 and the other a custom-built Windows 7 machine with a Gigabyte mobo.  Several other people on the Web have reported what appears to be the same problem.  It even appears to happen with some Macs - so it's apparently not Windows-specific or Windows-version-specific.  But it does seem to be USB 3.0 specific, and one report speculated that the common element is the Intel Haswell chip set on the host side.

The problem can be worked around by using USB 2.1 ports (many PCs have a mix of 2.1 and 3.0 port hardware) or by connecting a USB 2.1 hub between the device and host.  I have USB 2.1 ports on both of my testing machines, and this solution does work for me.  But I wanted to fix it at the software level if possible, so that my project would work for people who might only have USB 3.0 ports available.

I tracked down the problem to a timing issue that seems to cause the device and host to get out of sync with respect to flow direction on the USB connection during the initial handshake. It seems that the host can issue a CONTROL OUT token at a point where the client is trying to send the response for a previous CONTROL IN token. I found a note in some USB 3.0 reference material that this situation can occur, and that the client can resolve it by sending a zero-length ACK, which tells the host that the client isn't ready and that the host request should be repeated. 

My fix sends a zero-length reply in USBDevice::controlOut() when the control flow error is detected, and also explicitly sends the pending CONTROL_IN packet.  This appears to fix it reliably on my machine.  I'm not entirely confident about the fix, though, since I don't feel I understand the USB protocol or the mbed code in enough depth.  I'm submitting the change here in the hope that someone with a deeper knowledge of the mbed code can determine if this is a good fix, and if not can hopefully use it as the starting point for a better solution.  

Note that I submitted the same pull request via the mbed source control system, but I was advised by another mbed user that I should also submit it directly on github.  Apologies for any confusion caused by the redundant submissions.
@larsgk
Copy link

larsgk commented Jul 26, 2014

Great find! :)

So - would this potentially solve the problems I have encountered with the
haswell usb3 ports & usb serial? That would be awesome!

#193

br
Lars

On Fri, Jul 25, 2014 at 11:52 PM, mjrgh [email protected] wrote:

This change fixes a problem I've encountered using the Freescale KL25Z as
a USB client device when attaching to a USB 3.0 port on a host PC. I've
personally observed the problem on two PCs, one a Dell running Windows 8.1
and the other a custom-built Windows 7 machine with a Gigabyte mobo.
Several other people on the Web have reported what appears to be the same
problem. It even appears to happen with some Macs - so it's apparently not
Windows-specific or Windows-version-specific. But it does seem to be USB
3.0 specific, and one report speculated that the common element is the
Intel Haswell chip set on the host side.

The problem can be worked around by using USB 2.1 ports (many PCs have a
mix of 2.1 and 3.0 port hardware) or by connecting a USB 2.1 hub between
the device and host. I have USB 2.1 ports on both of my testing machines,
and this solution does work for me. But I wanted to fix it at the software
level if possible, so that my project would work for people who might only
have USB 3.0 ports available.

I tracked down the problem to a timing issue that seems to cause the
device and host to get out of sync with respect to flow direction on the
USB connection during the initial handshake. It seems that the host can
issue a CONTROL OUT token at a point where the client is trying to send the
response for a previous CONTROL IN token. I found a note in some USB 3.0
reference material that this situation can occur, and that the client can
resolve it by sending a zero-length ACK, which tells the host that the
client isn't ready and that the host request should be repeated.

My fix sends a zero-length reply in USBDevice::controlOut() when the
control flow error is detected, and also explicitly sends the pending
CONTROL_IN packet. This appears to fix it reliably on my machine. I'm not
entirely confident about the fix, though, since I don't feel I understand
the USB protocol or the mbed code in enough depth. I'm submitting the
change here in the hope that someone with a deeper knowledge of the mbed
code can determine if this is a good fix, and if not can hopefully use it
as the starting point for a better solution.

Note that I submitted the same pull request via the mbed source control
system, but I was advised by another mbed user that I should also submit it
directly on github. Apologies for any confusion caused by the redundant

submissions.

You can merge this Pull Request by running

git pull https://github.com/mjrgh/mbed patch-1

Or view, comment on, or merge it at:

#414
Commit Summary

  • Fix KL25Z connect problem with some USB 3.0 hosts

File Changes

Patch Links:


Reply to this email directly or view it on GitHub
#414.

@mjrgh
Copy link
Contributor Author

mjrgh commented Jul 26, 2014

Lars - I don't have a Mac to test on, but I think it's very likely that you're having the same issue that this fix is intended to address. The description of the protocol failure from the Apple engineer exactly matches the symptom I was seeing - that's the same range of points in the handshake it was failing in my setup, and it had exactly the same variability/randomness. The way the mbed stack is written, it stalls the connection if it detects a flow control mismatch during the handshake, and the host interprets that as failing to respond to a configuration data request (exactly what the Apple engineer described the host seeing). I'd definitely be interested to hear if this fix works for you, if you have a chance to give it a try.

@bogdanm
Copy link
Contributor

bogdanm commented Jul 28, 2014

Thanks a lot for your pull request! I like it, but I have a feeling it should happen on all targets, not just Freescale ones. But that's something that can be implemented in the future, so before I accept this, could you please sign the mbed contributor agreement at https://mbed.org/contributor_agreement/ ? It's a required step for all mbed contributors.

Thanks,
Bogdan

@mbednotifications
Copy link

Good news: I just tested this on a mac and it seems to do the trick! :)
might need more extensive testing though.

br
Lars

On Sat, Jul 26, 2014 at 7:09 PM, mjrgh [email protected] wrote:

Lars - I don't have a Mac to test on, but I think it's very likely that
you're having the same issue that this fix is intended to address. The
description of the protocol failure from the Apple engineer exactly matches
the symptom I was seeing - that's the same range of points in the handshake
it was failing in my setup, and it had exactly the same
variability/randomness. The way the mbed stack is written, it stalls the
connection if it detects a flow control mismatch during the handshake, and
the host interprets that as failing to respond to a configuration data
request (exactly what the Apple engineer described the host seeing). I'd
definitely be interested to hear if this fix works for you, if you have a
chance to give it a try.

Reply to this email directly or view it on GitHub
#414 (comment).

--
You received this message because you are subscribed to the Google Groups
"mbed-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
For more options, visit https://groups.google.com/d/optout.

@0xc0170
Copy link
Contributor

0xc0170 commented Jul 29, 2014

Lars, thanks for sharing. Keep us updated.

@0xc0170
Copy link
Contributor

0xc0170 commented Aug 4, 2014

@mjrgh , as Bogdan pointed in the message above, please sign the mbed contributor agreeement, share your nick on mbed with us as I am not able to find you there.

Thanks,
0xc0170

@Sissors
Copy link
Contributor

Sissors commented Aug 5, 2014

This is the his mbed name: https://mbed.org/users/mjr/ (but he indeed hasn't signed it yet).

Anyway reason for my comment, this question was asked on mbed site: http://mbed.org/questions/4252/USBKeyboard-on-KL25Z-only-works-intermit/, he describes similar issue, and new lib also solved it for him. So this really seems to be the solution.

@mjrgh
Copy link
Contributor Author

mjrgh commented Aug 5, 2014

Sorry I didn't get back to this sooner - I signed the agreement, so should be set to go. Let me know if you need anything else on my end.

bogdanm added a commit that referenced this pull request Aug 5, 2014
Fix KL25Z connect problem with some USB 3.0 hosts
@bogdanm bogdanm merged commit ddc5340 into ARMmbed:master Aug 5, 2014
yogpan01 pushed a commit to yogpan01/mbed that referenced this pull request Jul 21, 2016
@c1728p9
Copy link
Contributor

c1728p9 commented Jan 6, 2018

I was looking at the USB code and came across this workaround for Kinetis devices and followed it back to this PR.

I did some investigation and it appears that the cause of this is the OUT packet sent during the status phase of a Control IN transfer (used to indicate success) is wrongly handled as a controlOut request. The reason this doesn't happen on every Control IN transfer is because the Kinetis implementation of EP0readStage sets the data toggle of the RX endpoint 0 to DATA0, which causes the DATA1 packet of the Status OUT to get ignored because of the wrong data toggle.

The likely cause of this intermittent problem is the window of time between setting RX endpoint 0 to DATA1 in EP0read() and setting it back to DATA0 in EP0readStage(). If the OUT status packet is received during this window (RX endpoint 0 listening for DATA1) it won't be ignored and will fill the RX endpoint 0 buffer, causing the subsequent Setup transfer to be dropped.

For reference here is a logic capture of this occurring (this PR was reverted when running the test):

error_during_out_packet

As a possible alternative solution I added code to force the value expected by EP0read() to DATA0 when the transfer is a Control IN. This is done by replacing this line in endpointReadResult:

        if (setup && ((buffer[6] == 0) || ((buffer[0] >> 7) & 1)))  // if no setup data stage or an IN data stage

CC @mjrgh @larsgk @0xc0170 @Sissors

c1728p9 added a commit to c1728p9/mbed-os that referenced this pull request Jan 18, 2018
Make the following improvements and fixes:

1.
Update the Kinetis USB driver so that endpointReadResult only reads the
result of the last read and does not trigger a new read. Instead
move the code to trigger new reads into endpointRead.

2.
Fix the race condition in controlIn caused by a call to
EP0read() followed immediately by  EP0readStage(). This is done by
setting up to read the next setup packet (ignoring the status stage)
in endpointReadResult rather than in EP0readStage. This makes the
function EP0readStage unnecissary.

3.
Remove the Kinetis workaround in controlOut in USBDevice.cpp since
point 2 fixes this bug. For more info on this see the PR which
added this workaround - ARMmbed#414
kegilbert pushed a commit to kegilbert/mbed-os that referenced this pull request Jan 25, 2018
Make the following improvements and fixes:

1.
Update the Kinetis USB driver so that endpointReadResult only reads the
result of the last read and does not trigger a new read. Instead
move the code to trigger new reads into endpointRead.

2.
Fix the race condition in controlIn caused by a call to
EP0read() followed immediately by  EP0readStage(). This is done by
setting up to read the next setup packet (ignoring the status stage)
in endpointReadResult rather than in EP0readStage. This makes the
function EP0readStage unnecissary.

3.
Remove the Kinetis workaround in controlOut in USBDevice.cpp since
point 2 fixes this bug. For more info on this see the PR which
added this workaround - ARMmbed#414
cmonr pushed a commit that referenced this pull request Jan 27, 2018
Make the following improvements and fixes:

1.
Update the Kinetis USB driver so that endpointReadResult only reads the
result of the last read and does not trigger a new read. Instead
move the code to trigger new reads into endpointRead.

2.
Fix the race condition in controlIn caused by a call to
EP0read() followed immediately by  EP0readStage(). This is done by
setting up to read the next setup packet (ignoring the status stage)
in endpointReadResult rather than in EP0readStage. This makes the
function EP0readStage unnecissary.

3.
Remove the Kinetis workaround in controlOut in USBDevice.cpp since
point 2 fixes this bug. For more info on this see the PR which
added this workaround - #414
cmonr pushed a commit that referenced this pull request Jan 27, 2018
Make the following improvements and fixes:

1.
Update the Kinetis USB driver so that endpointReadResult only reads the
result of the last read and does not trigger a new read. Instead
move the code to trigger new reads into endpointRead.

2.
Fix the race condition in controlIn caused by a call to
EP0read() followed immediately by  EP0readStage(). This is done by
setting up to read the next setup packet (ignoring the status stage)
in endpointReadResult rather than in EP0readStage. This makes the
function EP0readStage unnecissary.

3.
Remove the Kinetis workaround in controlOut in USBDevice.cpp since
point 2 fixes this bug. For more info on this see the PR which
added this workaround - #414
cmonr pushed a commit that referenced this pull request Jan 27, 2018
Make the following improvements and fixes:

1.
Update the Kinetis USB driver so that endpointReadResult only reads the
result of the last read and does not trigger a new read. Instead
move the code to trigger new reads into endpointRead.

2.
Fix the race condition in controlIn caused by a call to
EP0read() followed immediately by  EP0readStage(). This is done by
setting up to read the next setup packet (ignoring the status stage)
in endpointReadResult rather than in EP0readStage. This makes the
function EP0readStage unnecissary.

3.
Remove the Kinetis workaround in controlOut in USBDevice.cpp since
point 2 fixes this bug. For more info on this see the PR which
added this workaround - #414
ccli8 pushed a commit to ccli8/mbed-os that referenced this pull request Jan 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants