[ONME-2844] Supporting non-blocking connect() #3457

hasnainvirk · 2016-12-16T13:37:07Z

This PR adds a few new error codes to nsapi_error_t and
adds support for non-blocking socket connect.
Nanostack's connect call will be non-blocking.
LWIP's connect call is currently blocking, but it could now be changed
to be non-blocking.

hasnainvirk · 2016-12-16T13:41:43Z

@sg- @c1728p9 @bridadan @geky Please review

geky · 2016-12-16T17:25:31Z

cc @kjbracey-arm as well

geky · 2016-12-16T17:41:18Z

features/netsocket/nsapi_types.h

@@ -48,6 +48,9 @@ enum nsapi_error {
    NSAPI_ERROR_DHCP_FAILURE  = -3010,     /*!< DHCP failed to complete successfully */
    NSAPI_ERROR_AUTH_FAILURE  = -3011,     /*!< connection to access point failed */
    NSAPI_ERROR_DEVICE_ERROR  = -3012,     /*!< failure interfacing with the network processor */
+    NSAPI_ERROR_IN_PROGRESS   = -3013,     /*!< operation (eg connect) in progress */


What's the distinction between NSAPI_ERROR_IN_PROGRESS and NSAPI_ERROR_WOULD_BLOCK?

As I understand it, the distinctions seem to be:

"would block" - "that call had no effect, but you can try the same thing again later and it might work"
"in progress" - "that call started something, but it hasn't finished, check back for status later"

In the case of "connect", checking back after "in progress" could either be trying connect again (in which case you can expect "already in progress" or "is connected" errors), or just going ahead and trying to read/write (in which case you could get "not connected").

So "would block" means you have to try the same call again. "In progress" means you could just start trying the next call, as it will be accepted once the in-progress thing finishes (or maybe it'll even get queued).
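A minimal sketch of that distinction from the caller's side. The `SKETCH_*` codes and the `classify` helper are hypothetical stand-ins invented for illustration; they are not the NSAPI definitions, and a real caller would compare against the values in nsapi_types.h.

```cpp
// Stand-in return codes for this sketch only (not the real NSAPI values).
enum sketch_error {
    SKETCH_OK          = 0,
    SKETCH_WOULD_BLOCK = -1,  // the call had no effect
    SKETCH_IN_PROGRESS = -2,  // the call started an operation that has not finished
    SKETCH_OTHER_ERROR = -3
};

// What the caller should do next, given the return code.
enum next_step {
    STEP_DONE,             // the call completed
    STEP_RETRY_SAME_CALL,  // nothing happened: issue the same call again later
    STEP_CHECK_BACK,       // something started: poll for status or try the next call
    STEP_FAIL              // a genuine error
};

next_step classify(int ret)
{
    switch (ret) {
        case SKETCH_OK:
            return STEP_DONE;
        case SKETCH_WOULD_BLOCK:
            return STEP_RETRY_SAME_CALL;
        case SKETCH_IN_PROGRESS:
            return STEP_CHECK_BACK;
        default:
            return STEP_FAIL;
    }
}
```

The only point of the sketch is that the two codes lead to different follow-up actions, even though a blocking wrapper can treat both as "not finished yet".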

Distinction doesn't actually matter to the blocking abstraction here though, as it does just keep calling connect().

Having written that, I guess it follows that the blocking loop here could respond to "would block" by retrying. But POSIX doesn't have "would block" as a possible connect() return, and I guess this shouldn't encourage that.

Also, only interpreting new codes has the handy effect of ensuring this patch doesn't change existing behaviour for any current stacks.

Fair enough, that makes sense. Thanks for clarifying.

Out of curiosity, is it valid to go straight to a non-blocking send loop after starting a non-blocking connect? Would send return NSAPI_ERROR_WOULD_BLOCK or NSAPI_ERROR_NO_CONNECTION at that point?

It would sort of be fine. BSD/POSIX would return ENOTCONN, so I'd expect stacks to give NO_CONNECTION to match.

Unfortunately, NSAPI will also give you NO_CONNECTION in place of ECONNREFUSED, ECONNRESET or ETIMEDOUT. So you wouldn't be able to distinguish between a failed connection and one which just hadn't completed yet.

I guess you have to keep calling connect(), like this blocking loop.

And to make that work, you also have to make sure you do get a sensible error return from connect() after failure. What you wouldn't want is for it to give up on a connection in the background, and then have the next connect call just start it again.

That's generally handled by one or both of the following: having sticky errors on sockets, so the next socket call after an error returns the error code, and/or not permitting more than one connect() attempt on any socket. Nanostack does the latter.
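To make the two approaches concrete, here is a rough sketch of a socket object that does both: it keeps a sticky error and refuses to start a second connect attempt. `StickySocket`, its callbacks and the `SKETCH_*` codes are all hypothetical names for this illustration; this is not how Nanostack or any other stack is actually written.

```cpp
// Stand-in return codes for this sketch only (not the real NSAPI values).
enum sketch_error {
    SKETCH_OK            = 0,
    SKETCH_IN_PROGRESS   = -1,
    SKETCH_ALREADY       = -2,
    SKETCH_IS_CONNECTED  = -3,
    SKETCH_NO_CONNECTION = -4
};

class StickySocket {
public:
    StickySocket() : _state(IDLE), _sticky_error(SKETCH_OK) {}

    // Non-blocking connect: only the first call starts an attempt.
    int connect()
    {
        switch (_state) {
            case IDLE:
                _state = CONNECTING;
                return SKETCH_IN_PROGRESS;
            case CONNECTING:
                return SKETCH_ALREADY;
            case CONNECTED:
                return SKETCH_IS_CONNECTED;
            case FAILED:
                // Report the stored failure instead of silently retrying.
                return _sticky_error;
        }
        return SKETCH_NO_CONNECTION;
    }

    // Called by the (imaginary) stack when the handshake completes.
    void on_connected()
    {
        if (_state == CONNECTING) {
            _state = CONNECTED;
        }
    }

    // Called by the (imaginary) stack when the attempt fails in the background.
    void on_failed(int error)
    {
        _state = FAILED;
        _sticky_error = error;
    }

private:
    enum State { IDLE, CONNECTING, CONNECTED, FAILED };
    State _state;
    int _sticky_error;
};
```

With this shape, a caller that keeps polling connect() after a background failure sees the failure code every time, rather than kicking off a fresh attempt.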

Update - apparently network stacks are not as consistent as I thought on this. BSD 4.4 actually doesn't return the sticky error code on connect(), and does permit a second connect, so this loop wouldn't work plugged raw into BSD.

https://stackoverflow.com/questions/17769964/linux-sockets-non-blocking-connect
https://cr.yp.to/docs/connect.html

Ew. Well, we can demand that underlying stacks handle this if they're supplying non-blocking connect. Doesn't seem unreasonable. They could incorporate their own sticky error handling or equivalent event handling, or call getsockopt(SO_ERROR) first in their connect handler.

geky · 2016-12-16T17:47:15Z

features/netsocket/TCPSocket.cpp

+    _write_in_progress = false;
+
+    /* Non-blocking connect gives "EISCONN" once done - convert to OK for blocking mode if we became connected during this call */
+    if (ret == NSAPI_ERROR_IS_CONNECTED && blocking_connect_in_progress) {


Should this check just be ret == NSAPI_ERROR_IS_CONNECTED && _timeout > 0? What if a socket returns NSAPI_ERROR_IS_CONNECTED on the first call?

Scratch that, I just realized what that implies : )
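For readers following along, a simplified sketch of the blocking wrapper being discussed. It assumes the underlying connect returns "in progress" or "already" while the attempt is pending and "is connected" once it has finished. The `SKETCH_*` codes, `blocking_connect` and `max_polls` are hypothetical names for illustration; the real TCPSocket code waits on socket events and a timeout rather than counting polls.

```cpp
#include <functional>

// Stand-in return codes for this sketch only (not the real NSAPI values).
enum sketch_error {
    SKETCH_OK            = 0,
    SKETCH_IN_PROGRESS   = -1,
    SKETCH_ALREADY       = -2,
    SKETCH_IS_CONNECTED  = -3,
    SKETCH_NO_CONNECTION = -4
};

// Keep calling a non-blocking connect until it stops reporting "pending".
int blocking_connect(const std::function<int()> &stack_connect, int max_polls)
{
    bool connect_in_progress = false;
    int ret = SKETCH_NO_CONNECTION;

    for (int i = 0; i < max_polls; ++i) {
        ret = stack_connect();
        if (ret == SKETCH_IN_PROGRESS || ret == SKETCH_ALREADY) {
            // Still pending; a real implementation would sleep on an event here.
            connect_in_progress = true;
            continue;
        }
        break;
    }

    // "Is connected" only means success if this call saw the attempt pending.
    // On the very first call it means the socket was already connected,
    // which stays an error for the caller.
    if (ret == SKETCH_IS_CONNECTED && connect_in_progress) {
        ret = SKETCH_OK;
    }
    return ret;
}
```

This is why the check is tied to the in-progress flag rather than to the timeout: a socket that reports "is connected" on the first call was connected before this call began.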

geky · 2016-12-16T18:02:34Z

This is really well done. The non-blocking connect has been missing for a while now, and this seems like a graceful way to integrate it.

I am a bit concerned that when we introduce a non-blocking connect to lwip, we are going to have issues with existing programs that set the socket to non-blocking before connecting. This does seem the best path forward though.

It would be nice if #3265 was merged first, since it includes a few tests on the network connect function.

kjbracey · 2016-12-19T08:16:34Z

The one omission here is the non-blocking resolution - if the user passes a host name in non-blocking mode, it will block for the resolution, then proceed with a non-blocking connect.

But that's another set of work, and this is consistent with other host-name-taking calls.

Would be nice to do something about that some time as it stops DNS resolution working with mbed client over 6LoWPAN, which is why https://github.com/ARMmbed/mbed-os-example-client has to have #ifdef MESH around its URI.

geky · 2016-12-19T18:13:45Z

@kjbracey-arm, that's a good point. It sounds like this is one of several PRs required to introduce a fully non-blocking connect API.

Maybe we should create an issue to track this?

Other than that, this PR looks good to come in as is. @hasnainvirk, thanks for the patch!

0xc0170 · 2016-12-21T14:48:34Z

@kjbracey-arm @c1728p9 happy with this patch? please review

kjbracey

I worked on this with Hasnain - I'm happy with it.

geky reviewed Dec 16, 2016

geky requested review from c1728p9 and kjbracey December 16, 2016 18:03

geky approved these changes Dec 19, 2016

kjbracey mentioned this pull request Dec 20, 2016

lwip - Fixed error codes for failed TCP connect #3403

Merged

0xc0170 added the needs: review label Dec 21, 2016

kjbracey approved these changes Dec 21, 2016

c1728p9 approved these changes Dec 21, 2016

0xc0170 added ready for merge release-version: 5.3.2 and removed needs: review ready for merge labels Dec 22, 2016

0xc0170 merged commit bba527f into ARMmbed:master Dec 23, 2016

kjbracey mentioned this pull request Jan 20, 2017

Don't send events on close() #3374

Merged

geky mentioned this pull request Jan 20, 2017

nsapi: Change initial state of sockets to allow events #3619

Merged
