Allow LwIP TCP retransmissions to be configured and tune those smaller. #9183

SeppoTakalo · 2018-12-21T13:46:07Z

Description

Allow LwIP TCP retransmissions to be configured and tune those smaller.

Currently, LwIP's maximum number of segment retransmissions is 12, which
takes a very long time because each timeout doubles the retransmission timeout.
Reduce it to 6, which is the same value we use in Nanostack.
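
One way such a setting could be wired up is sketched below. TCP_MAXRTX is lwIP's option for the maximum number of data segment retransmissions; the MBED_CONF_LWIP_TCP_MAXRTX macro name and the fallback value are assumptions for illustration and are not taken from this PR's diff.

```c
/* Hypothetical build-system macro (assumed name), expected to be generated
 * from a JSON config entry. The fallback only keeps this sketch
 * self-contained. */
#ifndef MBED_CONF_LWIP_TCP_MAXRTX
#define MBED_CONF_LWIP_TCP_MAXRTX 6
#endif

/* lwIP option: maximum number of retransmissions of data segments. */
#define TCP_MAXRTX MBED_CONF_LWIP_TCP_MAXRTX
```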

The LwIP TCP engine uses a 500 ms slow timer, and each retransmission doubles the previous timeout.

          /* Double retransmission time-out unless we are trying to
           * connect to somebody (i.e., we are in SYN_SENT). */
          if (pcb->state != SYN_SENT) {
            u8_t backoff_idx = LWIP_MIN(pcb->nrtx, sizeof(tcp_backoff)-1);
            pcb->rto = ((pcb->sa >> 3) + pcb->sv) << tcp_backoff[backoff_idx];
          }

TCP timeouts come from table:

static const u8_t tcp_backoff[13] =
    { 1, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7, 7, 7};

To the best of my understanding, pcb->sv is 6, so 6 << tcp_backoff[7] already leads to about 6 minutes. Going through all the timeouts will eventually take something like 40 minutes, unless I'm mistaken about the values of pcb->sa and pcb->sv.
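
The estimate above can be reproduced with a rough model. The sketch below assumes, as the description does, that (pcb->sa >> 3) + pcb->sv is 6 slow-timer ticks and stays constant. The function and macro names are hypothetical, and this is not lwIP's exact bookkeeping: it only sums the backed-off timeouts.

```c
#include <stddef.h>
#include <stdint.h>

/* Backoff shifts copied from the table quoted above. */
static const uint8_t tcp_backoff[13] =
    { 1, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7, 7, 7 };

#define SLOW_TIMER_MS  500u  /* slow timer period stated above */
#define BASE_RTO_TICKS 6u    /* assumption: (pcb->sa >> 3) + pcb->sv == 6 */

/* Backed-off RTO, in slow-timer ticks, after nrtx earlier retransmissions. */
static uint32_t rto_ticks(unsigned nrtx)
{
    size_t last = sizeof(tcp_backoff) - 1;
    size_t idx = nrtx < last ? nrtx : last;  /* clamp like LWIP_MIN */
    return BASE_RTO_TICKS << tcp_backoff[idx];
}

/* Sum of the backed-off timeouts for max_rtx retransmissions, in seconds. */
static uint32_t total_backoff_seconds(unsigned max_rtx)
{
    uint32_t ticks = 0;
    for (unsigned n = 0; n < max_rtx; n++) {
        ticks += rto_ticks(n);
    }
    return ticks * SLOW_TIMER_MS / 1000u;
}
```

Under these assumptions, total_backoff_seconds(12) gives 2682 s (about 45 minutes) and total_backoff_seconds(6) gives 378 s (about 6 minutes).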

This fixes the issue where a Client+HTTP update stalls because DNS queries cannot go through: timed-out TCP sessions have already consumed the entire heap.

Also change the number of Freescale EMAC driver ring buffers down to 3. Earlier it was using 16+8 buffers, each 1500 bytes. This was the biggest consumer of LwIP heap.

Pull request type

[X] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

Reviewers

@kjbracey-arm
@teetak01
@yogpan01

ciarmcom · 2018-12-21T14:00:26Z

@SeppoTakalo, thank you for your changes.
@yogpan01 @teetak01 @kjbracey-arm @ARMmbed/mbed-os-ipcore @ARMmbed/mbed-os-maintainers please review.

0xc0170 · 2019-01-02T10:01:22Z

Reviewers, please review.

features/lwipstack/lwipopts.h

features/lwipstack/mbed_lib.json

kjbracey · 2019-01-02T10:07:27Z

features/netsocket/emac-drivers/TARGET_Freescale_EMAC/mbed_lib.json

@@ -1,7 +1,7 @@
 {
    "name": "kinetis-emac",
    "config": {
-        "rx-ring-len": 16,
-        "tx-ring-len": 8
+        "rx-ring-len": 2,


Doing this without correspondingly reducing the lwIP mem config is a little "sneaky": effectively you're giving this specific platform a huge memory pool for lwIP, because it was set big in a target override, apparently to make room for these buffers.

I'd like you to either

a) reduce the target-override memory pool for lwIP by the corresponding amount - if you can (but I assume you do need the extra space, so you can't).
b) reduce by less, and raise the default memory pool (and other target overrides?) by the difference - so that all platforms get the extra memory you've found you need. Maybe the other targets with big pools can also have their buffer count reduced, so they don't need to get bigger.

SeppoTakalo · 2019-01-02

I'll lower the lwip.mem-size value by some amount.
The value 16 was actually causing memory_manager->alloc_heap(ENET_ETH_MAX_FLEN, ENET_BUFF_ALIGNMENT) to be called 16 times, where ENET_ETH_MAX_FLEN is 1522 bytes and the alignment is 16. I'm unsure how much the allocator overhead is.

Total memory usage is somewhere between 24352 and 24608 bytes.
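
The two bounds can be reproduced with simple arithmetic. The sketch below is only an illustration: the helper names are hypothetical, and the upper bound assumes each allocation wastes at most one full alignment unit, with no other allocator overhead modelled.

```c
#include <stddef.h>

#define ENET_ETH_MAX_FLEN   1522u  /* frame buffer size quoted above */
#define ENET_BUFF_ALIGNMENT 16u    /* alignment quoted above */

/* Lower bound: the buffers alone, with no padding or allocator overhead. */
static size_t ring_bytes_min(size_t ring_len)
{
    return ring_len * ENET_ETH_MAX_FLEN;
}

/* Upper bound, assuming at most one alignment unit of padding per buffer. */
static size_t ring_bytes_max(size_t ring_len)
{
    return ring_len * (ENET_ETH_MAX_FLEN + ENET_BUFF_ALIGNMENT);
}
```

For a ring length of 16, ring_bytes_min(16) is 24352 and ring_bytes_max(16) is 24608, which matches the range quoted above.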

As a result, I lowered the lwip.mem-size for Freescale boards from 36560 to 16384 (~20 kB less), which still seems to pass the Cloud Client FW update test.

kjbracey · 2019-01-03

If 16K is required with 2 RX buffers, that does suggest cloud wants about 12K of lwIP mem workspace, so maybe that should be the default mem-size? (I believe we assume that target defaults should be good for cloud client). And then other targets should be doing 12K + their buffer size?

I guess the issue is that any upward increase like that might not be appropriate for a patch release, so should be done separately. Something to follow up.

Kinetis EMAC is consuming 16 ring buffers for input and 8 buffers for output. This is overuse, because input packets are immediately allocated from the heap when passed to LwIP. Therefore the number can be greatly reduced.

SeppoTakalo · 2019-01-02T17:12:42Z

Updated the PR based on review feedback.

Should now be ready for testing.

cmonr · 2019-01-03T00:34:52Z

@kjbracey-arm @michalpasztamobica @yogpan01 @teetak01 When y'all get a chance.

SeppoTakalo · 2019-01-03T09:17:44Z

@0xc0170 Please start the tests for this.

0xc0170 · 2019-01-03T09:21:24Z

We will, as soon as the 5.11.1 RC completes (currently in CI, should be completed in 1-2 h).

0xc0170 · 2019-01-03T10:51:26Z

Test started

0xc0170 · 2019-01-03T13:40:07Z

Exporter failures are related to the CI packages. We are investigating what has changed and will restart once the root cause is found. cc @ARMmbed/mbed-os-test

0xc0170 · 2019-01-03T14:10:16Z

Exporter restarted

mbed-ci · 2019-01-03T17:31:36Z

Test run: FAILED

Summary: 2 of 11 test jobs failed
Build number : 1
Build artifacts

Failed test jobs:

jenkins-ci/mbed-os-ci_exporter
jenkins-ci/mbed-os-ci_greentea-test

cmonr · 2019-01-04T01:22:55Z

CI job restarted: jenkins-ci/mbed-os-ci_greentea-test

ciarmcom requested review from kjbracey, teetak01, yogpan01 and a team December 21, 2018 14:00

ciarmcom added the needs: review label Dec 21, 2018

0xc0170 approved these changes Dec 21, 2018

michalpasztamobica reviewed Jan 2, 2019

kjbracey reviewed Jan 2, 2019

0xc0170 added needs: work and removed needs: review labels Jan 2, 2019

Seppo Takalo added 2 commits January 2, 2019 19:05

Don't consume 36 kB just for Ethernet buffers.

50eb243

Kinetis EMAC is consuming 16 ring buffers for input and 8 buffers for output. This is overuse, because input packets are immediately allocated from the heap when passed to LwIP. Therefore the number can be greatly reduced.

Allow LwIP TCP retransmissions to be configured and tune those smaller.

f3bbd2b

Currently, LwIP's maximum number of segment retransmissions is 12, which takes a very long time because each timeout doubles the retransmission timeout. Reduce it to 6, which is the same value we use in Nanostack.

SeppoTakalo force-pushed the lwip_tcp_timeout branch from ed18962 to f3bbd2b Compare January 2, 2019 17:05

cmonr added needs: review and removed needs: work labels Jan 3, 2019

kjbracey approved these changes Jan 3, 2019

0xc0170 added the release-version: 5.11.2 label Jan 3, 2019

0xc0170 added needs: CI and removed needs: review labels Jan 3, 2019

teetak01 approved these changes Jan 3, 2019

0xc0170 added ready for merge and removed needs: CI labels Jan 4, 2019

0xc0170 merged commit 5a2469d into ARMmbed:master Jan 4, 2019

0xc0170 removed the ready for merge label Jan 4, 2019

SeppoTakalo deleted the lwip_tcp_timeout branch January 7, 2019 09:03
