Skip to content

Allow LwIP TCP retransmissions to be configured and tune those smaller. #9183

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 4, 2019

Conversation

SeppoTakalo
Copy link
Contributor

Description

Allow LwIP TCP retransmissions to be configured and tune those smaller.

Currently, LwIP segment retransmission time is 12, which is very long
time as each timeout doubles the retransmission timeout.
Make that to 6 as that is same what we use in Nanostack.

LwIP TCP engine uses 500 ms slow timer and each retransmit will double the previous timeout.

          /* Double retransmission time-out unless we are trying to
           * connect to somebody (i.e., we are in SYN_SENT). */
          if (pcb->state != SYN_SENT) {
            u8_t backoff_idx = LWIP_MIN(pcb->nrtx, sizeof(tcp_backoff)-1);
            pcb->rto = ((pcb->sa >> 3) + pcb->sv) << tcp_backoff[backoff_idx];
          }

TCP timeouts come from table:

static const u8_t tcp_backoff[13] =
    { 1, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7, 7, 7};

And to my best understanding, this pcb->sv is 6. so 6 << tcp_backoff[7] will already lead to 6 minutes. Going through all timeouts will eventually take something like 40 minutes, unless I'm mistaken by values of pcb->sa and pcb->sv

This will fix the issue where Client+HTTP update stops into DNS queries not going through because timeouted TCP sessions have already eaten the full heap.

Also change the number of Freescale EMAC driver ring buffers down to 3. Earlier it was using 16+8 buffers, each 1500 bytes. This was the biggest consumer of LwIP heap.

Pull request type

[X] Fix
[ ] Refactor
[ ] Target update
[ ] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

Reviewers

@kjbracey-arm
@teetak01
@yogpan01

@ciarmcom ciarmcom requested review from kjbracey, teetak01, yogpan01 and a team December 21, 2018 14:00
@ciarmcom
Copy link
Member

@SeppoTakalo, thank you for your changes.
@yogpan01 @teetak01 @kjbracey-arm@ARMmbed/mbed-os-ipcore @ARMmbed/mbed-os-maintainers please review.

@0xc0170
Copy link
Contributor

0xc0170 commented Jan 2, 2019

Please review reviewers

@@ -1,7 +1,7 @@
{
"name": "kinetis-emac",
"config": {
"rx-ring-len": 16,
"tx-ring-len": 8
"rx-ring-len": 2,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing this without correspondingly reducing the lwIP mem config is a little "sneaky" - effectively you've giving this specific platform a huge memory pool for lwIP, because it was set big in a target override apparently to make room for these buffers.

I'd like you to either

a) reduce the target-override memory pool for lwIP by the corresponding amount - if you can (but I assume you do need the extra space, so you can't).
b) reduce by less, and raise the default memory pool (and other target overrides?) by the difference - so that all platforms get the extra memory you've found you need. Maybe the other targets with big pools can also have their buffer count reduced, so they don't need to get bigger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll lower the lwip.mem-size value for some amount..
This 16, was actually causing memory_manager->alloc_heap(ENET_ETH_MAX_FLEN, ENET_BUFF_ALIGNMENT); called 16 times.. Where ENET_ETH_MAX_FLEN is 1522 bytes and alignment is 16. I'm unsure of how much the allocator overhead is.

Total memory usage is something between 24352 bytes 24608 bytes.

As a result, I lowered the lwip.mem-size for Freescale boards from 36560 to 16384 (~20 kB less), which still seems to pass the Cloud Client FW update test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If 16K is required with 2 RX buffers, that does suggest cloud wants about 12K of lwIP mem workspace, so maybe that should be the default mem-size? (I believe we assume that target defaults should be good for cloud client). And then other targets should be doing 12K + their buffer size?

I guess the issue is that any upward increase like that might not be appropriate for a patch release, so should be done separately. Something to follow up.

Seppo Takalo added 2 commits January 2, 2019 19:05
Kintis EMAC is consuming 16 rinbuffers for input, and 8 buffers for output.
This is over-use because input packets are immediately allocated from
heap when passed to LwIP. Therefore the number can be creatly reduced.
Currently, LwIP segment retransmission time is 12, which is very long
time as each timeout doubles the retransmission timeout.
Make that to 6 as that is same what we use in Nanostack.
@SeppoTakalo
Copy link
Contributor Author

Updated the PR based on review feedback.

Should now be ready for testing.

@cmonr
Copy link
Contributor

cmonr commented Jan 3, 2019

@kjbracey-arm @michalpasztamobica @yogpan01 @teetak01 When y'all get a chance.

@SeppoTakalo
Copy link
Contributor Author

@0xc0170 Please start the tests for this.

@0xc0170
Copy link
Contributor

0xc0170 commented Jan 3, 2019

We will do as soon as 5.11.1 RC completes (currently in CI, should be completed in 1-2h),

@0xc0170
Copy link
Contributor

0xc0170 commented Jan 3, 2019

Test started

@0xc0170
Copy link
Contributor

0xc0170 commented Jan 3, 2019

Exporters failures are related to the CI packages, we are investigating what has changed, will restart once the root cause found cc @ARMmbed/mbed-os-test

@0xc0170
Copy link
Contributor

0xc0170 commented Jan 3, 2019

Exporter restarted

@mbed-ci
Copy link

mbed-ci commented Jan 3, 2019

Test run: FAILED

Summary: 2 of 11 test jobs failed
Build number : 1
Build artifacts

Failed test jobs:

  • jenkins-ci/mbed-os-ci_exporter
  • jenkins-ci/mbed-os-ci_greentea-test

@cmonr
Copy link
Contributor

cmonr commented Jan 4, 2019

CI job restarted: jenkins-ci/mbed-os-ci_greentea-test

@0xc0170 0xc0170 merged commit 5a2469d into ARMmbed:master Jan 4, 2019
@SeppoTakalo SeppoTakalo deleted the lwip_tcp_timeout branch January 7, 2019 09:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants