Commit 24adbc1

Eric Dumazet authored and davem330 committed
tcp: fix SO_RCVLOWAT hangs with fat skbs
We autotune rcvbuf whenever SO_RCVLOWAT is set to account for 100% overhead in tcp_set_rcvlowat().

This works well when skb->len/skb->truesize ratio is bigger than 0.5.

But if we receive packets with small MSS, we can end up in a situation where not enough bytes are available in the receive queue to satisfy RCVLOWAT setting. As our sk_rcvbuf limit is hit, we send zero windows in ACK packets, preventing remote peer from sending more data.

Even autotuning does not help, because it only triggers at the time user process drains the queue. If no EPOLLIN is generated, this can not happen.

Note poll() has a similar issue, after commit c700448 ("tcp: Respect SO_RCVLOWAT in tcp_poll().")

Fixes: 03f45c8 ("tcp: avoid extra wakeups for SO_RCVLOWAT users")
Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
1 parent 92db978 commit 24adbc1
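
The stall described in the commit message can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions: `payload_when_full()` and `would_stall()` are hypothetical helpers (not kernel functions), the receive buffer is assumed to be exactly twice SO_RCVLOWAT, and every skb is assumed to have the same payload length and truesize.

```c
#include <assert.h>
#include <stdbool.h>

/* Payload bytes queued by the time the modelled buffer limit is reached.
 * Simplified: an skb is accepted only if its truesize still fits.
 */
static int payload_when_full(int rcvbuf, int skb_len, int skb_truesize)
{
	int rmem = 0;	/* memory charged to the socket (sum of truesize) */
	int avail = 0;	/* payload bytes the reader could consume */

	assert(skb_len > 0 && skb_truesize >= skb_len);
	while (rmem + skb_truesize <= rcvbuf) {
		rmem += skb_truesize;
		avail += skb_len;
	}
	return avail;
}

/* True when the buffer fills before SO_RCVLOWAT bytes are available,
 * i.e. the reader is never woken and the peer sees a zero window.
 */
static bool would_stall(int rcvlowat, int skb_len, int skb_truesize)
{
	int rcvbuf = 2 * rcvlowat;	/* assumed 100% overhead provisioning */

	return payload_when_full(rcvbuf, skb_len, skb_truesize) < rcvlowat;
}
```

With a len/truesize ratio above 0.5 (for example 1000/1500) the model reaches the low-water mark before the buffer fills; with a small-MSS ratio such as 100/768 it does not, which is the hang this commit addresses.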

3 files changed: +26 -4 lines changed

include/net/tcp.h

Lines changed: 13 additions & 0 deletions
@@ -1420,6 +1420,19 @@ static inline int tcp_full_space(const struct sock *sk)
 	return tcp_win_from_space(sk, READ_ONCE(sk->sk_rcvbuf));
 }
 
+/* We provision sk_rcvbuf around 200% of sk_rcvlowat.
+ * If 87.5 % (7/8) of the space has been consumed, we want to override
+ * SO_RCVLOWAT constraint, since we are receiving skbs with too small
+ * len/truesize ratio.
+ */
+static inline bool tcp_rmem_pressure(const struct sock *sk)
+{
+	int rcvbuf = READ_ONCE(sk->sk_rcvbuf);
+	int threshold = rcvbuf - (rcvbuf >> 3);
+
+	return atomic_read(&sk->sk_rmem_alloc) > threshold;
+}
+
 extern void tcp_openreq_init_rwin(struct request_sock *req,
 				  const struct sock *sk_listener,
 				  const struct dst_entry *dst);
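
The 7/8 threshold computed by `tcp_rmem_pressure()` can be checked with plain integers. The following is a user-space sketch, not kernel code: `rmem_pressure_threshold()` and `rmem_pressure()` are hypothetical stand-ins that take `rcvbuf` and `rmem_alloc` as arguments instead of reading them from a `struct sock`.

```c
#include <assert.h>
#include <stdbool.h>

/* rcvbuf minus one eighth of rcvbuf, i.e. 87.5% (7/8) of the buffer. */
static int rmem_pressure_threshold(int rcvbuf)
{
	assert(rcvbuf >= 0);
	return rcvbuf - (rcvbuf >> 3);
}

/* Mirrors the strict '>' comparison used in the patch. */
static bool rmem_pressure(int rmem_alloc, int rcvbuf)
{
	return rmem_alloc > rmem_pressure_threshold(rcvbuf);
}
```

For a 131072-byte buffer the threshold is 114688 bytes; because the comparison is strict, pressure is reported only once the allocated memory exceeds that value.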

net/ipv4/tcp.c

Lines changed: 11 additions & 3 deletions
@@ -476,9 +476,17 @@ static void tcp_tx_timestamp(struct sock *sk, u16 tsflags)
 static inline bool tcp_stream_is_readable(const struct tcp_sock *tp,
 					  int target, struct sock *sk)
 {
-	return (READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq) >= target) ||
-		(sk->sk_prot->stream_memory_read ?
-		sk->sk_prot->stream_memory_read(sk) : false);
+	int avail = READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq);
+
+	if (avail > 0) {
+		if (avail >= target)
+			return true;
+		if (tcp_rmem_pressure(sk))
+			return true;
+	}
+	if (sk->sk_prot->stream_memory_read)
+		return sk->sk_prot->stream_memory_read(sk);
+	return false;
 }
 
 /*
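
The decision order in the new `tcp_stream_is_readable()` can be modelled in user space as follows. This is a sketch under assumptions: `stream_is_readable()` is a hypothetical function taking plain values, and `proto_has_data` stands in for the result of the optional `stream_memory_read` hook (false when the hook is absent).

```c
#include <assert.h>
#include <stdbool.h>

static bool rmem_pressure(int rmem_alloc, int rcvbuf)
{
	return rmem_alloc > rcvbuf - (rcvbuf >> 3);
}

static bool stream_is_readable(int avail, int target, int rmem_alloc,
			       int rcvbuf, bool proto_has_data)
{
	assert(rcvbuf >= 0);
	if (avail > 0) {
		/* Enough bytes queued to satisfy the low-water mark. */
		if (avail >= target)
			return true;
		/* Not enough bytes, but the buffer is nearly full. */
		if (rmem_pressure(rmem_alloc, rcvbuf))
			return true;
	}
	/* Otherwise defer to the protocol hook, if any. */
	return proto_has_data;
}
```

Note that memory pressure alone does not make the socket readable in this model: at least one byte must be available, matching the `avail > 0` guard in the patch.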

net/ipv4/tcp_input.c

Lines changed: 2 additions & 1 deletion
@@ -4757,7 +4757,8 @@ void tcp_data_ready(struct sock *sk)
 	const struct tcp_sock *tp = tcp_sk(sk);
 	int avail = tp->rcv_nxt - tp->copied_seq;
 
-	if (avail < sk->sk_rcvlowat && !sock_flag(sk, SOCK_DONE))
+	if (avail < sk->sk_rcvlowat && !tcp_rmem_pressure(sk) &&
+	    !sock_flag(sk, SOCK_DONE))
 		return;
 
 	sk->sk_data_ready(sk);
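
To see where the old and new wakeup gates in `tcp_data_ready()` diverge, the two conditions can be compared side by side. This is an illustrative user-space sketch: `wakes_reader_old()` and `wakes_reader_new()` are hypothetical names, and `done` stands in for `sock_flag(sk, SOCK_DONE)`.

```c
#include <assert.h>
#include <stdbool.h>

static bool rmem_pressure(int rmem_alloc, int rcvbuf)
{
	return rmem_alloc > rcvbuf - (rcvbuf >> 3);
}

/* Gate before the patch: true means sk_data_ready() would be called. */
static bool wakes_reader_old(int avail, int rcvlowat, bool done)
{
	return !(avail < rcvlowat && !done);
}

/* Gate after the patch: memory pressure also triggers the wakeup. */
static bool wakes_reader_new(int avail, int rcvlowat, int rmem_alloc,
			     int rcvbuf, bool done)
{
	assert(rcvbuf >= 0);
	return !(avail < rcvlowat && !rmem_pressure(rmem_alloc, rcvbuf) &&
		 !done);
}
```

The two gates agree except when fewer than `rcvlowat` bytes are queued and the buffer is more than 7/8 full; in that case only the new gate wakes the reader, letting it drain the queue and reopen the window.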
