@@ -571,7 +571,97 @@ duplicate packet is received.
571
571
572
572
* TcpExtTCPDSACKOfoRecv
573
573
The TCP stack receives a DSACK, which indicates an out of order
574
- duplciate packet is received.
574
+ duplicate packet is received.
575
+
576
+ TCP out of order
577
+ ================
578
+ * TcpExtTCPOFOQueue
579
+ The TCP layer receives an out of order packet and has enough memory
580
+ to queue it.
581
+
582
+ * TcpExtTCPOFODrop
583
+ The TCP layer receives an out of order packet but doesn't have enough
584
+ memory, so it drops the packet. Such packets are not counted in
585
+ TcpExtTCPOFOQueue.
586
+
587
+ * TcpExtTCPOFOMerge
588
+ The received out of order packet overlaps with the previous
589
+ packet. The overlapping part will be dropped. All TcpExtTCPOFOMerge
590
+ packets will also be counted into TcpExtTCPOFOQueue.
591
+
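As a rough illustration of how these three counters could be read side by side, the sketch below parses /proc/net/netstat-style text. The helper name tcpext_counter is hypothetical (it is not part of nstat or the kernel). It assumes the usual two-line layout, in which a "TcpExt:" line of field names is followed by a "TcpExt:" line of values, and that the names in that file lack the "TcpExt" prefix which nstat prepends.

```shell
# Hypothetical helper: print one TcpExt counter from /proc/net/netstat-style
# text read on stdin. Accepts the name with or without the "TcpExt" prefix.
tcpext_counter() {
    awk -v want="${1#TcpExt}" '
        # First "TcpExt:" line: remember the column index of every field name.
        $1 == "TcpExt:" && !have_header {
            for (i = 2; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        # Second "TcpExt:" line: print the value in the remembered column.
        $1 == "TcpExt:" && have_header {
            if (want in col) print $(col[want])
            exit
        }
    '
}
```

For example, `tcpext_counter TcpExtTCPOFOQueue < /proc/net/netstat` would print the absolute value of that counter, which can then be compared with TcpExtTCPOFODrop and TcpExtTCPOFOMerge.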
592
+ TCP PAWS
593
+ ========
594
+ PAWS (Protection Against Wrapped Sequence numbers) is an algorithm
595
+ which is used to drop old packets. It depends on the TCP
596
+ timestamps. For detailed information, please refer to the `timestamp wiki`_
597
+ and the `RFC of PAWS`_.
598
+
599
+ .. _RFC of PAWS: https://tools.ietf.org/html/rfc1323#page-17
600
+ .. _timestamp wiki: https://en.wikipedia.org/wiki/Transmission_Control_Protocol#TCP_timestamps
601
+
602
+ * TcpExtPAWSActive
603
+ Packets are dropped by PAWS in Syn-Sent status.
604
+
605
+ * TcpExtPAWSEstab
606
+ Packets are dropped by PAWS in any status other than Syn-Sent.
607
+
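The core of the PAWS test can be sketched as a wraparound-aware timestamp comparison. This is a simplified illustration only, not the kernel's implementation: the function name paws_is_old is hypothetical, and the sketch ignores the rule that lets an idle connection accept an older timestamp. A segment counts as old when its timestamp value is behind the most recently recorded one in 32-bit modular arithmetic.

```shell
# Hypothetical sketch: succeed (exit status 0) when timestamp value $1 is
# older than the most recently recorded timestamp $2. The difference is
# reduced modulo 2^32; values in the upper half of the range mean that $1
# is "behind" $2 even if the 32-bit timestamp has wrapped around.
paws_is_old() {
    paws_diff=$(( ($1 - $2) & 4294967295 ))
    [ "$paws_diff" -ge 2147483648 ]
}
```

With this definition, `paws_is_old 100 200` succeeds (the segment is old), while `paws_is_old 5 4294967290` fails because the timestamp has wrapped and 5 is actually newer.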
608
+ TCP ACK skip
609
+ ============
610
+ In some scenarios, the kernel avoids sending duplicate ACKs too
611
+ frequently. Please find more details in the tcp_invalid_ratelimit
612
+ section of the `sysctl document`_. When the kernel decides to skip an
613
+ ACK due to tcp_invalid_ratelimit, it updates one of the counters
614
+ below to indicate the scenario in which the ACK was skipped. The ACK
615
+ is only skipped if the received packet is either a SYN packet or
616
+ has no data.
617
+
618
+ .. _sysctl document: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt
619
+
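The rate limiting described above can be modelled with a small sketch. It is an assumption-laden illustration rather than the kernel's code: the names on_invalid_packet, last_dupack_ms, skipped and decision are hypothetical, timestamps are passed in as plain milliseconds, and the 500 ms limit is an assumed value that should be checked with `sysctl net.ipv4.tcp_invalid_ratelimit` on the system under test.

```shell
# Hypothetical model of duplicate-ACK rate limiting.
RATELIMIT_MS=500      # assumed limit in milliseconds
last_dupack_ms=""     # time the last duplicate ACK was sent (empty: never)
skipped=0             # stands in for a TcpExtTCPACKSkipped* counter
decision=""           # result of the most recent call: "send" or "skip"

# $1: current time in milliseconds when a packet arrives that would
# normally be answered with a duplicate ACK.
on_invalid_packet() {
    if [ -n "$last_dupack_ms" ] &&
       [ $(( $1 - last_dupack_ms )) -lt "$RATELIMIT_MS" ]; then
        # Too soon after the previous duplicate ACK: skip and count it.
        skipped=$(( skipped + 1 ))
        decision=skip
    else
        # Send the ACK and remember when it was sent.
        last_dupack_ms=$1
        decision=send
    fi
}
```

Calling `on_invalid_packet 0` and then `on_invalid_packet 100` in this model sends the first ACK and skips the second, leaving `skipped` at 1.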
620
+ * TcpExtTCPACKSkippedSynRecv
621
+ The ACK is skipped in Syn-Recv status. The Syn-Recv status means the
622
+ TCP stack has received a SYN and replied with a SYN+ACK, and is now
623
+ waiting for an ACK. Generally, the TCP stack doesn't need to send an ACK
624
+ in the Syn-Recv status. But in several scenarios, the TCP stack needs
625
+ to send an ACK. E.g., the TCP stack receives the same SYN packet
626
+ repeatedly, the received packet does not pass the PAWS check, or the
627
+ received packet sequence number is out of window. In these scenarios,
628
+ the TCP stack needs to send an ACK. If the ACK sending frequency is higher than
629
+ tcp_invalid_ratelimit allows, the TCP stack will skip sending the ACK and
630
+ increase TcpExtTCPACKSkippedSynRecv.
631
+
632
+
633
+ * TcpExtTCPACKSkippedPAWS
634
+ The ACK is skipped because the PAWS (Protection Against Wrapped Sequence
635
+ numbers) check fails. If the PAWS check fails in Syn-Recv, Fin-Wait-2
636
+ or Time-Wait status, the skipped ACK is counted in
637
+ TcpExtTCPACKSkippedSynRecv, TcpExtTCPACKSkippedFinWait2 or
638
+ TcpExtTCPACKSkippedTimeWait. In all other statuses, the skipped ACK
639
+ is counted in TcpExtTCPACKSkippedPAWS.
640
+
641
+ * TcpExtTCPACKSkippedSeq
642
+ The ACK is skipped because the sequence number is out of window, while the
643
+ timestamp passes the PAWS check and the TCP status is not Syn-Recv, Fin-Wait-2, or Time-Wait.
644
+
645
+ * TcpExtTCPACKSkippedFinWait2
646
+ The ACK is skipped in Fin-Wait-2 status. The reason is either that the
647
+ PAWS check fails or that the received sequence number is out of window.
648
+
649
+ * TcpExtTCPACKSkippedTimeWait
650
+ The ACK is skipped in Time-Wait status. The reason is either that the
651
+ PAWS check fails or that the received sequence number is out of window.
652
+
653
+ * TcpExtTCPACKSkippedChallenge
654
+ The ACK is skipped if the ACK is a challenge ACK. RFC 5961 defines
655
+ 3 kinds of challenge ACK; please refer to `RFC 5961 section 3.2`_,
656
+ `RFC 5961 section 4.2`_ and `RFC 5961 section 5.2`_. Besides these
657
+ three scenarios, in some TCP statuses, the linux TCP stack would also
658
+ send challenge ACKs if the ACK number is before the first
659
+ unacknowledged number (more strict than `RFC 5961 section 5.2`_).
660
+
661
+ .. _RFC 5961 section 3.2: https://tools.ietf.org/html/rfc5961#page-7
662
+ .. _RFC 5961 section 4.2: https://tools.ietf.org/html/rfc5961#page-9
663
+ .. _RFC 5961 section 5.2: https://tools.ietf.org/html/rfc5961#page-11
664
+
575
665
576
666
examples
577
667
========
@@ -1188,3 +1278,151 @@ Run nstat on server B::
1188
1278
We have deleted the default route on server B. Server B couldn't find
1189
1279
a route for the 8.8.8.8 IP address, so server B increased
1190
1280
IpOutNoRoutes.
1281
+
1282
+ TcpExtTCPACKSkippedSynRecv
1283
+ --------------------------
1284
+ In this test, we send 3 identical SYN packets from client to server. The
1285
+ first SYN makes the server create a socket, set it to Syn-Recv status,
1286
+ and reply with a SYN/ACK. The second SYN makes the server reply with the SYN/ACK
1287
+ again and record the reply time (the duplicate ACK reply time). The
1288
+ third SYN makes the server check the previous duplicate ACK reply time
1289
+ and decide to skip the duplicate ACK, then increase the
1290
+ TcpExtTCPACKSkippedSynRecv counter.
1291
+
1292
+ Run tcpdump to capture a SYN packet::
1293
+
1294
+ nstatuser@nstat-a:~$ sudo tcpdump -c 1 -w /tmp/syn.pcap port 9000
1295
+ tcpdump: listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
1296
+
1297
+ Open another terminal and run the nc command::
1298
+
1299
+ nstatuser@nstat-a:~$ nc nstat-b 9000
1300
+
1301
+ As nstat-b didn't listen on port 9000, it should reply with a RST, and
1302
+ the nc command exited immediately. That was enough for the tcpdump
1303
+ command to capture a SYN packet. A linux server might use hardware
1304
+ offload for the TCP checksum, so the checksum in /tmp/syn.pcap
1305
+ might not be correct. We call tcprewrite to fix it::
1306
+
1307
+ nstatuser@nstat-a:~$ tcprewrite --infile=/tmp/syn.pcap --outfile=/tmp/syn_fixcsum.pcap --fixcsum
1308
+
1309
+ On nstat-b, we run nc to listen on port 9000::
1310
+
1311
+ nstatuser@nstat-b:~$ nc -lkv 9000
1312
+ Listening on [0.0.0.0] (family 0, port 9000)
1313
+
1314
+ On nstat-a, we block packets from port 9000; otherwise nstat-a would send
1315
+ a RST to nstat-b::
1316
+
1317
+ nstatuser@nstat-a:~$ sudo iptables -A INPUT -p tcp --sport 9000 -j DROP
1318
+
1319
+ Send the SYN 3 times to nstat-b::
1320
+
1321
+ nstatuser@nstat-a:~$ for i in {1..3}; do sudo tcpreplay -i ens3 /tmp/syn_fixcsum.pcap; done
1322
+
1323
+ Check the snmp counter on nstat-b::
1324
+
1325
+ nstatuser@nstat-b:~$ nstat | grep -i skip
1326
+ TcpExtTCPACKSkippedSynRecv 1 0.0
1327
+
1328
+ As we expected, TcpExtTCPACKSkippedSynRecv is 1.
1329
+
1330
+ TcpExtTCPACKSkippedPAWS
1331
+ -----------------------
1332
+ To trigger PAWS, we could send an old SYN.
1333
+
1334
+ On nstat-b, let nc listen on port 9000::
1335
+
1336
+ nstatuser@nstat-b:~$ nc -lkv 9000
1337
+ Listening on [0.0.0.0] (family 0, port 9000)
1338
+
1339
+ On nstat-a, run tcpdump to capture a SYN::
1340
+
1341
+ nstatuser@nstat-a:~$ sudo tcpdump -w /tmp/paws_pre.pcap -c 1 port 9000
1342
+ tcpdump: listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
1343
+
1344
+ On nstat-a, run nc as a client to connect to nstat-b::
1345
+
1346
+ nstatuser@nstat-a:~$ nc -v nstat-b 9000
1347
+ Connection to nstat-b 9000 port [tcp/*] succeeded!
1348
+
1349
+ Now tcpdump has captured the SYN and exited. We should fix the
1350
+ checksum::
1351
+
1352
+ nstatuser@nstat-a:~$ tcprewrite --infile /tmp/paws_pre.pcap --outfile /tmp/paws.pcap --fixcsum
1353
+
1354
+ Send the SYN packet twice::
1355
+
1356
+ nstatuser@nstat-a:~$ for i in {1..2}; do sudo tcpreplay -i ens3 /tmp/paws.pcap; done
1357
+
1358
+ On nstat-b, check the snmp counter::
1359
+
1360
+ nstatuser@nstat-b:~$ nstat | grep -i skip
1361
+ TcpExtTCPACKSkippedPAWS 1 0.0
1362
+
1363
+ We sent two SYNs via tcpreplay, and both of them failed the PAWS
1364
+ check. nstat-b replied with an ACK for the first SYN, skipped the ACK
1365
+ for the second SYN, and updated TcpExtTCPACKSkippedPAWS.
1366
+
1367
+ TcpExtTCPACKSkippedSeq
1368
+ ----------------------
1369
+ To trigger TcpExtTCPACKSkippedSeq, we send packets which have a valid
1370
+ timestamp (to pass the PAWS check) but a sequence number that is out of
1371
+ window. The linux TCP stack does not skip the ACK if the packet has
1372
+ data, so we need a pure ACK packet. To generate such a packet, we
1373
+ could create two sockets: one on port 9000, another on port 9001. Then
1374
+ we capture an ACK on port 9001 and change the source/destination port
1375
+ numbers to match the port 9000 socket. Then we could trigger
1376
+ TcpExtTCPACKSkippedSeq via this packet.
1377
+
1378
+ On nstat-b, open two terminals, run two nc commands to listen on both
1379
+ port 9000 and port 9001::
1380
+
1381
+ nstatuser@nstat-b:~$ nc -lkv 9000
1382
+ Listening on [0.0.0.0] (family 0, port 9000)
1383
+
1384
+ nstatuser@nstat-b:~$ nc -lkv 9001
1385
+ Listening on [0.0.0.0] (family 0, port 9001)
1386
+
1387
+ On nstat-a, run two nc clients::
1388
+
1389
+ nstatuser@nstat-a:~$ nc -v nstat-b 9000
1390
+ Connection to nstat-b 9000 port [tcp/*] succeeded!
1391
+
1392
+ nstatuser@nstat-a:~$ nc -v nstat-b 9001
1393
+ Connection to nstat-b 9001 port [tcp/*] succeeded!
1394
+
1395
+ On nstat-a, run tcpdump to capture an ACK::
1396
+
1397
+ nstatuser@nstat-a:~$ sudo tcpdump -w /tmp/seq_pre.pcap -c 1 dst port 9001
1398
+ tcpdump: listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
1399
+
1400
+ On nstat-b, send a packet via the port 9001 socket. E.g. we sent a
1401
+ string 'foo' in our example::
1402
+
1403
+ nstatuser@nstat-b:~$ nc -lkv 9001
1404
+ Listening on [0.0.0.0] (family 0, port 9001)
1405
+ Connection from nstat-a 42132 received!
1406
+ foo
1407
+
1408
+ On nstat-a, tcpdump should have captured the ACK. We should check
1409
+ the source port numbers of the two nc clients::
1410
+
1411
+ nstatuser@nstat-a:~$ ss -ta '( dport = :9000 || dport = :9001 )' | tee
1412
+ State Recv-Q Send-Q Local Address:Port Peer Address:Port
1413
+ ESTAB 0 0 192.168.122.250:50208 192.168.122.251:9000
1414
+ ESTAB 0 0 192.168.122.250:42132 192.168.122.251:9001
1415
+
1416
+ Run tcprewrite, change port 9001 to port 9000, and change port 42132 to
1417
+ port 50208::
1418
+
1419
+ nstatuser@nstat-a:~$ tcprewrite --infile /tmp/seq_pre.pcap --outfile /tmp/seq.pcap -r 9001:9000 -r 42132:50208 --fixcsum
1420
+
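The client port numbers differ on every run, so the two port mappings passed to tcprewrite have to be read from the ss output each time. A minimal sketch of that lookup follows. The helper name local_port_for is hypothetical, and it assumes numeric ss output (for example from `ss -tan`) with the local address in the fourth column and the peer address in the fifth, as in the listing above.

```shell
# Hypothetical helper: read ss output on stdin and print the local (client)
# port of the established connection whose peer port is $1.
local_port_for() {
    awk -v dport="$1" '
        $1 == "ESTAB" {
            # The port is the part after the last ":" of each address.
            n = split($5, peer, ":")
            m = split($4, loc, ":")
            if (peer[n] == dport) { print loc[m]; exit }
        }
    '
}
```

With the connections shown above, `local_port_for 9001` would print 42132 and `local_port_for 9000` would print 50208, which are the values used in the second port mapping.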
1421
+ Now the /tmp/seq.pcap is the packet we need. Send it to nstat-b::
1422
+
1423
+ nstatuser@nstat-a:~$ for i in {1..2}; do sudo tcpreplay -i ens3 /tmp/seq.pcap; done
1424
+
1425
+ Check TcpExtTCPACKSkippedSeq on nstat-b::
1426
+
1427
+ nstatuser@nstat-b:~$ nstat | grep -i skip
1428
+ TcpExtTCPACKSkippedSeq 1 0.0