
Commit 8470e43

Merge branch 'net-cacheline-optimizations'
Coco Li says:

====================
Analyze and Reorganize core Networking Structs to optimize cacheline consumption

Currently, variable-heavy structs in the networking stack is organized chronologically, logically and sometimes by cacheline access.

This patch series attempts to reorganize the core networking stack variables to minimize cacheline consumption during the phase of data transfer. Specifically, we looked at the TCP/IP stack and the fast path definition in TCP.

For documentation purposes, we also added new files for each core data structure we considered, although not all ended up being modified due to the amount of existing cacheline they span in the fast path. In the documentation, we recorded all variables we identified on the fast path and the reasons. We also hope that in the future when variables are added/modified, the document can be referred to and updated accordingly to reflect the latest variable organization.

Tested:
Our tests were run with neper tcp_rr using tcp traffic. The tests have $cpu number of threads and variable number of flows (see below).

Tests were run on 6.5-rc1

Efficiency is computed as cpu seconds / throughput (one tcp_rr round trip). The following result shows efficiency delta before and after the patch series is applied.

On AMD platforms with 100Gb/s NIC and 256Mb L3 cache:

IPv4
Flows   with patches      clean kernel      Percent reduction
30k     0.0001736538065   0.0002741191042   -36.65%
20k     0.0001583661752   0.0002712559158   -41.62%
10k     0.0001639148817   0.0002951800751   -44.47%
5k      0.0001859683866   0.0003320642536   -44.00%
1k      0.0002035190546   0.0003152056382   -35.43%

IPv6
Flows   with patches      clean kernel      Percent reduction
30k     0.000202535503    0.0003275329163   -38.16%
20k     0.0002020654777   0.0003411304786   -40.77%
10k     0.0002122427035   0.0003803674705   -44.20%
5k      0.0002348776729   0.0004030403953   -41.72%
1k      0.0002237384583   0.0002813646157   -20.48%

On Intel platforms with 200Gb/s NIC and 105Mb L3 cache:

IPv6
Flows   with patches      clean kernel      Percent reduction
30k     0.0006296537873   0.0006370427753   -1.16%
20k     0.0003451029365   0.0003628016076   -4.88%
10k     0.0003187646958   0.0003346835645   -4.76%
5k      0.0002954676348   0.000311807592    -5.24%
1k      0.0001909169342   0.0001848069709   3.31%

v8 changes:
1. Update net_device_read_txrx cache group maximum
2. Update MAINTAINERS for documentations
3. Skip __cache_group variables in scripts/kernel-doc
====================

Signed-off-by: David S. Miller <[email protected]>
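The "Percent reduction" column can be reproduced from the two efficiency columns. The following is a minimal sketch in C, assuming the column is the relative change of the patched value against the clean-kernel value; the function names `efficiency` and `percent_change` are hypothetical and are not part of the patch series or of neper.

```c
#include <assert.h>

/*
 * Efficiency as defined in the commit message: CPU seconds spent per
 * tcp_rr round trip. Lower values mean less CPU per transaction.
 * (Hypothetical helper, for illustration only.)
 */
static double efficiency(double cpu_seconds, double round_trips)
{
	assert(round_trips > 0.0);
	return cpu_seconds / round_trips;
}

/*
 * Relative change of the patched kernel versus the clean kernel, in
 * percent. A negative result means the patched kernel used less CPU
 * per round trip. (Assumed formula; the commit message does not spell
 * it out, but it matches the table values.)
 */
static double percent_change(double patched, double clean)
{
	assert(clean > 0.0);
	return (patched - clean) / clean * 100.0;
}
```

For example, `percent_change(0.0001736538065, 0.0002741191042)` evaluates to roughly -36.65, which matches the AMD IPv4 30k-flow row, and the Intel 1k-flow row comes out positive (about 3.31), i.e. a small regression in that configuration.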
2 parents 7453d7a + 18fd64d commit 8470e43

13 files changed: 842 additions, 15 deletions
Documentation/networking/index.rst

Lines changed: 1 addition & 0 deletions
@@ -75,6 +75,7 @@ Contents:
    mptcp-sysctl
    multiqueue
    napi
+   net_cachelines/index
    netconsole
    netdev-features
    netdevices
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright (C) 2023 Google LLC
+===================================
+Common Networking Struct Cachelines
+===================================
+
+.. toctree::
+   :maxdepth: 1
+
+   inet_connection_sock
+   inet_sock
+   net_device
+   netns_ipv4_sysctl
+   snmp
+   tcp_sock
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright (C) 2023 Google LLC
+=====================================================
+inet_connection_sock struct fast path usage breakdown
+=====================================================
+
+Type                                Name                   fastpath_tx_access  fastpath_rx_access  comment
+..struct                            ..inet_connection_sock
+struct_inet_sock                    icsk_inet              read_mostly         read_mostly         tcp_init_buffer_space,tcp_init_transfer,tcp_finish_connect,tcp_connect,tcp_send_rcvq,tcp_send_syn_data
+struct_request_sock_queue           icsk_accept_queue      -                   -
+struct_inet_bind_bucket             icsk_bind_hash         read_mostly         -                   tcp_set_state
+struct_inet_bind2_bucket            icsk_bind2_hash        read_mostly         -                   tcp_set_state,inet_put_port
+unsigned_long                       icsk_timeout           read_mostly         -                   inet_csk_reset_xmit_timer,tcp_connect
+struct_timer_list                   icsk_retransmit_timer  read_mostly         -                   inet_csk_reset_xmit_timer,tcp_connect
+struct_timer_list                   icsk_delack_timer      read_mostly         -                   inet_csk_reset_xmit_timer,tcp_connect
+u32                                 icsk_rto               read_write          -                   tcp_cwnd_validate,tcp_schedule_loss_probe,tcp_connect_init,tcp_connect,tcp_write_xmit,tcp_push_one
+u32                                 icsk_rto_min           -                   -
+u32                                 icsk_delack_max        -                   -
+u32                                 icsk_pmtu_cookie       read_write          -                   tcp_sync_mss,tcp_current_mss,tcp_send_syn_data,tcp_connect_init,tcp_connect
+struct_tcp_congestion_ops           icsk_ca_ops            read_write          -                   tcp_cwnd_validate,tcp_tso_segs,tcp_ca_dst_init,tcp_connect_init,tcp_connect,tcp_write_xmit
+struct_inet_connection_sock_af_ops  icsk_af_ops            read_mostly         -                   tcp_finish_connect,tcp_send_syn_data,tcp_mtup_init,tcp_mtu_check_reprobe,tcp_mtu_probe,tcp_connect_init,tcp_connect,__tcp_transmit_skb
+struct_tcp_ulp_ops*                 icsk_ulp_ops           -                   -
+void*                               icsk_ulp_data          -                   -
+u8:5                                icsk_ca_state          read_write          -                   tcp_cwnd_application_limited,tcp_set_ca_state,tcp_enter_cwr,tcp_tso_should_defer,tcp_mtu_probe,tcp_schedule_loss_probe,tcp_write_xmit,__tcp_transmit_skb
+u8:1                                icsk_ca_initialized    read_write          -                   tcp_init_transfer,tcp_init_congestion_control,tcp_init_transfer,tcp_finish_connect,tcp_connect
+u8:1                                icsk_ca_setsockopt     -                   -
+u8:1                                icsk_ca_dst_locked     write_mostly        -                   tcp_ca_dst_init,tcp_connect_init,tcp_connect
+u8                                  icsk_retransmits       write_mostly        -                   tcp_connect_init,tcp_connect
+u8                                  icsk_pending           read_write          -                   inet_csk_reset_xmit_timer,tcp_connect,tcp_check_probe_timer,__tcp_push_pending_frames,tcp_rearm_rto,tcp_event_new_data_sent,tcp_event_new_data_sent
+u8                                  icsk_backoff           write_mostly        -                   tcp_write_queue_purge,tcp_connect_init
+u8                                  icsk_syn_retries       -                   -
+u8                                  icsk_probes_out        -                   -
+u16                                 icsk_ext_hdr_len       read_mostly         -                   __tcp_mtu_to_mss,tcp_mtu_to_rss,tcp_mtu_probe,tcp_write_xmit,tcp_mtu_to_mss,
+struct_icsk_ack_u8                  pending                read_write          read_write          inet_csk_ack_scheduled,__tcp_cleanup_rbuf,tcp_cleanup_rbuf,inet_csk_clear_xmit_timer,tcp_event_ack-sent,inet_csk_reset_xmit_timer
+struct_icsk_ack_u8                  quick                  read_write          write_mostly        tcp_dec_quickack_mode,tcp_event_ack_sent,__tcp_transmit_skb,__tcp_select_window,__tcp_cleanup_rbuf
+struct_icsk_ack_u8                  pingpong               -                   -
+struct_icsk_ack_u8                  retry                  write_mostly        read_write          inet_csk_clear_xmit_timer,tcp_rearm_rto,tcp_event_new_data_sent,tcp_write_xmit,__tcp_send_ack,tcp_send_ack,
+struct_icsk_ack_u8                  ato                    read_mostly         write_mostly        tcp_dec_quickack_mode,tcp_event_ack_sent,__tcp_transmit_skb,__tcp_send_ack,tcp_send_ack
+struct_icsk_ack_unsigned_long       timeout                read_write          read_write          inet_csk_reset_xmit_timer,tcp_connect
+struct_icsk_ack_u32                 lrcvtime               read_write          -                   tcp_finish_connect,tcp_connect,tcp_event_data_sent,__tcp_transmit_skb
+struct_icsk_ack_u16                 rcv_mss                write_mostly        read_mostly         __tcp_select_window,__tcp_cleanup_rbuf,tcp_initialize_rcv_mss,tcp_connect_init
+struct_icsk_mtup_int                search_high            read_write          -                   tcp_mtup_init,tcp_sync_mss,tcp_connect_init,tcp_mtu_check_reprobe,tcp_write_xmit
+struct_icsk_mtup_int                search_low             read_write          -                   tcp_mtu_probe,tcp_mtu_check_reprobe,tcp_write_xmit,tcp_sync_mss,tcp_connect_init,tcp_mtup_init
+struct_icsk_mtup_u32:31             probe_size             read_write          -                   tcp_mtup_init,tcp_connect_init,__tcp_transmit_skb
+struct_icsk_mtup_u32:1              enabled                read_write          -                   tcp_mtup_init,tcp_sync_mss,tcp_connect_init,tcp_mtu_probe,tcp_write_xmit
+struct_icsk_mtup_u32                probe_timestamp        read_write          -                   tcp_mtup_init,tcp_connect_init,tcp_mtu_check_reprobe,tcp_mtu_probe
+u32                                 icsk_probes_tstamp     -                   -
+u32                                 icsk_user_timeout      -                   -
+u64[104/sizeof(u64)]                icsk_ca_priv           -                   -
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright (C) 2023 Google LLC
+=====================================================
+inet_connection_sock struct fast path usage breakdown
+=====================================================
+
+Type                    Name                  fastpath_tx_access  fastpath_rx_access  comment
+..struct                ..inet_sock
+struct_sock             sk                    read_mostly         read_mostly         tcp_init_buffer_space,tcp_init_transfer,tcp_finish_connect,tcp_connect,tcp_send_rcvq,tcp_send_syn_data
+struct_ipv6_pinfo*      pinet6                -                   -
+be16                    inet_sport            read_mostly         -                   __tcp_transmit_skb
+be32                    inet_daddr            read_mostly         -                   ip_select_ident_segs
+be32                    inet_rcv_saddr        -                   -
+be16                    inet_dport            read_mostly         -                   __tcp_transmit_skb
+u16                     inet_num              -                   -
+be32                    inet_saddr            -                   -
+s16                     uc_ttl                read_mostly         -                   __ip_queue_xmit/ip_select_ttl
+u16                     cmsg_flags            -                   -
+struct_ip_options_rcu*  inet_opt              read_mostly         -                   __ip_queue_xmit
+u16                     inet_id               read_mostly         -                   ip_select_ident_segs
+u8                      tos                   read_mostly         -                   ip_queue_xmit
+u8                      min_ttl               -                   -
+u8                      mc_ttl                -                   -
+u8                      pmtudisc              -                   -
+u8:1                    recverr               -                   -
+u8:1                    is_icsk               -                   -
+u8:1                    freebind              -                   -
+u8:1                    hdrincl               -                   -
+u8:1                    mc_loop               -                   -
+u8:1                    transparent           -                   -
+u8:1                    mc_all                -                   -
+u8:1                    nodefrag              -                   -
+u8:1                    bind_address_no_port  -                   -
+u8:1                    recverr_rfc4884       -                   -
+u8:1                    defer_connect         read_mostly         -                   tcp_sendmsg_fastopen
+u8                      rcv_tos               -                   -
+u8                      convert_csum          -                   -
+int                     uc_index              -                   -
+int                     mc_index              -                   -
+be32                    mc_addr               -                   -
+struct_ip_mc_socklist*  mc_list               -                   -
+struct_inet_cork_full   cork                  read_mostly         -                   __tcp_transmit_skb
+struct                  local_port_range      -                   -
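
The tables above classify each member by how the TX and RX fast paths touch it; the point of the series is to place members with similar access patterns next to each other so that the fast path touches fewer cachelines. The following userspace C11 sketch illustrates that idea only. `struct demo_sock`, `DEMO_CACHELINE`, `cachelines_touched` and `demo_hot_tx_cachelines` are hypothetical names, the 64-byte cacheline size is an assumption, and the kernel series itself marks groups with its own `__cache_group` marker variables rather than the plain `_Alignas` used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHELINE 64 /* assumed cacheline size in bytes */

/* Hypothetical struct for illustration; not a kernel definition. */
struct demo_sock {
	/* Members touched on the TX fast path, kept together. */
	_Alignas(DEMO_CACHELINE) uint32_t rto;
	uint32_t pmtu_cookie;
	const void *ca_ops;
	uint16_t ext_hdr_len;
	uint8_t pending;

	/* Members not touched on the fast path start on a new cacheline. */
	_Alignas(DEMO_CACHELINE) uint32_t rto_min;
	uint32_t delack_max;
	uint8_t syn_retries;
};

/* Compile-time check that the hot TX group fits in one cacheline. */
_Static_assert(offsetof(struct demo_sock, pending) + sizeof(uint8_t)
	       <= DEMO_CACHELINE,
	       "hot TX group must fit in a single cacheline");

/*
 * Number of cachelines covered by the byte range [first, end),
 * where both values are offsets from a cacheline-aligned base.
 */
static size_t cachelines_touched(size_t first, size_t end)
{
	assert(end > first);
	return (end - 1) / DEMO_CACHELINE - first / DEMO_CACHELINE + 1;
}

/* Cachelines spanned by the hot TX group of struct demo_sock. */
static size_t demo_hot_tx_cachelines(void)
{
	return cachelines_touched(offsetof(struct demo_sock, rto),
				  offsetof(struct demo_sock, pending) +
				  sizeof(uint8_t));
}
```

With this layout `demo_hot_tx_cachelines()` returns 1, whereas the same members scattered among cold ones could straddle several lines. The compile-time assertion is there so that a later change which grows the hot group past one cacheline fails the build instead of silently regressing.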
