Commit 0b215b9

Eric Dumazet authored and davem330 committed
ipv6: gro: do not use slow memcmp() in ipv6_gro_receive()
ipv6_gro_receive() compares 34 bytes using slow memcmp(), while handcoding with a couple of ipv6_addr_equal() is much faster.

Before this patch, "perf top -e cycles:pp -C <cpu>" would see memcmp() using ~10% of cpu cycles on a 40Gbit NIC receiving IPv6 TCP traffic.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
1 parent 5e1abdc commit 0b215b9

1 file changed: +10 -3 lines changed

net/ipv6/ip6_offload.c

Lines changed: 10 additions & 3 deletions
@@ -229,14 +229,21 @@ static struct sk_buff *ipv6_gro_receive(struct list_head *head,
 		 * XXX skbs on the gro_list have all been parsed and pulled
 		 * already so we don't need to compare nlen
 		 * (nlen != (sizeof(*iph2) + ipv6_exthdrs_len(iph2, &ops)))
-		 * memcmp() alone below is suffcient, right?
+		 * memcmp() alone below is sufficient, right?
 		 */
 		if ((first_word & htonl(0xF00FFFFF)) ||
-		    memcmp(&iph->nexthdr, &iph2->nexthdr,
-			   nlen - offsetof(struct ipv6hdr, nexthdr))) {
+		    !ipv6_addr_equal(&iph->saddr, &iph2->saddr) ||
+		    !ipv6_addr_equal(&iph->daddr, &iph2->daddr) ||
+		    *(u16 *)&iph->nexthdr != *(u16 *)&iph2->nexthdr) {
+not_same_flow:
 			NAPI_GRO_CB(p)->same_flow = 0;
 			continue;
 		}
+		if (unlikely(nlen > sizeof(struct ipv6hdr))) {
+			if (memcmp(iph + 1, iph2 + 1,
+				   nlen - sizeof(struct ipv6hdr)))
+				goto not_same_flow;
+		}
 		/* flush if Traffic Class fields are different */
 		NAPI_GRO_CB(p)->flush |= !!(first_word & htonl(0x0FF00000));
 		NAPI_GRO_CB(p)->flush |= flush;
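
The shape of the change can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: every `demo_*` name is hypothetical, the struct only mirrors the layout of the fixed 40-byte IPv6 header, and the word-wise address compare is an assumption about how a fast 16-byte equality test can be written. It contrasts one memcmp() over the 34 bytes from nexthdr to the end of daddr with explicit field comparisons, and falls back to memcmp() only for bytes beyond the fixed header. The first_word check from the patch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace mirror of the fixed IPv6 header layout. */
struct demo_ipv6hdr {
	uint32_t first_word;	/* version, traffic class, flow label */
	uint16_t payload_len;
	uint8_t  nexthdr;
	uint8_t  hop_limit;
	uint8_t  saddr[16];
	uint8_t  daddr[16];
};

static_assert(sizeof(struct demo_ipv6hdr) == 40, "no padding expected");

/* Assumed fast path: compare a 16-byte address as two 64-bit words. */
static bool demo_addr_equal(const uint8_t *a, const uint8_t *b)
{
	uint64_t a0, a1, b0, b1;

	memcpy(&a0, a, 8);
	memcpy(&a1, a + 8, 8);
	memcpy(&b0, b, 8);
	memcpy(&b1, b + 8, 8);
	return ((a0 ^ b0) | (a1 ^ b1)) == 0;
}

/* Old shape: a single memcmp() from nexthdr to the end of the header. */
static bool demo_same_flow_memcmp(const struct demo_ipv6hdr *x,
				  const struct demo_ipv6hdr *y)
{
	size_t off = offsetof(struct demo_ipv6hdr, nexthdr);

	return memcmp((const unsigned char *)x + off,
		      (const unsigned char *)y + off,
		      sizeof(*x) - off) == 0;
}

/* New shape: explicit comparisons of the same 34 bytes. */
static bool demo_same_flow_fields(const struct demo_ipv6hdr *x,
				  const struct demo_ipv6hdr *y)
{
	return demo_addr_equal(x->saddr, y->saddr) &&
	       demo_addr_equal(x->daddr, y->daddr) &&
	       x->nexthdr == y->nexthdr &&
	       x->hop_limit == y->hop_limit;
}

/*
 * Full check: fixed fields first, then memcmp() only over the bytes
 * that follow the fixed header. The caller must guarantee that
 * nlen >= sizeof(struct demo_ipv6hdr) and that nlen bytes are readable
 * starting at x and at y.
 */
static bool demo_same_flow(const struct demo_ipv6hdr *x,
			   const struct demo_ipv6hdr *y, size_t nlen)
{
	if (!demo_same_flow_fields(x, y))
		return false;
	if (nlen > sizeof(*x))
		return memcmp(x + 1, y + 1, nlen - sizeof(*x)) == 0;
	return true;
}
```

For packets without extension headers, `nlen` equals the fixed header size and `demo_same_flow()` never calls memcmp(), which is the case the commit message targets. Both variants ignore `payload_len`, so two headers that differ only in that field still compare as the same flow.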
