Commit 70e71ca

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) New offloading infrastructure and example 'rocker' driver for
    offloading of switching and routing to hardware.

    This work was done by a large group of dedicated individuals, not
    limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
    Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu

 2) Start making the networking operate on IOV iterators instead of
    modifying iov objects in-situ during transfers. Thanks to Al Viro
    and Herbert Xu.

 3) A set of new netlink interfaces for the TIPC stack, from Richard
    Alpe.

 4) Remove unnecessary looping during ipv6 routing lookups, from Martin
    KaFai Lau.

 5) Add PAUSE frame generation support to gianfar driver, from Matei
    Pavaluca.

 6) Allow for larger reordering levels in TCP, which are easily
    achievable in the real world right now, from Eric Dumazet.

 7) Add a variable of napi_schedule that doesn't need to disable cpu
    interrupts, from Eric Dumazet.

 8) Use a doubly linked list to optimize neigh_parms_release(), from
    Nicolas Dichtel.

 9) Various enhancements to the kernel BPF verifier, and allow eBPF
    programs to actually be attached to sockets. From Alexei
    Starovoitov.

10) Support TSO/LSO in sunvnet driver, from David L Stevens.

11) Allow controlling ECN usage via routing metrics, from Florian
    Westphal.

12) Remote checksum offload, from Tom Herbert.

13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
    driver, from Thomas Lendacky.

14) Add MPLS support to openvswitch, from Simon Horman.

15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
    Klassert.

16) Do gro flushes on a per-device basis using a timer, from Eric
    Dumazet. This tries to resolve the conflicting goals between the
    desired handling of bulk vs. RPC-like traffic.

17) Allow userspace to ask for the CPU upon what a packet was
    received/steered, via SO_INCOMING_CPU. From Eric Dumazet.

18) Limit GSO packets to half the current congestion window, from Eric
    Dumazet.

19) Add a generic helper so that all drivers set their RSS keys in a
    consistent way, from Eric Dumazet.

20) Add xmit_more support to enic driver, from Govindarajulu
    Varadarajan.

21) Add VLAN packet scheduler action, from Jiri Pirko.

22) Support configurable RSS hash functions via ethtool, from Eyal
    Perry.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
  Fix race condition between vxlan_sock_add and vxlan_sock_release
  net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
  net/mlx4: Add support for A0 steering
  net/mlx4: Refactor QUERY_PORT
  net/mlx4_core: Add explicit error message when rule doesn't meet configuration
  net/mlx4: Add A0 hybrid steering
  net/mlx4: Add mlx4_bitmap zone allocator
  net/mlx4: Add a check if there are too many reserved QPs
  net/mlx4: Change QP allocation scheme
  net/mlx4_core: Use tasklet for user-space CQ completion events
  net/mlx4_core: Mask out host side virtualization features for guests
  net/mlx4_en: Set csum level for encapsulated packets
  be2net: Export tunnel offloads only when a VxLAN tunnel is created
  gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
  cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
  net: fec: only enable mdio interrupt before phy device link up
  net: fec: clear all interrupt events to support i.MX6SX
  net: fec: reset fep link status in suspend function
  net: sock: fix access via invalid file descriptor
  net: introduce helper macro for_each_cmsghdr
  ...
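Item 17 mentions the new SO_INCOMING_CPU socket option. A minimal userspace sketch of querying it follows. The numeric fallback 49 is an assumption (the asm-generic value; some architectures use a different number), and the helper name `incoming_cpu` is hypothetical.

```python
import socket
from typing import Optional

# Newer Python versions export the constant on Linux; 49 is the
# asm-generic value and is only an assumption for other builds.
SO_INCOMING_CPU = getattr(socket, "SO_INCOMING_CPU", 49)


def incoming_cpu(sock: socket.socket) -> Optional[int]:
    """Return the CPU reported by SO_INCOMING_CPU, or None if unsupported."""
    try:
        return sock.getsockopt(socket.SOL_SOCKET, SO_INCOMING_CPU)
    except OSError:
        # Kernel or platform does not implement the option.
        return None
```

A server could call this on an accepted or connected socket to decide which worker thread should handle the flow.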
2 parents bae41e4 + 00c83b0 commit 70e71ca

1,336 files changed

+70846
-29177
lines changed

Documentation/ABI/testing/sysfs-class-net

Lines changed: 8 additions & 0 deletions
@@ -216,3 +216,11 @@ Contact: [email protected]
 Description:
 Indicates the interface protocol type as a decimal value. See
 include/uapi/linux/if_arp.h for all possible values.
+
+What: /sys/class/net/<iface>/phys_switch_id
+Date: November 2014
+KernelVersion: 3.19
+
+Description:
+Indicates the unique physical switch identifier of a switch this
+port belongs to, as a string.
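The new attribute can be read like any other sysfs file. The sketch below is illustrative only: the helper name `read_phys_switch_id` and the `sysfs_root` parameter are hypothetical, and it assumes that a missing or unsupported attribute surfaces as an `OSError` on read.

```python
from pathlib import Path
from typing import Optional


def read_phys_switch_id(iface: str, sysfs_root: str = "/sys/class/net") -> Optional[str]:
    """Return the phys_switch_id string for iface, or None if unavailable."""
    attr = Path(sysfs_root) / iface / "phys_switch_id"
    try:
        value = attr.read_text().strip()
    except OSError:
        # Attribute absent (older kernel) or the port is not part of a switch.
        return None
    return value or None
```

Two ports that return the same non-empty string would belong to the same physical switch.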

Documentation/Changes

Lines changed: 1 addition & 1 deletion
@@ -383,7 +383,7 @@ o <http://www.iptables.org/downloads.html>
 
 Ip-route2
 ---------
-o <ftp://ftp.tux.org/pub/net/ip-routing/iproute2-2.2.4-now-ss991023.tar.gz>
+o <https://www.kernel.org/pub/linux/utils/net/iproute2/>
 
 OProfile
 --------
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+btmrvl
+------
+
+Required properties:
+
+- compatible : must be "btmrvl,cfgdata"
+
+Optional properties:
+
+- btmrvl,cal-data : Calibration data downloaded to the device during
+initialization. This is an array of 28 values(u8).
+
+- btmrvl,gpio-gap : gpio and gap (in msecs) combination to be
+configured.
+
+Example:
+
+GPIO pin 13 is configured as a wakeup source and GAP is set to 100 msecs
+in below example.
+
+btmrvl {
+compatible = "btmrvl,cfgdata";
+
+btmrvl,cal-data = /bits/ 8 <
+0x37 0x01 0x1c 0x00 0xff 0xff 0xff 0xff 0x01 0x7f 0x04 0x02
+0x00 0x00 0xba 0xce 0xc0 0xc6 0x2d 0x00 0x00 0x00 0x00 0x00
+0x00 0x00 0xf0 0x00>;
+btmrvl,gpio-gap = <0x0d64>;
+};
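The binding's example value 0x0d64 can be checked by hand against the stated "GPIO pin 13, GAP 100 msecs". The sketch below assumes the upper byte carries the GPIO number and the lower byte the gap in milliseconds. That layout is inferred from the example, not stated in the binding text, and `split_gpio_gap` is a hypothetical helper.

```python
from typing import Tuple


def split_gpio_gap(value: int) -> Tuple[int, int]:
    """Split a btmrvl,gpio-gap cell into (gpio, gap_ms).

    Assumed layout: bits 15..8 = GPIO pin, bits 7..0 = gap in msecs.
    """
    if not 0 <= value <= 0xFFFF:
        raise ValueError("gpio-gap is expected to fit in 16 bits")
    return (value >> 8) & 0xFF, value & 0xFF


gpio, gap_ms = split_gpio_gap(0x0D64)
print(gpio, gap_ms)  # 13 100
```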

Documentation/devicetree/bindings/bus/bcma.txt

Lines changed: 21 additions & 0 deletions
@@ -8,6 +8,11 @@ Required properties:
 
 The cores on the AXI bus are automatically detected by bcma with the
 memory ranges they are using and they get registered afterwards.
+Automatic detection of the IRQ number is not working on
+BCM47xx/BCM53xx ARM SoCs. To assign IRQ numbers to the cores, provide
+them manually through device tree. Use an interrupt-map to specify the
+IRQ used by the devices on the bus. The first address is just an index,
+because we do not have any special register.
 
 The top-level axi bus may contain children representing attached cores
 (devices). This is needed since some hardware details can't be auto
@@ -22,6 +27,22 @@ Example:
 ranges = <0x00000000 0x18000000 0x00100000>;
 #address-cells = <1>;
 #size-cells = <1>;
+#interrupt-cells = <1>;
+interrupt-map-mask = <0x000fffff 0xffff>;
+interrupt-map =
+/* Ethernet Controller 0 */
+<0x00024000 0 &gic GIC_SPI 147 IRQ_TYPE_LEVEL_HIGH>,
+
+/* Ethernet Controller 1 */
+<0x00025000 0 &gic GIC_SPI 148 IRQ_TYPE_LEVEL_HIGH>;
+
+/* PCIe Controller 0 */
+<0x00012000 0 &gic GIC_SPI 126 IRQ_TYPE_LEVEL_HIGH>,
+<0x00012000 1 &gic GIC_SPI 127 IRQ_TYPE_LEVEL_HIGH>,
+<0x00012000 2 &gic GIC_SPI 128 IRQ_TYPE_LEVEL_HIGH>,
+<0x00012000 3 &gic GIC_SPI 129 IRQ_TYPE_LEVEL_HIGH>,
+<0x00012000 4 &gic GIC_SPI 130 IRQ_TYPE_LEVEL_HIGH>,
+<0x00012000 5 &gic GIC_SPI 131 IRQ_TYPE_LEVEL_HIGH>;
 
 chipcommon {
 reg = <0x00000000 0x1000>;
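To illustrate how the interrupt-map in this example resolves a core to an IRQ, here is a simplified model of the lookup: mask the core's address and interrupt index with interrupt-map-mask, then compare against each entry. It is a sketch, not the kernel's parser; it ignores cell counts and the parent specifier format, and it assumes all eight listed entries were meant to form one map (as committed, the property is terminated by the `;` after the Ethernet Controller 1 entry). `lookup_spi` is a hypothetical name.

```python
from typing import Optional

# Values copied from the example: (address mask, interrupt index mask).
MAP_MASK = (0x000FFFFF, 0xFFFF)

# (child address, child interrupt index, GIC SPI number)
INTERRUPT_MAP = [
    (0x00024000, 0, 147),  # Ethernet Controller 0
    (0x00025000, 0, 148),  # Ethernet Controller 1
    (0x00012000, 0, 126),  # PCIe Controller 0
    (0x00012000, 1, 127),
    (0x00012000, 2, 128),
    (0x00012000, 3, 129),
    (0x00012000, 4, 130),
    (0x00012000, 5, 131),
]


def lookup_spi(address: int, irq_index: int) -> Optional[int]:
    """Return the SPI number for a core, or None if no map entry matches."""
    key = (address & MAP_MASK[0], irq_index & MAP_MASK[1])
    for entry_addr, entry_index, spi in INTERRUPT_MAP:
        if (entry_addr & MAP_MASK[0], entry_index & MAP_MASK[1]) == key:
            return spi
    return None
```

Because of the 0x000fffff mask, both the bus-relative address 0x00012000 and the CPU address 0x18012000 match the PCIe entries.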

Documentation/devicetree/bindings/net/amd-xgbe.txt

Lines changed: 10 additions & 2 deletions
@@ -7,7 +7,10 @@ Required properties:
 - PCS registers
 - interrupt-parent: Should be the phandle for the interrupt controller
 that services interrupts for this device
-- interrupts: Should contain the amd-xgbe interrupt
+- interrupts: Should contain the amd-xgbe interrupt(s). The first interrupt
+listed is required and is the general device interrupt. If the optional
+amd,per-channel-interrupt property is specified, then one additional
+interrupt for each DMA channel supported by the device should be specified
 - clocks:
 - DMA clock for the amd-xgbe device (used for calculating the
 correct Rx interrupt watchdog timer value on a DMA channel
@@ -23,14 +26,19 @@ Optional properties:
 - mac-address: mac address to be assigned to the device. Can be overridden
 by UEFI.
 - dma-coherent: Present if dma operations are coherent
+- amd,per-channel-interrupt: Indicates that Rx and Tx complete will generate
+a unique interrupt for each DMA channel - this requires an additional
+interrupt be configured for each DMA channel
 
 Example:
 xgbe@e0700000 {
 compatible = "amd,xgbe-seattle-v1a";
 reg = <0 0xe0700000 0 0x80000>,
 <0 0xe0780000 0 0x80000>;
 interrupt-parent = <&gic>;
-interrupts = <0 325 4>;
+interrupts = <0 325 4>,
+<0 326 1>, <0 327 1>, <0 328 1>, <0 329 1>;
+amd,per-channel-interrupt;
 clocks = <&xgbe_dma_clk>, <&xgbe_ptp_clk>;
 clock-names = "dma_clk", "ptp_clk";
 phy-handle = <&phy>;

Documentation/devicetree/bindings/net/can/c_can.txt

Lines changed: 5 additions & 0 deletions
@@ -4,6 +4,8 @@ Bosch C_CAN/D_CAN controller Device Tree Bindings
 Required properties:
 - compatible : Should be "bosch,c_can" for C_CAN controllers and
 "bosch,d_can" for D_CAN controllers.
+Can be "ti,dra7-d_can", "ti,am3352-d_can" or
+"ti,am4372-d_can".
 - reg : physical base address and size of the C_CAN/D_CAN
 registers map
 - interrupts : property with a value describing the interrupt
@@ -12,6 +14,9 @@ Required properties:
 Optional properties:
 - ti,hwmods : Must be "d_can<n>" or "c_can<n>", n being the
 instance number
+- syscon-raminit : Handle to system control region that contains the
+RAMINIT register, register offset to the RAMINIT
+register and the CAN instance number (0 offset).
 
 Note: "ti,hwmods" field is used to fetch the base address and irq
 resources from TI, omap hwmod data base during device registration.

Documentation/devicetree/bindings/net/dsa/dsa.txt

Lines changed: 8 additions & 1 deletion
@@ -10,7 +10,7 @@ Required properties:
 - dsa,ethernet : Should be a phandle to a valid Ethernet device node
 - dsa,mii-bus : Should be a phandle to a valid MDIO bus device node
 
-Optionnal properties:
+Optional properties:
 - interrupts : property with a value describing the switch
 interrupt number (not supported by the driver)
 
@@ -23,6 +23,13 @@ Each of these switch child nodes should have the following required properties:
 - #address-cells : Must be 1
 - #size-cells : Must be 0
 
+A switch child node has the following optional property:
+
+- eeprom-length : Set to the length of an EEPROM connected to the
+switch. Must be set if the switch can not detect
+the presence and/or size of a connected EEPROM,
+otherwise optional.
+
 A switch may have multiple "port" children nodes
 
 Each port children node must have the following mandatory properties:

Documentation/devicetree/bindings/net/micrel.txt

Lines changed: 24 additions & 11 deletions
@@ -6,19 +6,32 @@ Optional properties:
 
 - micrel,led-mode : LED mode value to set for PHYs with configurable LEDs.
 
-Configure the LED mode with single value. The list of PHYs and
-the bits that are currently supported:
+Configure the LED mode with single value. The list of PHYs and the
+bits that are currently supported:
 
-KSZ8001: register 0x1e, bits 15..14
-KSZ8041: register 0x1e, bits 15..14
-KSZ8021: register 0x1f, bits 5..4
-KSZ8031: register 0x1f, bits 5..4
-KSZ8051: register 0x1f, bits 5..4
+KSZ8001: register 0x1e, bits 15..14
+KSZ8041: register 0x1e, bits 15..14
+KSZ8021: register 0x1f, bits 5..4
+KSZ8031: register 0x1f, bits 5..4
+KSZ8051: register 0x1f, bits 5..4
+KSZ8081: register 0x1f, bits 5..4
+KSZ8091: register 0x1f, bits 5..4
 
-See the respective PHY datasheet for the mode values.
+See the respective PHY datasheet for the mode values.
+
+- micrel,rmii-reference-clock-select-25-mhz: RMII Reference Clock Select
+bit selects 25 MHz mode
+
+Setting the RMII Reference Clock Select bit enables 25 MHz rather
+than 50 MHz clock mode.
+
+Note that this option in only needed for certain PHY revisions with a
+non-standard, inverted function of this configuration bit.
+Specifically, a clock reference ("rmii-ref" below) is always needed to
+actually select a mode.
 
 - clocks, clock-names: contains clocks according to the common clock bindings.
 
-supported clocks:
-- KSZ8021, KSZ8031: "rmii-ref": The RMII refence input clock. Used
-to determine the XI input clock.
+supported clocks:
+- KSZ8021, KSZ8031, KSZ8081, KSZ8091: "rmii-ref": The RMII reference
+input clock. Used to determine the XI input clock.
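The register and bit positions listed for micrel,led-mode translate into a simple read-modify-write of a two-bit field. The sketch below only models the arithmetic. The table is copied from the binding, while `apply_led_mode` is a hypothetical helper and not the driver's code. The valid mode values themselves come from each PHY's datasheet.

```python
from typing import Dict, Tuple

# PHY model -> (register address, bit position of the field's low bit)
LED_MODE_FIELDS: Dict[str, Tuple[int, int]] = {
    "KSZ8001": (0x1E, 14),
    "KSZ8041": (0x1E, 14),
    "KSZ8021": (0x1F, 4),
    "KSZ8031": (0x1F, 4),
    "KSZ8051": (0x1F, 4),
    "KSZ8081": (0x1F, 4),
    "KSZ8091": (0x1F, 4),
}


def apply_led_mode(phy: str, reg_value: int, mode: int) -> Tuple[int, int]:
    """Return (register, new 16-bit value) with the 2-bit LED mode replaced."""
    if not 0 <= mode <= 3:
        raise ValueError("LED mode must fit in two bits")
    reg, shift = LED_MODE_FIELDS[phy]
    cleared = reg_value & ~(0x3 << shift) & 0xFFFF  # clear the old field
    return reg, cleared | (mode << shift)
```

For example, mode 1 on a KSZ8041 sets bit 14 of register 0x1e and leaves every other bit untouched.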

Documentation/devicetree/bindings/net/phy.txt

Lines changed: 2 additions & 1 deletion
@@ -19,7 +19,6 @@ Optional Properties:
 specifications. If neither of these are specified, the default is to
 assume clause 22. The compatible list may also contain other
 elements.
-- max-speed: Maximum PHY supported speed (10, 100, 1000...)
 
 If the phy's identifier is known then the list may contain an entry
 of the form: "ethernet-phy-idAAAA.BBBB" where
@@ -29,6 +28,8 @@ Optional Properties:
 4 hex digits. This is the chip vendor OUI bits 19:24,
 followed by 10 bits of a vendor specific ID.
 
+- max-speed: Maximum PHY supported speed (10, 100, 1000...)
+
 Example:
 
 ethernet-phy@0 {

Documentation/devicetree/bindings/net/sh_eth.txt

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ Required properties:
 "renesas,ether-r8a7779" if the device is a part of R8A7779 SoC.
 "renesas,ether-r8a7790" if the device is a part of R8A7790 SoC.
 "renesas,ether-r8a7791" if the device is a part of R8A7791 SoC.
+"renesas,ether-r8a7793" if the device is a part of R8A7793 SoC.
 "renesas,ether-r8a7794" if the device is a part of R8A7794 SoC.
 "renesas,ether-r7s72100" if the device is a part of R7S72100 SoC.
 - reg: offset and length of (1) the E-DMAC/feLic register block (required),

Documentation/networking/bonding.txt

Lines changed: 2 additions & 5 deletions
@@ -2230,11 +2230,8 @@ balance-rr: This mode is the only mode that will permit a single
 
 It is possible to adjust TCP/IP's congestion limits by
 altering the net.ipv4.tcp_reordering sysctl parameter. The
-usual default value is 3, and the maximum useful value is 127.
-For a four interface balance-rr bond, expect that a single
-TCP/IP stream will utilize no more than approximately 2.3
-interface's worth of throughput, even after adjusting
-tcp_reordering.
+usual default value is 3. But keep in mind TCP stack is able
+to automatically increase this when it detects reorders.
 
 Note that the fraction of packets that will be delivered out of
 order is highly variable, and is unlikely to be zero. The level

Documentation/networking/ip-sysctl.txt

Lines changed: 22 additions & 1 deletion
@@ -383,9 +383,17 @@ tcp_orphan_retries - INTEGER
 may consume significant resources. Cf. tcp_max_orphans.
 
 tcp_reordering - INTEGER
-Maximal reordering of packets in a TCP stream.
+Initial reordering level of packets in a TCP stream.
+TCP stack can then dynamically adjust flow reordering level
+between this initial value and tcp_max_reordering
 Default: 3
 
+tcp_max_reordering - INTEGER
+Maximal reordering level of packets in a TCP stream.
+300 is a fairly conservative value, but you might increase it
+if paths are using per packet load balancing (like bonding rr mode)
+Default: 300
+
 tcp_retrans_collapse - BOOLEAN
 Bug-to-bug compatibility with some broken printers.
 On retransmit try to send bigger packets to work around bugs in
@@ -1466,6 +1474,19 @@ suppress_frag_ndisc - INTEGER
 1 - (default) discard fragmented neighbor discovery packets
 0 - allow fragmented neighbor discovery packets
 
+optimistic_dad - BOOLEAN
+Whether to perform Optimistic Duplicate Address Detection (RFC 4429).
+0: disabled (default)
+1: enabled
+
+use_optimistic - BOOLEAN
+If enabled, do not classify optimistic addresses as deprecated during
+source address selection. Preferred addresses will still be chosen
+before optimistic addresses, subject to other ranking in the source
+address selection algorithm.
+0: disabled (default)
+1: enabled
+
 icmp/*:
 ratelimit - INTEGER
 Limit the maximal rates for sending ICMPv6 packets.
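The tcp_reordering and tcp_max_reordering entries above describe a lower and upper bound for the per-flow reordering level. A small sketch for inspecting both values on a running system follows. It assumes the usual /proc/sys/net/ipv4 location for these sysctls, and `read_tcp_reordering_limits` is a hypothetical helper. On kernels that predate this change, tcp_max_reordering will simply be reported as None.

```python
from pathlib import Path
from typing import Dict, Optional


def read_tcp_reordering_limits(
    proc_root: str = "/proc/sys/net/ipv4",
) -> Dict[str, Optional[int]]:
    """Read the initial and maximal TCP reordering levels, if present."""
    limits: Dict[str, Optional[int]] = {}
    for name in ("tcp_reordering", "tcp_max_reordering"):
        try:
            limits[name] = int((Path(proc_root) / name).read_text().split()[0])
        except (OSError, ValueError, IndexError):
            # Missing file (older kernel) or unexpected content.
            limits[name] = None
    return limits
```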

Documentation/networking/ipvlan.txt

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
+
+IPVLAN Driver HOWTO
+
+Initial Release:
+Mahesh Bandewar <maheshb AT google.com>
+
+1. Introduction:
+This is conceptually very similar to the macvlan driver with one major
+exception of using L3 for mux-ing /demux-ing among slaves. This property makes
+the master device share the L2 with it's slave devices. I have developed this
+driver in conjuntion with network namespaces and not sure if there is use case
+outside of it.
+
+
+2. Building and Installation:
+In order to build the driver, please select the config item CONFIG_IPVLAN.
+The driver can be built into the kernel (CONFIG_IPVLAN=y) or as a module
+(CONFIG_IPVLAN=m).
+
+
+3. Configuration:
+There are no module parameters for this driver and it can be configured
+using IProute2/ip utility.
+
+ip link add link <master-dev> <slave-dev> type ipvlan mode { l2 | L3 }
+
+e.g. ip link add link ipvl0 eth0 type ipvlan mode l2
+
+
+4. Operating modes:
+IPvlan has two modes of operation - L2 and L3. For a given master device,
+you can select one of these two modes and all slaves on that master will
+operate in the same (selected) mode. The RX mode is almost identical except
+that in L3 mode the slaves wont receive any multicast / broadcast traffic.
+L3 mode is more restrictive since routing is controlled from the other (mostly)
+default namespace.
+
+4.1 L2 mode:
+In this mode TX processing happens on the stack instance attached to the
+slave device and packets are switched and queued to the master device to send
+out. In this mode the slaves will RX/TX multicast and broadcast (if applicable)
+as well.
+
+4.2 L3 mode:
+In this mode TX processing upto L3 happens on the stack instance attached
+to the slave device and packets are switched to the stack instance of the
+master device for the L2 processing and routing from that instance will be
+used before packets are queued on the outbound device. In this mode the slaves
+will not receive nor can send multicast / broadcast traffic.
+
+
+5. What to choose (macvlan vs. ipvlan)?
+These two devices are very similar in many regards and the specific use
+case could very well define which device to choose. if one of the following
+situations defines your use case then you can choose to use ipvlan -
+(a) The Linux host that is connected to the external switch / router has
+policy configured that allows only one mac per port.
+(b) No of virtual devices created on a master exceed the mac capacity and
+puts the NIC in promiscous mode and degraded performance is a concern.
+(c) If the slave device is to be put into the hostile / untrusted network
+namespace where L2 on the slave could be changed / misused.
+
+
+6. Example configuration:
+
++=============================================================+
+|  Host: host1                                                |
+|                                                             |
+|   +----------------------+      +----------------------+    |
+|   |   NS:ns0             |      |  NS:ns1              |    |
+|   |                      |      |                      |    |
+|   |                      |      |                      |    |
+|   |        ipvl0         |      |         ipvl1        |    |
+|   +----------#-----------+      +-----------#----------+    |
+|              #                              #               |
+|              ################################               |
+|                              # eth0                         |
++==============================#==============================+
+
+
+(a) Create two network namespaces - ns0, ns1
+ip netns add ns0
+ip netns add ns1
+
+(b) Create two ipvlan slaves on eth0 (master device)
+ip link add link eth0 ipvl0 type ipvlan mode l2
+ip link add link eth0 ipvl1 type ipvlan mode l2
+
+(c) Assign slaves to the respective network namespaces
+ip link set dev ipvl0 netns ns0
+ip link set dev ipvl1 netns ns1
+
+(d) Now switch to the namespace (ns0 or ns1) to configure the slave devices
+- For ns0
+(1) ip netns exec ns0 bash
+(2) ip link set dev ipvl0 up
+(3) ip link set dev lo up
+(4) ip -4 addr add 127.0.0.1 dev lo
+(5) ip -4 addr add $IPADDR dev ipvl0
+(6) ip -4 route add default via $ROUTER dev ipvl0
+- For ns1
+(1) ip netns exec ns1 bash
+(2) ip link set dev ipvl1 up
+(3) ip link set dev lo up
+(4) ip -4 addr add 127.0.0.1 dev lo
+(5) ip -4 addr add $IPADDR dev ipvl1
+(6) ip -4 route add default via $ROUTER dev ipvl1
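Steps (a) through (d) of the HOWTO's example are the same for every namespace, so they can be generated rather than typed. The sketch below only builds the argument lists and does not run anything. It follows the `<master-dev> <slave-dev>` order used in step (b), and it prefixes each in-namespace command with `ip netns exec <ns>` instead of opening an interactive shell as the HOWTO does. `ipvlan_setup_commands` is a hypothetical helper, and the addresses in the usage line are placeholders standing in for $IPADDR and $ROUTER.

```python
from typing import List


def ipvlan_setup_commands(
    master: str, slave: str, netns: str, ipaddr: str, router: str, mode: str = "l2"
) -> List[List[str]]:
    """Build the ip(8) invocations for one ipvlan slave in its own namespace."""
    in_ns = ["ip", "netns", "exec", netns]
    return [
        ["ip", "netns", "add", netns],                                      # (a)
        ["ip", "link", "add", "link", master, slave,
         "type", "ipvlan", "mode", mode],                                   # (b)
        ["ip", "link", "set", "dev", slave, "netns", netns],                # (c)
        in_ns + ["ip", "link", "set", "dev", slave, "up"],                  # (d)(2)
        in_ns + ["ip", "link", "set", "dev", "lo", "up"],                   # (d)(3)
        in_ns + ["ip", "-4", "addr", "add", "127.0.0.1", "dev", "lo"],      # (d)(4)
        in_ns + ["ip", "-4", "addr", "add", ipaddr, "dev", slave],          # (d)(5)
        in_ns + ["ip", "-4", "route", "add", "default",
                 "via", router, "dev", slave],                              # (d)(6)
    ]


for cmd in ipvlan_setup_commands("eth0", "ipvl0", "ns0", "192.0.2.10/24", "192.0.2.1"):
    print(" ".join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` as root once the output has been reviewed.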

Documentation/networking/ixgbe.txt

Lines changed: 1 addition & 1 deletion
@@ -138,7 +138,7 @@ Other ethtool Commands:
 To enable Flow Director
 ethtool -K ethX ntuple on
 To add a filter
-Use -U switch. e.g., ethtool -U ethX flow-type tcp4 src-ip 0x178000a
+Use -U switch. e.g., ethtool -U ethX flow-type tcp4 src-ip 10.0.128.23
 action 1
 To see the list of filters currently present:
 ethtool -u ethX
