Commit 0773e3a

vladimiroltean authored and Paolo Abeni committed
docs: net: dsa: update information about multiple CPU ports
DSA now supports multiple CPU ports, explain the use cases that are covered, the new UAPI, the permitted degrees of freedom, the driver API, and remove some old "hanging fruits".

Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
1 parent acc43b7 commit 0773e3a

2 files changed: +128 -6 lines changed

Documentation/networking/dsa/configuration.rst

Lines changed: 96 additions & 0 deletions
@@ -49,6 +49,9 @@ In this documentation the following Ethernet interfaces are used:
 *eth0*
   the master interface
 
+*eth1*
+  another master interface
+
 *lan1*
   a slave interface
 
@@ -360,3 +363,96 @@ the ``self`` flag) has been removed. This results in the following changes:
 
 Script writers are therefore encouraged to use the ``master static`` set of
 flags when working with bridge FDB entries on DSA switch interfaces.
+
+Affinity of user ports to CPU ports
+-----------------------------------
+
+Typically, DSA switches are attached to the host via a single Ethernet
+interface, but in cases where the switch chip is discrete, the hardware design
+may permit the use of 2 or more ports connected to the host, for an increase in
+termination throughput.
+
+DSA can make use of multiple CPU ports in two ways. First, it is possible to
+statically assign the termination traffic associated with a certain user port
+to be processed by a certain CPU port. This way, user space can implement
+custom policies of static load balancing between user ports, by spreading the
+affinities according to the available CPU ports.
+
+Secondly, it is possible to perform load balancing between CPU ports on a per
+packet basis, rather than statically assigning user ports to CPU ports.
+This can be achieved by placing the DSA masters under a LAG interface (bonding
+or team). DSA monitors this operation and creates a mirror of this software LAG
+on the CPU ports facing the physical DSA masters that constitute the LAG slave
+devices.
+
+To make use of multiple CPU ports, the firmware (device tree) description of
+the switch must mark all the links between CPU ports and their DSA masters
+using the ``ethernet`` reference/phandle. At startup, only a single CPU port
+and DSA master will be used - the numerically first port from the firmware
+description which has an ``ethernet`` property. It is up to the user to
+configure the system for the switch to use other masters.
+
+DSA uses the ``rtnl_link_ops`` mechanism (with a "dsa" ``kind``) to allow
+changing the DSA master of a user port. The ``IFLA_DSA_MASTER`` u32 netlink
+attribute contains the ifindex of the master device that handles each slave
+device. The DSA master must be a valid candidate based on firmware node
+information, or a LAG interface which contains only slaves which are valid
+candidates.
+
+Using iproute2, the following manipulations are possible:
+
+.. code-block:: sh
+
+  # See the DSA master in current use
+  ip -d link show dev swp0
+      (...)
+      dsa master eth0
+
+  # Static CPU port distribution
+  ip link set swp0 type dsa master eth1
+  ip link set swp1 type dsa master eth0
+  ip link set swp2 type dsa master eth1
+  ip link set swp3 type dsa master eth0
+
+  # CPU ports in LAG, using explicit assignment of the DSA master
+  ip link add bond0 type bond mode balance-xor && ip link set bond0 up
+  ip link set eth1 down && ip link set eth1 master bond0
+  ip link set swp0 type dsa master bond0
+  ip link set swp1 type dsa master bond0
+  ip link set swp2 type dsa master bond0
+  ip link set swp3 type dsa master bond0
+  ip link set eth0 down && ip link set eth0 master bond0
+  ip -d link show dev swp0
+      (...)
+      dsa master bond0
+
+  # CPU ports in LAG, relying on implicit migration of the DSA master
+  ip link add bond0 type bond mode balance-xor && ip link set bond0 up
+  ip link set eth0 down && ip link set eth0 master bond0
+  ip link set eth1 down && ip link set eth1 master bond0
+  ip -d link show dev swp0
+      (...)
+      dsa master bond0
+
+Notice that in the case of CPU ports under a LAG, the use of the
+``IFLA_DSA_MASTER`` netlink attribute is not strictly needed, but rather, DSA
+reacts to the ``IFLA_MASTER`` attribute change of its present master (``eth0``)
+and migrates all user ports to the new upper of ``eth0``, ``bond0``. Similarly,
+when ``bond0`` is destroyed using ``RTM_DELLINK``, DSA migrates the user ports
+that were assigned to this interface to the first physical DSA master which is
+eligible, based on the firmware description (it effectively reverts to the
+startup configuration).
+
+In a setup with more than 2 physical CPU ports, it is therefore possible to mix
+static user to CPU port assignment with LAG between DSA masters. It is not
+possible to statically assign a user port towards a DSA master that has any
+upper interfaces (this includes LAG devices - the master must always be the LAG
+in this case).
+
+Live changing of the DSA master (and thus CPU port) affinity of a user port is
+permitted, in order to allow dynamic redistribution in response to traffic.
+
+Physical DSA masters are allowed to join and leave at any time a LAG interface
+used as a DSA master; however, DSA will reject a LAG interface as a valid
+candidate for being a DSA master unless it has at least one physical DSA master
+as a slave device.
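
Editorial sketch (not part of the commit): the static CPU port distribution documented in this file can be scripted. The helper below is a minimal dry-run sketch that only prints commands and never changes link state. The function names ``dsa_spread_ports`` and ``dsa_current_master`` are hypothetical; the ``ip link set ... type dsa master ...`` syntax and the ``dsa master <name>`` output line are taken from the examples in the diff, and the parser assumes the ``ip -d link show`` output matches that excerpt.

```sh
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print (do not execute) the iproute2 commands that assign user ports to
# DSA masters in round-robin order.
# Usage: dsa_spread_ports "<masters>" "<user ports>"
dsa_spread_ports() {
    masters=$1
    ports=$2
    # Load the masters into the positional parameters so they can be indexed.
    set -- $masters
    n=$#
    [ "$n" -gt 0 ] || { echo "no DSA masters given" >&2; return 1; }
    i=0
    for port in $ports; do
        # Pick master number (i mod n); positional parameters are 1-based.
        idx=$(( i % n + 1 ))
        eval "master=\${$idx}"
        echo "ip link set $port type dsa master $master"
        i=$(( i + 1 ))
    done
}

# Read `ip -d link show dev <port>` output on stdin and print the word that
# follows "dsa master" (assumed output format, as in the excerpt above).
dsa_current_master() {
    awk '{
        for (i = 1; i < NF - 1; i++)
            if ($i == "dsa" && $(i + 1) == "master") { print $(i + 2); exit }
    }'
}
```

Calling ``dsa_spread_ports "eth1 eth0" "swp0 swp1 swp2 swp3"`` prints the same four commands as the "Static CPU port distribution" example. Review the printed commands before running them as root, since a master that has an upper interface is rejected as a static assignment target.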

Documentation/networking/dsa/dsa.rst

Lines changed: 32 additions & 6 deletions
@@ -303,6 +303,20 @@ These frames are then queued for transmission using the master network device
 Ethernet switch will be able to process these incoming frames from the
 management interface and deliver them to the physical switch port.
 
+When using multiple CPU ports, it is possible to stack a LAG (bonding/team)
+device between the DSA slave devices and the physical DSA masters. The LAG
+device is thus also a DSA master, but the LAG slave devices continue to be DSA
+masters as well (just with no user port assigned to them; this is needed for
+recovery in case the LAG DSA master disappears). Thus, the data path of the LAG
+DSA master is used asymmetrically. On RX, the ``ETH_P_XDSA`` handler, which
+calls ``dsa_switch_rcv()``, is invoked early (on the physical DSA master;
+LAG slave). Therefore, the RX data path of the LAG DSA master is not used.
+On the other hand, TX takes place linearly: ``dsa_slave_xmit`` calls
+``dsa_enqueue_skb``, which calls ``dev_queue_xmit`` towards the LAG DSA master.
+The latter calls ``dev_queue_xmit`` towards one physical DSA master or the
+other, and in both cases, the packet exits the system through a hardware path
+towards the switch.
+
 Graphical representation
 ------------------------
 
@@ -629,6 +643,24 @@ Switch configuration
   PHY cannot be found. In this case, probing of the DSA switch continues
   without that particular port.
 
+- ``port_change_master``: method through which the affinity (association used
+  for traffic termination purposes) between a user port and a CPU port can be
+  changed. By default all user ports from a tree are assigned to the first
+  available CPU port that makes sense for them (most of the times this means
+  the user ports of a tree are all assigned to the same CPU port, except for H
+  topologies as described in commit 2c0b03258b8b). The ``port`` argument
+  represents the index of the user port, and the ``master`` argument represents
+  the new DSA master ``net_device``. The CPU port associated with the new
+  master can be retrieved by looking at ``struct dsa_port *cpu_dp =
+  master->dsa_ptr``. Additionally, the master can also be a LAG device where
+  all the slave devices are physical DSA masters. LAG DSA masters also have a
+  valid ``master->dsa_ptr`` pointer, however this is not unique, but rather a
+  duplicate of the first physical DSA master's (LAG slave) ``dsa_ptr``. In case
+  of a LAG DSA master, a further call to ``port_lag_join`` will be emitted
+  separately for the physical CPU ports associated with the physical DSA
+  masters, requesting them to create a hardware LAG associated with the LAG
+  interface.
+
 PHY devices and link management
 -------------------------------
 
@@ -1095,9 +1127,3 @@ capable hardware, but does not enforce a strict switch device driver model. On
 the other DSA enforces a fairly strict device driver model, and deals with most
 of the switch specific. At some point we should envision a merger between these
 two subsystems and get the best of both worlds.
-
-Other hanging fruits
---------------------
-
-- allowing more than one CPU/management interface:
-  http://comments.gmane.org/gmane.linux.network/365657
