Commit 8599b52
Veaceslav Falico authored and davem330 committed
bonding: add an option to fail when any of arp_ip_target is inaccessible

Currently, we fail only when all of the ips in arp_ip_target are gone. However, in some situations we might need to fail if even one host from arp_ip_target becomes unavailable.

All such situations, obviously, rely on the idea that we need a *completely* functional network, with all interfaces/addresses working correctly.

One real world example might be: vlans on top of a bond (hybrid port). If the bond and the vlans have ips assigned and we have their peers monitored via arp_ip_target, then in case of switch misconfiguration (trunk/access port), slave driver malfunction or tagged/untagged traffic dropped on the way we will be able to switch to another slave.

Any other configuration that requires access to all arp_ip_targets needs this as well.

This patch adds this possibility by adding a new parameter - arp_all_targets (both as a module parameter and as a sysfs knob). It can be set to:

    0 or any (the default) - works exactly as it does now: the slave is up if any of the arp_ip_targets are up.

    1 or all - the slave is up if all of the arp_ip_targets are up.

This parameter can be changed on the fly (via sysfs), and requires the mode to be active-backup and arp_validate to be enabled (it obeys the arp_validate config on which slaves to validate).

Internally it's done through:

1) Add target_last_arp_rx[BOND_MAX_ARP_TARGETS] array to slave struct. It's an array of jiffies, meaning that slave->target_last_arp_rx[i] is the last time we've received arp from bond->params.arp_targets[i] on this slave.

2) If we successfully validate an arp from bond->params.arp_targets[i] in bond_validate_arp() - update the slave->target_last_arp_rx[i] with the current jiffies value.

3) When getting slave's last_rx via slave_last_rx(), we return the oldest time when we've received an arp from any address in bond->params.arp_targets[].

If the value of arp_all_targets == 0 - we still work the same way as before.

Also, update the documentation to reflect the new parameter.

v3->v4:
Kill the forgotten rtnl_unlock(), rephrase the documentation part to be more clear, don't fail setting arp_all_targets if arp_validate is not set - it has no effect anyway but can be easier to set up. Also, print a warning if the last arp_ip_target is removed while the arp_interval is on, but not the arp_validate.

v2->v3:
Use _bh spinlock, remove useless rtnl_lock() and use jiffies for new arp_ip_target last arp, instead of slave_last_rx(). On bond_enslave(), use the same initialization value for target_last_arp_rx[] as is used for the default last_arp_rx, to avoid useless interface flaps. Also, instead of failing to remove the last arp_ip_target just print a warning - otherwise it might break existing scripts.

v1->v2:
Correctly handle adding/removing hosts in arp_ip_target - we need to shift/initialize all slave's target_last_arp_rx. Also, don't fail module loading on arp_all_targets misconfiguration, just disable it, and some minor style fixes.

Signed-off-by: Veaceslav Falico <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
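
The difference between the two settings can be sketched in user space. This is a minimal model and not the kernel code: `tick_before`, `model_last_rx`, `model_slave_up` and the `TARGETS_*` constants are hypothetical names, plain tick counters stand in for jiffies, the "any" case is approximated as the newest per-target timestamp, and the timeout check is an assumed simplification of what the ARP monitor does with the value returned by slave_last_rx().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for BOND_ARP_TARGETS_ANY / BOND_ARP_TARGETS_ALL. */
enum { TARGETS_ANY = 0, TARGETS_ALL = 1 };

/* True if tick a is strictly earlier than tick b. The signed difference
 * keeps the comparison valid across counter wraparound (assumes the usual
 * two's-complement conversion, as the kernel's time_before() does).
 */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Pick the timestamp the monitor should judge the slave by:
 * "all" uses the oldest per-target reply, "any" uses the newest one.
 * rx must hold n >= 1 entries, one per configured target.
 */
unsigned long model_last_rx(const unsigned long *rx, int n, int mode)
{
	unsigned long ret = rx[0];
	int i;

	for (i = 1; i < n; i++) {
		if (mode == TARGETS_ALL) {
			if (tick_before(rx[i], ret))
				ret = rx[i];	/* keep the oldest */
		} else if (tick_before(ret, rx[i])) {
			ret = rx[i];		/* keep the newest */
		}
	}
	return ret;
}

/* Assumed rule: the slave is up while the chosen timestamp is no older
 * than `timeout` ticks relative to `now`.
 */
bool model_slave_up(const unsigned long *rx, int n, int mode,
		    unsigned long now, unsigned long timeout)
{
	unsigned long last = model_last_rx(rx, n, mode);

	return !tick_before(last + timeout, now);
}
```

With two targets last heard at ticks 1000 and 400, a current tick of 1050 and a timeout of 100, the model reports the slave as up in "any" mode and down in "all" mode, because the second target has been silent for too long.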
1 parent d7d35c6 commit 8599b52

4 files changed: +147 -14 lines changed

Documentation/networking/bonding.txt

Lines changed: 19 additions & 0 deletions
@@ -321,6 +321,25 @@ arp_validate
 
 	This option was added in bonding version 3.1.0.
 
+arp_all_targets
+
+	Specifies the quantity of arp_ip_targets that must be reachable
+	in order for the ARP monitor to consider a slave as being up.
+	This option affects only active-backup mode for slaves with
+	arp_validation enabled.
+
+	Possible values are:
+
+	any or 0
+
+		consider the slave up only when any of the arp_ip_targets
+		is reachable
+
+	all or 1
+
+		consider the slave up only when all of the arp_ip_targets
+		are reachable
+
 downdelay
 
 	Specifies the time, in milliseconds, to wait before disabling

drivers/net/bonding/bond_main.c

Lines changed: 31 additions & 2 deletions
@@ -104,6 +104,7 @@ static char *xmit_hash_policy;
 static int arp_interval = BOND_LINK_ARP_INTERV;
 static char *arp_ip_target[BOND_MAX_ARP_TARGETS];
 static char *arp_validate;
+static char *arp_all_targets;
 static char *fail_over_mac;
 static int all_slaves_active = 0;
 static struct bond_params bonding_defaults;
@@ -166,6 +167,8 @@ module_param(arp_validate, charp, 0);
 MODULE_PARM_DESC(arp_validate, "validate src/dst of ARP probes; "
 			       "0 for none (default), 1 for active, "
 			       "2 for backup, 3 for all");
+module_param(arp_all_targets, charp, 0);
+MODULE_PARM_DESC(arp_all_targets, "fail on any/all arp targets timeout; 0 for any (default), 1 for all");
 module_param(fail_over_mac, charp, 0);
 MODULE_PARM_DESC(fail_over_mac, "For active-backup, do not set all slaves to "
 				"the same MAC; 0 for none (default), "
@@ -216,6 +219,12 @@ const struct bond_parm_tbl xmit_hashtype_tbl[] = {
 { NULL, -1},
 };
 
+const struct bond_parm_tbl arp_all_targets_tbl[] = {
+{ "any", BOND_ARP_TARGETS_ANY},
+{ "all", BOND_ARP_TARGETS_ALL},
+{ NULL, -1},
+};
+
 const struct bond_parm_tbl arp_validate_tbl[] = {
 { "none", BOND_ARP_VALIDATE_NONE},
 { "active", BOND_ARP_VALIDATE_ACTIVE},
@@ -1483,7 +1492,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	struct slave *new_slave = NULL;
 	struct sockaddr addr;
 	int link_reporting;
-	int res = 0;
+	int res = 0, i;
 
 	if (!bond->params.use_carrier &&
 	    slave_dev->ethtool_ops->get_link == NULL &&
@@ -1712,6 +1721,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 
 	new_slave->last_arp_rx = jiffies -
 		(msecs_to_jiffies(bond->params.arp_interval) + 1);
+	for (i = 0; i < BOND_MAX_ARP_TARGETS; i++)
+		new_slave->target_last_arp_rx[i] = new_slave->last_arp_rx;
 
 	if (bond->params.miimon && !bond->params.use_carrier) {
 		link_reporting = bond_check_dev_link(bond, slave_dev, 1);
@@ -2610,16 +2621,20 @@ static void bond_arp_send_all(struct bonding *bond, struct slave *slave)
 
 static void bond_validate_arp(struct bonding *bond, struct slave *slave, __be32 sip, __be32 tip)
 {
+	int i;
+
 	if (!sip || !bond_has_this_ip(bond, tip)) {
 		pr_debug("bva: sip %pI4 tip %pI4 not found\n", &sip, &tip);
 		return;
 	}
 
-	if (bond_get_targets_ip(bond->params.arp_targets, sip) == -1) {
+	i = bond_get_targets_ip(bond->params.arp_targets, sip);
+	if (i == -1) {
 		pr_debug("bva: sip %pI4 not found in targets\n", &sip);
 		return;
 	}
 	slave->last_arp_rx = jiffies;
+	slave->target_last_arp_rx[i] = jiffies;
 }
 
 static int bond_arp_rcv(const struct sk_buff *skb, struct bonding *bond,
@@ -4409,6 +4424,7 @@ int bond_parse_parm(const char *buf, const struct bond_parm_tbl *tbl)
 static int bond_check_params(struct bond_params *params)
 {
 	int arp_validate_value, fail_over_mac_value, primary_reselect_value, i;
+	int arp_all_targets_value;
 
 	/*
 	 * Convert string parameters.
@@ -4634,6 +4650,18 @@ static int bond_check_params(struct bond_params *params)
 	} else
 		arp_validate_value = 0;
 
+	arp_all_targets_value = 0;
+	if (arp_all_targets) {
+		arp_all_targets_value = bond_parse_parm(arp_all_targets,
+							 arp_all_targets_tbl);
+
+		if (arp_all_targets_value == -1) {
+			pr_err("Error: invalid arp_all_targets_value \"%s\"\n",
+			       arp_all_targets);
+			arp_all_targets_value = 0;
+		}
+	}
+
 	if (miimon) {
 		pr_info("MII link monitoring set to %d ms\n", miimon);
 	} else if (arp_interval) {
@@ -4698,6 +4726,7 @@ static int bond_check_params(struct bond_params *params)
 	params->num_peer_notif = num_peer_notif;
 	params->arp_interval = arp_interval;
 	params->arp_validate = arp_validate_value;
+	params->arp_all_targets = arp_all_targets_value;
 	params->updelay = updelay;
 	params->downdelay = downdelay;
 	params->use_carrier = use_carrier;
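
The module-parameter handling in bond_check_params() maps a string onto a table entry and falls back to the default instead of failing the module load. The sketch below is a user-space approximation under stated assumptions: `struct parm_entry`, `parse_parm`, `all_targets_tbl` and `all_targets_or_default` are hypothetical names, and the matching rules (a decimal number matches the mode value, anything else must match the name exactly) are a simplified guess at what bond_parse_parm() accepts, not a copy of it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space analogue of struct bond_parm_tbl. */
struct parm_entry {
	const char *name;
	int mode;
};

static const struct parm_entry all_targets_tbl[] = {
	{ "any", 0 },
	{ "all", 1 },
	{ NULL, -1 },	/* sentinel, as in the kernel tables */
};

/* Return the mode matching the first word of buf, or -1 if none does. */
int parse_parm(const char *buf, const struct parm_entry *tbl)
{
	char word[16];
	char *end;
	long num;
	int is_num, i;
	size_t len = strcspn(buf, " \t\n");	/* stop at whitespace/newline */

	if (len == 0 || len >= sizeof(word))
		return -1;
	memcpy(word, buf, len);
	word[len] = '\0';

	/* The word counts as numeric only if strtol() consumed all of it. */
	num = strtol(word, &end, 10);
	is_num = (*end == '\0');

	for (i = 0; tbl[i].name; i++) {
		if (is_num ? (num == tbl[i].mode)
			   : (strcmp(word, tbl[i].name) == 0))
			return tbl[i].mode;
	}
	return -1;
}

/* Mirror the patch's policy: an invalid value disables the option
 * (falls back to 0, "any") rather than aborting.
 */
int all_targets_or_default(const char *param)
{
	int value;

	if (!param)
		return 0;
	value = parse_parm(param, all_targets_tbl);
	return value == -1 ? 0 : value;
}
```

The sysfs store handler in this commit takes the stricter path for the same lookup: it returns -EINVAL on a failed parse, since a rejected write there cannot break module loading.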

drivers/net/bonding/bond_sysfs.c

Lines changed: 69 additions & 10 deletions
@@ -443,6 +443,44 @@ static ssize_t bonding_store_arp_validate(struct device *d,
 
 static DEVICE_ATTR(arp_validate, S_IRUGO | S_IWUSR, bonding_show_arp_validate,
 		   bonding_store_arp_validate);
+/*
+ * Show and set arp_all_targets.
+ */
+static ssize_t bonding_show_arp_all_targets(struct device *d,
+					 struct device_attribute *attr,
+					 char *buf)
+{
+	struct bonding *bond = to_bond(d);
+	int value = bond->params.arp_all_targets;
+
+	return sprintf(buf, "%s %d\n", arp_all_targets_tbl[value].modename,
+		       value);
+}
+
+static ssize_t bonding_store_arp_all_targets(struct device *d,
+					  struct device_attribute *attr,
+					  const char *buf, size_t count)
+{
+	struct bonding *bond = to_bond(d);
+	int new_value;
+
+	new_value = bond_parse_parm(buf, arp_all_targets_tbl);
+	if (new_value < 0) {
+		pr_err("%s: Ignoring invalid arp_all_targets value %s\n",
+		       bond->dev->name, buf);
+		return -EINVAL;
+	}
+	pr_info("%s: setting arp_all_targets to %s (%d).\n",
+		bond->dev->name, arp_all_targets_tbl[new_value].modename,
+		new_value);
+
+	bond->params.arp_all_targets = new_value;
+
+	return count;
+}
+
+static DEVICE_ATTR(arp_all_targets, S_IRUGO | S_IWUSR,
+		   bonding_show_arp_all_targets, bonding_store_arp_all_targets);
 
 /*
  * Show and store fail_over_mac. User only allowed to change the
@@ -590,10 +628,11 @@ static ssize_t bonding_store_arp_targets(struct device *d,
 					 struct device_attribute *attr,
 					 const char *buf, size_t count)
 {
-	__be32 newtarget;
-	int i = 0, ret = -EINVAL;
 	struct bonding *bond = to_bond(d);
-	__be32 *targets;
+	struct slave *slave;
+	__be32 newtarget, *targets;
+	unsigned long *targets_rx;
+	int ind, i, j, ret = -EINVAL;
 
 	targets = bond->params.arp_targets;
 	newtarget = in_aton(buf + 1);
@@ -611,35 +650,54 @@ static ssize_t bonding_store_arp_targets(struct device *d,
 			goto out;
 		}
 
-		i = bond_get_targets_ip(targets, 0); /* first free slot */
-		if (i == -1) {
+		ind = bond_get_targets_ip(targets, 0); /* first free slot */
+		if (ind == -1) {
 			pr_err("%s: ARP target table is full!\n",
 			       bond->dev->name);
 			goto out;
 		}
 
 		pr_info("%s: adding ARP target %pI4.\n", bond->dev->name,
 			 &newtarget);
-		targets[i] = newtarget;
+		/* not to race with bond_arp_rcv */
+		write_lock_bh(&bond->lock);
+		bond_for_each_slave(bond, slave, i)
+			slave->target_last_arp_rx[ind] = jiffies;
+		targets[ind] = newtarget;
+		write_unlock_bh(&bond->lock);
 	} else if (buf[0] == '-') {
 		if ((newtarget == 0) || (newtarget == htonl(INADDR_BROADCAST))) {
 			pr_err("%s: invalid ARP target %pI4 specified for removal\n",
 			       bond->dev->name, &newtarget);
 			goto out;
 		}
 
-		i = bond_get_targets_ip(targets, newtarget);
-		if (i == -1) {
-			pr_info("%s: unable to remove nonexistent ARP target %pI4.\n",
+		ind = bond_get_targets_ip(targets, newtarget);
+		if (ind == -1) {
+			pr_err("%s: unable to remove nonexistent ARP target %pI4.\n",
 				bond->dev->name, &newtarget);
 			goto out;
 		}
 
+		if (ind == 0 && !targets[1] && bond->params.arp_interval)
+			pr_warn("%s: removing last arp target with arp_interval on\n",
+				bond->dev->name);
+
 		pr_info("%s: removing ARP target %pI4.\n", bond->dev->name,
 			&newtarget);
-		for (; (i < BOND_MAX_ARP_TARGETS-1) && targets[i+1]; i++)
+
+		write_lock_bh(&bond->lock);
+		bond_for_each_slave(bond, slave, i) {
+			targets_rx = slave->target_last_arp_rx;
+			j = ind;
+			for (; (j < BOND_MAX_ARP_TARGETS-1) && targets[j+1]; j++)
+				targets_rx[j] = targets_rx[j+1];
+			targets_rx[j] = 0;
+		}
+		for (i = ind; (i < BOND_MAX_ARP_TARGETS-1) && targets[i+1]; i++)
 			targets[i] = targets[i+1];
 		targets[i] = 0;
+		write_unlock_bh(&bond->lock);
 	} else {
 		pr_err("no command found in arp_ip_targets file for bond %s. Use +<addr> or -<addr>.\n",
 		       bond->dev->name);
@@ -1623,6 +1681,7 @@ static struct attribute *per_bond_attrs[] = {
 	&dev_attr_mode.attr,
 	&dev_attr_fail_over_mac.attr,
 	&dev_attr_arp_validate.attr,
+	&dev_attr_arp_all_targets.attr,
 	&dev_attr_arp_interval.attr,
 	&dev_attr_arp_ip_target.attr,
 	&dev_attr_downdelay.attr,
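
Removing a target in bonding_store_arp_targets() has to keep every slave's target_last_arp_rx[] aligned with arp_targets[], because both are indexed by the same slot. The following user-space sketch illustrates that shift under stated assumptions: `MAX_TARGETS`, `struct model_slave` and `remove_target` are hypothetical names, targets are plain integers with 0 meaning "empty slot", and the locking the patch adds (write_lock_bh on bond->lock) is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TARGETS 4	/* stand-in for BOND_MAX_ARP_TARGETS */

/* Hypothetical slave model: one timestamp per target slot. */
struct model_slave {
	unsigned long target_last_rx[MAX_TARGETS];
};

/* Remove targets[ind] and close the gap in targets[] and in every slave's
 * timestamp array. Returns 0 on success, -1 if ind names no target.
 */
int remove_target(uint32_t *targets, struct model_slave *slaves,
		  size_t nslaves, int ind)
{
	size_t s;
	int i, j;

	if (ind < 0 || ind >= MAX_TARGETS || !targets[ind])
		return -1;

	/* Shift the timestamps first: the loop bound reads targets[j + 1],
	 * so targets[] must still describe the old layout at this point.
	 */
	for (s = 0; s < nslaves; s++) {
		unsigned long *rx = slaves[s].target_last_rx;

		for (j = ind; j < MAX_TARGETS - 1 && targets[j + 1]; j++)
			rx[j] = rx[j + 1];
		rx[j] = 0;
	}

	/* Then shift the targets themselves and clear the freed slot. */
	for (i = ind; i < MAX_TARGETS - 1 && targets[i + 1]; i++)
		targets[i] = targets[i + 1];
	targets[i] = 0;

	return 0;
}
```

The ordering matters in this model for the same reason it does in the patch: if targets[] were compacted first, the per-slave loop would stop one slot early and leave a stale timestamp behind.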

drivers/net/bonding/bonding.h

Lines changed: 28 additions & 2 deletions
@@ -144,6 +144,7 @@ struct bond_params {
 	u8 num_peer_notif;
 	int arp_interval;
 	int arp_validate;
+	int arp_all_targets;
 	int use_carrier;
 	int fail_over_mac;
 	int updelay;
@@ -179,6 +180,7 @@ struct slave {
 	int delay;
 	unsigned long jiffies;
 	unsigned long last_arp_rx;
+	unsigned long target_last_arp_rx[BOND_MAX_ARP_TARGETS];
 	s8 link; /* one of BOND_LINK_XXXX */
 	s8 new_link;
 	u8 backup:1, /* indicates backup slave. Value corresponds with
@@ -322,6 +324,9 @@ static inline bool bond_is_active_slave(struct slave *slave)
 #define BOND_FOM_ACTIVE 1
 #define BOND_FOM_FOLLOW 2
 
+#define BOND_ARP_TARGETS_ANY 0
+#define BOND_ARP_TARGETS_ALL 1
+
 #define BOND_ARP_VALIDATE_NONE 0
 #define BOND_ARP_VALIDATE_ACTIVE (1 << BOND_STATE_ACTIVE)
 #define BOND_ARP_VALIDATE_BACKUP (1 << BOND_STATE_BACKUP)
@@ -334,11 +339,31 @@ static inline int slave_do_arp_validate(struct bonding *bond,
 	return bond->params.arp_validate & (1 << bond_slave_state(slave));
 }
 
+/* Get the oldest arp which we've received on this slave for bond's
+ * arp_targets.
+ */
+static inline unsigned long slave_oldest_target_arp_rx(struct bonding *bond,
+							struct slave *slave)
+{
+	int i = 1;
+	unsigned long ret = slave->target_last_arp_rx[0];
+
+	for (; (i < BOND_MAX_ARP_TARGETS) && bond->params.arp_targets[i]; i++)
+		if (time_before(slave->target_last_arp_rx[i], ret))
+			ret = slave->target_last_arp_rx[i];
+
+	return ret;
+}
+
 static inline unsigned long slave_last_rx(struct bonding *bond,
 					struct slave *slave)
 {
-	if (slave_do_arp_validate(bond, slave))
-		return slave->last_arp_rx;
+	if (slave_do_arp_validate(bond, slave)) {
+		if (bond->params.arp_all_targets == BOND_ARP_TARGETS_ALL)
+			return slave_oldest_target_arp_rx(bond, slave);
+		else
+			return slave->last_arp_rx;
+	}
 
 	return slave->dev->last_rx;
 }
@@ -486,6 +511,7 @@ extern const struct bond_parm_tbl bond_lacp_tbl[];
 extern const struct bond_parm_tbl bond_mode_tbl[];
 extern const struct bond_parm_tbl xmit_hashtype_tbl[];
 extern const struct bond_parm_tbl arp_validate_tbl[];
+extern const struct bond_parm_tbl arp_all_targets_tbl[];
 extern const struct bond_parm_tbl fail_over_mac_tbl[];
 extern const struct bond_parm_tbl pri_reselect_tbl[];
 extern struct bond_parm_tbl ad_select_tbl[];
