
Commit fb07a82

Kirill Tkhai authored and davem330 committed
net: Move net:netns_ids destruction out of rtnl_lock() and document locking scheme
Currently, we unhash a dying net from the netns_ids lists under rtnl_lock(). This is a leftover from the time when net::netns_ids was introduced: there was no net::nsid_lock yet, and rtnl_lock() was mostly needed to order modification of the nsid idrs of alive nets, i.e. for:

	for_each_net(tmp) {
		...
		id = __peernet2id(tmp, net);
		idr_remove(&tmp->netns_ids, id);
		...
	}

Since we now have net::nsid_lock, the modifications are protected by this local lock, and we may introduce a better scheme of netns_ids destruction.

Let's look at the functions peernet2id_alloc() and get_net_ns_by_id(). Previous commits taught these functions to work well with a dying net acquired from rtnl-unlocked lists, and they are the only functions which can hash a net into netns_ids or obtain one from there. As is easy to check, the other netns_ids-operating functions work with ids, not with net pointers, so we do not need rtnl_lock to synchronize cleanup_net() with any of them.

Another property used in the patch is that a net is unhashed from net_namespace_list in only one place and by only one process. So, when unhash_nsid() is iterating over the list, the list may only grow, and we avoid excess rcu_read_lock() or rtnl_lock().

All of the above makes it possible to keep rtnl_lock() held only for the net->list deletion, and to avoid it completely for netns_ids unhashing and destruction. As these two operations may take a long time (e.g., a memory allocation to send an skb), the patch should improve scalability and significantly decrease the time for which rtnl_lock() is held in cleanup_net().

Signed-off-by: Kirill Tkhai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
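The commit message's central point is that the idr removal only needs the peer's own nsid_lock, and the (potentially sleeping) notification can be issued after that lock is dropped. A minimal userspace sketch of that pattern, assuming a fixed-size array in place of the kernel's idr and a pthread mutex in place of the nsid_lock spinlock (struct ns, peernet2id() and ns_unhash_peer() are illustrative names, not kernel API):

```c
#include <pthread.h>
#include <stdio.h>

#define MAX_IDS 8

struct ns {
	pthread_mutex_t nsid_lock;	/* models net->nsid_lock */
	struct ns *ids[MAX_IDS];	/* models net->netns_ids (the idr) */
};

/* Models __peernet2id(): the id a peer is hashed under, or -1. */
static int peernet2id(struct ns *self, struct ns *peer)
{
	for (int id = 0; id < MAX_IDS; id++)
		if (self->ids[id] == peer)
			return id;
	return -1;
}

/* Models one unhash_nsid() loop iteration: the removal happens under
 * the peer's own lock; the notification (which in the kernel may
 * allocate memory to send an skb) is issued only after the lock is
 * dropped, which is why rtnl_lock() is not needed around this step. */
static int ns_unhash_peer(struct ns *self, struct ns *dying)
{
	int id;

	pthread_mutex_lock(&self->nsid_lock);
	id = peernet2id(self, dying);
	if (id >= 0)
		self->ids[id] = NULL;	/* models idr_remove() */
	pthread_mutex_unlock(&self->nsid_lock);

	if (id >= 0)			/* models rtnl_net_notifyid() */
		printf("RTM_DELNSID id=%d\n", id);
	return id;
}
```

Note that a second call for the same dying peer finds no id and sends no notification, mirroring how the kernel loop only notifies when idr_remove() actually removed an entry.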
1 parent 8ec59b4 commit fb07a82

File tree

1 file changed

+44
-18
lines changed


net/core/net_namespace.c

Lines changed: 44 additions & 18 deletions
@@ -439,13 +439,40 @@ struct net *copy_net_ns(unsigned long flags,
 	return net;
 }
 
+static void unhash_nsid(struct net *net, struct net *last)
+{
+	struct net *tmp;
+	/* This function is only called from cleanup_net() work,
+	 * and this work is the only process, that may delete
+	 * a net from net_namespace_list. So, when the below
+	 * is executing, the list may only grow. Thus, we do not
+	 * use for_each_net_rcu() or rtnl_lock().
+	 */
+	for_each_net(tmp) {
+		int id;
+
+		spin_lock_bh(&tmp->nsid_lock);
+		id = __peernet2id(tmp, net);
+		if (id >= 0)
+			idr_remove(&tmp->netns_ids, id);
+		spin_unlock_bh(&tmp->nsid_lock);
+		if (id >= 0)
+			rtnl_net_notifyid(tmp, RTM_DELNSID, id);
+		if (tmp == last)
+			break;
+	}
+	spin_lock_bh(&net->nsid_lock);
+	idr_destroy(&net->netns_ids);
+	spin_unlock_bh(&net->nsid_lock);
+}
+
 static DEFINE_SPINLOCK(cleanup_list_lock);
 static LIST_HEAD(cleanup_list); /* Must hold cleanup_list_lock to touch */
 
 static void cleanup_net(struct work_struct *work)
 {
 	const struct pernet_operations *ops;
-	struct net *net, *tmp;
+	struct net *net, *tmp, *last;
 	struct list_head net_kill_list;
 	LIST_HEAD(net_exit_list);
 
@@ -458,26 +485,25 @@ static void cleanup_net(struct work_struct *work)
 
 	/* Don't let anyone else find us. */
 	rtnl_lock();
-	list_for_each_entry(net, &net_kill_list, cleanup_list) {
+	list_for_each_entry(net, &net_kill_list, cleanup_list)
 		list_del_rcu(&net->list);
-		list_add_tail(&net->exit_list, &net_exit_list);
-		for_each_net(tmp) {
-			int id;
-
-			spin_lock_bh(&tmp->nsid_lock);
-			id = __peernet2id(tmp, net);
-			if (id >= 0)
-				idr_remove(&tmp->netns_ids, id);
-			spin_unlock_bh(&tmp->nsid_lock);
-			if (id >= 0)
-				rtnl_net_notifyid(tmp, RTM_DELNSID, id);
-		}
-		spin_lock_bh(&net->nsid_lock);
-		idr_destroy(&net->netns_ids);
-		spin_unlock_bh(&net->nsid_lock);
+	/* Cache last net. After we unlock rtnl, no one new net
+	 * added to net_namespace_list can assign nsid pointer
+	 * to a net from net_kill_list (see peernet2id_alloc()).
+	 * So, we skip them in unhash_nsid().
+	 *
+	 * Note, that unhash_nsid() does not delete nsid links
+	 * between net_kill_list's nets, as they've already
+	 * deleted from net_namespace_list. But, this would be
+	 * useless anyway, as netns_ids are destroyed there.
+	 */
+	last = list_last_entry(&net_namespace_list, struct net, list);
+	rtnl_unlock();
 
+	list_for_each_entry(net, &net_kill_list, cleanup_list) {
+		unhash_nsid(net, last);
+		list_add_tail(&net->exit_list, &net_exit_list);
 	}
-	rtnl_unlock();
 
 	/*
 	 * Another CPU might be rcu-iterating the list, wait for it.
