net: vxlan: do not use vxlan_net before checking event type
Jesse Brandeburg reported that commit acaf4e7 caused a panic
when adding a network namespace while vxlan module was present in
the system:
[<ffffffff814d0865>] vxlan_lowerdev_event+0xf5/0x100
[<ffffffff816e9e5d>] notifier_call_chain+0x4d/0x70
[<ffffffff810912be>] __raw_notifier_call_chain+0xe/0x10
[<ffffffff810912d6>] raw_notifier_call_chain+0x16/0x20
[<ffffffff815d9610>] call_netdevice_notifiers_info+0x40/0x70
[<ffffffff815d9656>] call_netdevice_notifiers+0x16/0x20
[<ffffffff815e1bce>] register_netdevice+0x1be/0x3a0
[<ffffffff815e1dce>] register_netdev+0x1e/0x30
[<ffffffff814cb94a>] loopback_net_init+0x4a/0xb0
[<ffffffffa016ed6e>] ? lockd_init_net+0x6e/0xb0 [lockd]
[<ffffffff815d6bac>] ops_init+0x4c/0x150
[<ffffffff815d6d23>] setup_net+0x73/0x110
[<ffffffff815d725b>] copy_net_ns+0x7b/0x100
[<ffffffff81090e11>] create_new_namespaces+0x101/0x1b0
[<ffffffff81090f45>] copy_namespaces+0x85/0xb0
[<ffffffff810693d5>] copy_process.part.26+0x935/0x1500
[<ffffffff811d5186>] ? mntput+0x26/0x40
[<ffffffff8106a15c>] do_fork+0xbc/0x2e0
[<ffffffff811b7f2e>] ? ____fput+0xe/0x10
[<ffffffff81089c5c>] ? task_work_run+0xac/0xe0
[<ffffffff8106a406>] SyS_clone+0x16/0x20
[<ffffffff816ee689>] stub_clone+0x69/0x90
[<ffffffff816ee329>] ? system_call_fastpath+0x16/0x1b
Apparently the loopback device is registered first, so we receive
an event notification before vxlan_net is ready. Hence, when we
call net_generic() and request vxlan_net_id, we seem to access
garbage at that point in time. In setup_net(), where we set up a
newly allocated network namespace, we traverse the list of
pernet ops ...
	list_for_each_entry(ops, &pernet_list, list) {
		error = ops_init(ops, net);
		if (error < 0)
			goto out_undo;
	}
... and loopback_net_init() is invoked first here, so in the middle
of setup_net() we get this notification in vxlan. As we currently
only care about devices that unregister, move the access through
net_generic() into the branch that handles unregister events. The
fix is based on Cong Wang's proposal, but only changes what is
needed here. It sucks a bit, as we only work around the problem
rather than cure it: right now it seems the only way to check
whether a netns has actually finished traversing all init ops would
be to check if it's part of net_namespace_list. But I find that
quite expensive to do each time we go through a notifier callback.
Anyway, did a couple of tests and it seems good for now.
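
The ordering problem and the shape of the fix can be sketched in a
small user-space model. This is not the kernel patch: every model_*
name below is hypothetical, a NULL pointer stands in for "vxlan_net
is not ready", and a counter stands in for the bad access. It only
illustrates why the event type should be checked before the per-net
state is looked up.

```c
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_event { MODEL_REGISTER, MODEL_UNREGISTER };

struct model_vxlan_net {
	int handled_unregisters;
};

struct model_net {
	struct model_vxlan_net *vxlan;	/* NULL until vxlan's init op ran */
	int unready_lookups;		/* lookups made before that point */
};

/* Stand-in for net_generic(net, vxlan_net_id). */
static struct model_vxlan_net *model_net_generic(struct model_net *net)
{
	if (net->vxlan == NULL)
		net->unready_lookups++;	/* models the access to garbage */
	return net->vxlan;
}

/* Pre-fix shape: per-net state is looked up for every event. */
static void model_event_buggy(struct model_net *net, enum model_event ev)
{
	struct model_vxlan_net *vn = model_net_generic(net);

	if (ev == MODEL_UNREGISTER && vn != NULL)
		vn->handled_unregisters++;
}

/* Post-fix shape: check the event type first, then look up the state. */
static void model_event_fixed(struct model_net *net, enum model_event ev)
{
	if (ev == MODEL_UNREGISTER) {
		struct model_vxlan_net *vn = model_net_generic(net);

		if (vn != NULL)
			vn->handled_unregisters++;
	}
}

typedef void (*model_notifier)(struct model_net *, enum model_event);

/*
 * Models setup_net(): the loopback init op runs first and registers a
 * device, which notifies the handler before vxlan's init op has run.
 */
static void model_setup_net(struct model_net *net,
			    struct model_vxlan_net *vn,
			    model_notifier notify)
{
	net->vxlan = NULL;
	net->unready_lookups = 0;
	notify(net, MODEL_REGISTER);	/* loopback init op */
	net->vxlan = vn;		/* vxlan init op */
}
```

Driving model_setup_net() with model_event_buggy records one lookup
against the not-yet-ready state, while model_event_fixed records none
and still handles a later MODEL_UNREGISTER event.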
Fixes: acaf4e7 ("net: vxlan: when lower dev unregisters remove vxlan dev as well")
Reported-by: Jesse Brandeburg <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Jesse Brandeburg <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Tested-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>