Skip to content

Commit 275b471

Browse files
kuba-moodavem330
authored andcommitted
net: don't let netpoll invoke NAPI if in xmit context
Commit 0db3dc7 ("[NETPOLL]: tx lock deadlock fix") narrowed down the region under netif_tx_trylock() inside netpoll_send_skb(). (At that point in time netif_tx_trylock() would lock all queues of the device.) Taking the tx lock was problematic because driver's cleanup method may take the same lock. So the change made us hold the xmit lock only around xmit, and expected the driver to take care of locking within ->ndo_poll_controller(). Unfortunately this only works if netpoll isn't itself called with the xmit lock already held. Netpoll code is careful and uses trylock(). The drivers, however, may be using plain lock(). Printing while holding the xmit lock is going to result in rare deadlocks. Luckily we record the xmit lock owners, so we can scan all the queues, the same way we scan NAPI owners. If any of the xmit locks is held by the local CPU we better not attempt any polling. It would be nice if we could narrow down the check to only the NAPIs and the queue we're trying to use. I don't see a way to do that now. Reported-by: Roman Gushchin <[email protected]> Fixes: 0db3dc7 ("[NETPOLL]: tx lock deadlock fix") Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
1 parent 7d63b67 commit 275b471

File tree

1 file changed

+18
-1
lines changed

1 file changed

+18
-1
lines changed

net/core/netpoll.c

Lines changed: 18 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -137,6 +137,20 @@ static void queue_process(struct work_struct *work)
137137
}
138138
}
139139

140+
static int netif_local_xmit_active(struct net_device *dev)
141+
{
142+
int i;
143+
144+
for (i = 0; i < dev->num_tx_queues; i++) {
145+
struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
146+
147+
if (READ_ONCE(txq->xmit_lock_owner) == smp_processor_id())
148+
return 1;
149+
}
150+
151+
return 0;
152+
}
153+
140154
static void poll_one_napi(struct napi_struct *napi)
141155
{
142156
int work;
@@ -183,7 +197,10 @@ void netpoll_poll_dev(struct net_device *dev)
183197
if (!ni || down_trylock(&ni->dev_lock))
184198
return;
185199

186-
if (!netif_running(dev)) {
200+
/* Some drivers will take the same locks in poll and xmit,
201+
* we can't poll if local CPU is already in xmit.
202+
*/
203+
if (!netif_running(dev) || netif_local_xmit_active(dev)) {
187204
up(&ni->dev_lock);
188205
return;
189206
}

0 commit comments

Comments
 (0)