Commit d772794

Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main RCU changes in this cycle were:

   - Updates to use cond_resched() instead of cond_resched_rcu_qs()
     where feasible (currently everywhere except in kernel/rcu and in
     kernel/torture.c). Also a couple of fixes to avoid sending IPIs to
     offline CPUs.

   - Updates to simplify RCU's dyntick-idle handling.

   - Updates to remove almost all uses of smp_read_barrier_depends() and
     read_barrier_depends().

   - Torture-test updates.

   - Miscellaneous fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
  torture: Save a line in stutter_wait(): while -> for
  torture: Eliminate torture_runnable and perf_runnable
  torture: Make stutter less vulnerable to compilers and races
  locking/locktorture: Fix num reader/writer corner cases
  locking/locktorture: Fix rwsem reader_delay
  torture: Place all torture-test modules in one MAINTAINERS group
  rcutorture/kvm-build.sh: Skip build directory check
  rcutorture: Simplify functions.sh include path
  rcutorture: Simplify logging
  rcutorture/kvm-recheck-*: Improve result directory readability check
  rcutorture/kvm.sh: Support execution from any directory
  rcutorture/kvm.sh: Use consistent help text for --qemu-args
  rcutorture/kvm.sh: Remove unused variable, `alldone`
  rcutorture: Remove unused script, config2frag.sh
  rcutorture/configinit: Fix build directory error message
  rcutorture: Preempt RCU-preempt readers more vigorously
  torture: Reduce #ifdefs for preempt_schedule()
  rcu: Remove have_rcu_nocb_mask from tree_plugin.h
  rcu: Add comment giving debug strategy for double call_rcu()
  tracing, rcu: Hide trace event rcu_nocb_wake when not used
  ...
2 parents c148879 + 475c5ee commit d772794

81 files changed: 501 additions & 748 deletions

Documentation/RCU/Design/Data-Structures/Data-Structures.html

Lines changed: 34 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -1097,7 +1097,8 @@ <h5>Quiescent-State and Grace-Period Tracking</h5>
10971097
its next exit from idle.
10981098
Finally, the <tt>rcu_qs_ctr_snap</tt> field is used to detect
10991099
cases where a given operation has resulted in a quiescent state
1100-
for all flavors of RCU, for example, <tt>cond_resched_rcu_qs()</tt>.
1100+
for all flavors of RCU, for example, <tt>cond_resched()</tt>
1101+
when RCU has indicated a need for quiescent states.
11011102

11021103
<h5>RCU Callback Handling</h5>
11031104

@@ -1182,24 +1183,40 @@ <h3><a name="The rcu_dynticks Structure">
11821183
Its fields are as follows:
11831184

11841185
<pre>
1185-
1 int dynticks_nesting;
1186-
2 int dynticks_nmi_nesting;
1186+
1 long dynticks_nesting;
1187+
2 long dynticks_nmi_nesting;
11871188
3 atomic_t dynticks;
11881189
4 bool rcu_need_heavy_qs;
11891190
5 unsigned long rcu_qs_ctr;
11901191
6 bool rcu_urgent_qs;
11911192
</pre>
11921193

11931194
<p>The <tt>-&gt;dynticks_nesting</tt> field counts the
1194-
nesting depth of normal interrupts.
1195-
In addition, this counter is incremented when exiting dyntick-idle
1196-
mode and decremented when entering it.
1195+
nesting depth of process execution, so that in normal circumstances
1196+
this counter has value zero or one.
1197+
NMIs, irqs, and tracers are counted by the <tt>-&gt;dynticks_nmi_nesting</tt>
1198+
field.
1199+
Because NMIs cannot be masked, changes to this variable have to be
1200+
undertaken carefully using an algorithm provided by Andy Lutomirski.
1201+
The initial transition from idle adds one, and nested transitions
1202+
add two, so that a nesting level of five is represented by a
1203+
<tt>-&gt;dynticks_nmi_nesting</tt> value of nine.
11971204
This counter can therefore be thought of as counting the number
11981205
of reasons why this CPU cannot be permitted to enter dyntick-idle
1199-
mode, aside from non-maskable interrupts (NMIs).
1200-
NMIs are counted by the <tt>-&gt;dynticks_nmi_nesting</tt>
1201-
field, except that NMIs that interrupt non-dyntick-idle execution
1202-
are not counted.
1206+
mode, aside from process-level transitions.
1207+
1208+
<p>However, it turns out that when running in non-idle kernel context,
1209+
the Linux kernel is fully capable of entering interrupt handlers that
1210+
never exit and perhaps also vice versa.
1211+
Therefore, whenever the <tt>-&gt;dynticks_nesting</tt> field is
1212+
incremented up from zero, the <tt>-&gt;dynticks_nmi_nesting</tt> field
1213+
is set to a large positive number, and whenever the
1214+
<tt>-&gt;dynticks_nesting</tt> field is decremented down to zero,
1215+
the <tt>-&gt;dynticks_nmi_nesting</tt> field is set to zero.
1216+
Assuming that the number of misnested interrupts is not sufficient
1217+
to overflow the counter, this approach corrects the
1218+
<tt>-&gt;dynticks_nmi_nesting</tt> field every time the corresponding
1219+
CPU enters the idle loop from process context.
12031220

12041221
</p><p>The <tt>-&gt;dynticks</tt> field counts the corresponding
12051222
CPU's transitions to and from dyntick-idle mode, so that this counter
@@ -1231,14 +1248,16 @@ <h3><a name="The rcu_dynticks Structure">
12311248
<tr><th>&nbsp;</th></tr>
12321249
<tr><th align="left">Quick Quiz:</th></tr>
12331250
<tr><td>
1234-
Why not just count all NMIs?
1235-
Wouldn't that be simpler and less error prone?
1251+
Why not simply combine the <tt>-&gt;dynticks_nesting</tt>
1252+
and <tt>-&gt;dynticks_nmi_nesting</tt> counters into a
1253+
single counter that just counts the number of reasons that
1254+
the corresponding CPU is non-idle?
12361255
</td></tr>
12371256
<tr><th align="left">Answer:</th></tr>
12381257
<tr><td bgcolor="#ffffff"><font color="ffffff">
1239-
It seems simpler only until you think hard about how to go about
1240-
updating the <tt>rcu_dynticks</tt> structure's
1241-
<tt>-&gt;dynticks</tt> field.
1258+
Because this would fail in the presence of interrupts whose
1259+
handlers never return and of handlers that manage to return
1260+
from a made-up interrupt.
12421261
</font></td></tr>
12431262
<tr><td>&nbsp;</td></tr>
12441263
</table>
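
The counting rules described in the Data-Structures.html hunk above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel's code: the struct, the `model_*` helpers, and the `MODEL_NMI_NONIDLE` constant are hypothetical names, and the model decides "first entry from idle" by testing the counter for zero, which is a simplification.

```c
#include <limits.h>

/* Hypothetical user-space model; not the kernel's struct rcu_dynticks. */
struct dynticks_model {
	long dynticks_nesting;     /* process-level nesting, normally 0 or 1 */
	long dynticks_nmi_nesting; /* NMI/irq/tracer nesting, encoded as 1, 3, 5, ... */
};

/* Assumed stand-in for the "large positive number" mentioned in the text. */
#define MODEL_NMI_NONIDLE (LONG_MAX / 2 + 1)

/* The initial transition from idle adds one; nested transitions add two. */
static void model_nmi_enter(struct dynticks_model *d)
{
	d->dynticks_nmi_nesting += (d->dynticks_nmi_nesting == 0) ? 1 : 2;
}

/* The outermost exit returns the counter to zero; nested exits subtract two. */
static void model_nmi_exit(struct dynticks_model *d)
{
	if (d->dynticks_nmi_nesting == 1)
		d->dynticks_nmi_nesting = 0;
	else
		d->dynticks_nmi_nesting -= 2;
}

/* Leaving idle for process context forces the NMI counter to a large value. */
static void model_process_enter(struct dynticks_model *d)
{
	if (d->dynticks_nesting++ == 0)
		d->dynticks_nmi_nesting = MODEL_NMI_NONIDLE;
}

/* Entering idle from process context resets the NMI counter to zero,
 * discarding any error left behind by misnested handlers. */
static void model_process_exit(struct dynticks_model *d)
{
	if (--d->dynticks_nesting == 0)
		d->dynticks_nmi_nesting = 0;
}
```

In this model, five nested entries from idle leave the counter at nine, matching the example in the text, and a handler that never exits is corrected the next time the model enters idle from process context.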

Documentation/RCU/Design/Requirements/Requirements.html

Lines changed: 4 additions & 3 deletions
@@ -581,7 +581,8 @@ <h3><a name="Publish-Subscribe Guarantee">Publish/Subscribe Guarantee</a></h3>
581581
DYNIX/ptx used an explicit memory barrier for publication, but had nothing
582582
resembling <tt>rcu_dereference()</tt> for subscription, nor did it
583583
have anything resembling the <tt>smp_read_barrier_depends()</tt>
584-
that was later subsumed into <tt>rcu_dereference()</tt>.
584+
that was later subsumed into <tt>rcu_dereference()</tt> and later
585+
still into <tt>READ_ONCE()</tt>.
585586
The need for these operations made itself known quite suddenly at a
586587
late-1990s meeting with the DEC Alpha architects, back in the days when
587588
DEC was still a free-standing company.
@@ -2797,7 +2798,7 @@ <h3><a name="Performance, Scalability, Response Time, and Reliability">
27972798
executing in usermode (which is one use case for
27982799
<tt>CONFIG_NO_HZ_FULL=y</tt>) or in the kernel.
27992800
That said, CPU-bound loops in the kernel must execute
2800-
<tt>cond_resched_rcu_qs()</tt> at least once per few tens of milliseconds
2801+
<tt>cond_resched()</tt> at least once per few tens of milliseconds
28012802
in order to avoid receiving an IPI from RCU.
28022803

28032804
<p>
@@ -3128,7 +3129,7 @@ <h3><a name="Tasks RCU">Tasks RCU</a></h3>
31283129
is to have implicit
31293130
read-side critical sections that are delimited by voluntary context
31303131
switches, that is, calls to <tt>schedule()</tt>,
3131-
<tt>cond_resched_rcu_qs()</tt>, and
3132+
<tt>cond_resched()</tt>, and
31323133
<tt>synchronize_rcu_tasks()</tt>.
31333134
In addition, transitions to and from userspace execution also delimit
31343135
tasks-RCU read-side critical sections.

Documentation/RCU/rcu_dereference.txt

Lines changed: 1 addition & 5 deletions
@@ -122,11 +122,7 @@ o Be very careful about comparing pointers obtained from
122122
Note that if checks for being within an RCU read-side
123123
critical section are not required and the pointer is never
124124
dereferenced, rcu_access_pointer() should be used in place
125-
of rcu_dereference(). The rcu_access_pointer() primitive
126-
does not require an enclosing read-side critical section,
127-
and also omits the smp_read_barrier_depends() included in
128-
rcu_dereference(), which in turn should provide a small
129-
performance gain in some CPUs (e.g., the DEC Alpha).
125+
of rcu_dereference().
130126

131127
o The comparison is against a pointer that references memory
132128
that was initialized "a long time ago." The reason

Documentation/RCU/stallwarn.txt

Lines changed: 4 additions & 6 deletions
@@ -23,12 +23,10 @@ o A CPU looping with preemption disabled. This condition can
2323
o A CPU looping with bottom halves disabled. This condition can
2424
result in RCU-sched and RCU-bh stalls.
2525

26-
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the
27-
kernel without invoking schedule(). Note that cond_resched()
28-
does not necessarily prevent RCU CPU stall warnings. Therefore,
29-
if the looping in the kernel is really expected and desirable
30-
behavior, you might need to replace some of the cond_resched()
31-
calls with calls to cond_resched_rcu_qs().
26+
o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
27+
without invoking schedule(). If the looping in the kernel is
28+
really expected and desirable behavior, you might need to add
29+
some calls to cond_resched().
3230

3331
o Booting Linux using a console connection that is too slow to
3432
keep up with the boot-time console-message rate. For example,
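
The stallwarn.txt change above suggests adding calls to cond_resched() to long-running kernel loops. The shape of such a loop can be sketched in user space as follows. Everything here is an assumption for illustration: `model_cond_resched()` is a hypothetical stub that only counts calls, and the batch size is arbitrary rather than a value taken from the kernel.

```c
#include <stddef.h>

/* Arbitrary batch size chosen for this sketch only. */
#define MODEL_BATCH 1024

/* Hypothetical stub standing in for the kernel's cond_resched(). */
static unsigned long model_resched_calls;

static void model_cond_resched(void)
{
	model_resched_calls++;
}

/* Sum a large array, offering to reschedule once per batch of items. */
static long process_items(const int *items, size_t n)
{
	long sum = 0;

	for (size_t i = 0; i < n; i++) {
		sum += items[i];
		if ((i + 1) % MODEL_BATCH == 0)
			model_cond_resched();
	}
	return sum;
}
```

In real kernel code the call would be to cond_resched() itself, placed so that it runs at least once every few tens of milliseconds, as the Requirements.html hunk in this commit states.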

Documentation/RCU/whatisRCU.txt

Lines changed: 1 addition & 2 deletions
@@ -600,8 +600,7 @@ don't forget about them when submitting patches making use of RCU!]
600600

601601
#define rcu_dereference(p) \
602602
({ \
603-
typeof(p) _________p1 = p; \
604-
smp_read_barrier_depends(); \
603+
typeof(p) _________p1 = READ_ONCE(p); \
605604
(_________p1); \
606605
})
607606

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 0 additions & 9 deletions
@@ -2052,9 +2052,6 @@
20522052
This tests the locking primitive's ability to
20532053
transition abruptly to and from idle.
20542054

2055-
locktorture.torture_runnable= [BOOT]
2056-
Start locktorture running at boot time.
2057-
20582055
locktorture.torture_type= [KNL]
20592056
Specify the locking implementation to test.
20602057

@@ -3488,9 +3485,6 @@
34883485
the same as for rcuperf.nreaders.
34893486
N, where N is the number of CPUs
34903487

3491-
rcuperf.perf_runnable= [BOOT]
3492-
Start rcuperf running at boot time.
3493-
34943488
rcuperf.perf_type= [KNL]
34953489
Specify the RCU implementation to test.
34963490

@@ -3624,9 +3618,6 @@
36243618
Test RCU's dyntick-idle handling. See also the
36253619
rcutorture.shuffle_interval parameter.
36263620

3627-
rcutorture.torture_runnable= [BOOT]
3628-
Start rcutorture running at boot time.
3629-
36303621
rcutorture.torture_type= [KNL]
36313622
Specify the RCU implementation to test.
36323623

Documentation/circular-buffers.txt

Lines changed: 1 addition & 2 deletions
@@ -220,8 +220,7 @@ before it writes the new tail pointer, which will erase the item.
220220

221221
Note the use of READ_ONCE() and smp_load_acquire() to read the
222222
opposition index. This prevents the compiler from discarding and
223-
reloading its cached value - which some compilers will do across
224-
smp_read_barrier_depends(). This isn't strictly needed if you can
223+
reloading its cached value. This isn't strictly needed if you can
225224
be sure that the opposition index will _only_ be used the once.
226225
The smp_load_acquire() additionally forces the CPU to order against
227226
subsequent memory references. Similarly, smp_store_release() is used

Documentation/locking/locktorture.txt

Lines changed: 0 additions & 5 deletions
@@ -57,11 +57,6 @@ torture_type Type of lock to torture. By default, only spinlocks will
5757

5858
o "rwsem_lock": read/write down() and up() semaphore pairs.
5959

60-
torture_runnable Start locktorture at boot time in the case where the
61-
module is built into the kernel, otherwise wait for
62-
torture_runnable to be set via sysfs before starting.
63-
By default it will begin once the module is loaded.
64-
6560

6661
** Torture-framework (RCU + locking) **
6762

Documentation/memory-barriers.txt

Lines changed: 14 additions & 8 deletions
@@ -227,17 +227,20 @@ There are some minimal guarantees that may be expected of a CPU:
227227
(*) On any given CPU, dependent memory accesses will be issued in order, with
228228
respect to itself. This means that for:
229229

230-
Q = READ_ONCE(P); smp_read_barrier_depends(); D = READ_ONCE(*Q);
230+
Q = READ_ONCE(P); D = READ_ONCE(*Q);
231231

232232
the CPU will issue the following memory operations:
233233

234234
Q = LOAD P, D = LOAD *Q
235235

236-
and always in that order. On most systems, smp_read_barrier_depends()
237-
does nothing, but it is required for DEC Alpha. The READ_ONCE()
238-
is required to prevent compiler mischief. Please note that you
239-
should normally use something like rcu_dereference() instead of
240-
open-coding smp_read_barrier_depends().
236+
and always in that order. However, on DEC Alpha, READ_ONCE() also
237+
emits a memory-barrier instruction, so that a DEC Alpha CPU will
238+
instead issue the following memory operations:
239+
240+
Q = LOAD P, MEMORY_BARRIER, D = LOAD *Q, MEMORY_BARRIER
241+
242+
Whether on DEC Alpha or not, the READ_ONCE() also prevents compiler
243+
mischief.
241244

242245
(*) Overlapping loads and stores within a particular CPU will appear to be
243246
ordered within that CPU. This means that for:
@@ -1815,7 +1818,7 @@ The Linux kernel has eight basic CPU memory barriers:
18151818
GENERAL mb() smp_mb()
18161819
WRITE wmb() smp_wmb()
18171820
READ rmb() smp_rmb()
1818-
DATA DEPENDENCY read_barrier_depends() smp_read_barrier_depends()
1821+
DATA DEPENDENCY READ_ONCE()
18191822

18201823

18211824
All memory barriers except the data dependency barriers imply a compiler
@@ -2864,7 +2867,10 @@ access depends on a read, not all do, so it may not be relied on.
28642867

28652868
Other CPUs may also have split caches, but must coordinate between the various
28662869
cachelets for normal memory accesses. The semantics of the Alpha removes the
2867-
need for coordination in the absence of memory barriers.
2870+
need for hardware coordination in the absence of memory barriers, which
2871+
permitted Alpha to sport higher CPU clock rates back in the day. However,
2872+
please note that smp_read_barrier_depends() should not be used except in
2873+
Alpha arch-specific code and within the READ_ONCE() macro.
28682874

28692875

28702876
CACHE COHERENCY VS DMA
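
The dependent-load pattern from the first memory-barriers.txt hunk above, `Q = READ_ONCE(P); D = READ_ONCE(*Q);`, has a rough user-space analogue in C11 atomics. Note that this sketch swaps in a different technique: it uses a release store and an acquire load, which is stronger than the address-dependency ordering the kernel text describes, and the `item`, `publish()`, and `subscribe()` names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

struct item {
	int payload;
};

/* Shared pointer, playing the role of P in the text; starts out NULL. */
static _Atomic(struct item *) P;

/* Initialize the item first, then make it visible to readers. */
static void publish(struct item *it, int value)
{
	it->payload = value;
	atomic_store_explicit(&P, it, memory_order_release);
}

/* Load the pointer once, then read through the loaded copy. */
static int subscribe(int fallback)
{
	struct item *q = atomic_load_explicit(&P, memory_order_acquire);

	return q ? q->payload : fallback;
}
```

The acquire load guarantees that a reader who sees the new pointer also sees the initialized payload, which is the outcome the kernel obtains from READ_ONCE() plus dependency ordering (with an added barrier instruction on DEC Alpha).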

MAINTAINERS

Lines changed: 13 additions & 9 deletions
@@ -8196,6 +8196,7 @@ F: arch/*/include/asm/rwsem.h
81968196
F: include/linux/seqlock.h
81978197
F: lib/locking*.[ch]
81988198
F: kernel/locking/
8199+
X: kernel/locking/locktorture.c
81998200

82008201
LOGICAL DISK MANAGER SUPPORT (LDM, Windows 2000/XP/Vista Dynamic Disks)
82018202
M: "Richard Russon (FlatCap)" <[email protected]>
@@ -11480,15 +11481,6 @@ L: [email protected]
1148011481
S: Orphan
1148111482
F: drivers/net/wireless/ray*
1148211483

11483-
RCUTORTURE MODULE
11484-
M: Josh Triplett <[email protected]>
11485-
M: "Paul E. McKenney" <[email protected]>
11486-
11487-
S: Supported
11488-
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
11489-
F: Documentation/RCU/torture.txt
11490-
F: kernel/rcu/rcutorture.c
11491-
1149211484
RCUTORTURE TEST FRAMEWORK
1149311485
M: "Paul E. McKenney" <[email protected]>
1149411486
M: Josh Triplett <[email protected]>
@@ -13803,6 +13795,18 @@ L: [email protected]
1380313795
S: Maintained
1380413796
F: drivers/platform/x86/topstar-laptop.c
1380513797

13798+
TORTURE-TEST MODULES
13799+
M: Davidlohr Bueso <[email protected]>
13800+
M: "Paul E. McKenney" <[email protected]>
13801+
M: Josh Triplett <[email protected]>
13802+
13803+
S: Supported
13804+
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
13805+
F: Documentation/RCU/torture.txt
13806+
F: kernel/torture.c
13807+
F: kernel/rcu/rcutorture.c
13808+
F: kernel/locking/locktorture.c
13809+
1380613810
TOSHIBA ACPI EXTRAS DRIVER
1380713811
M: Azael Avalos <[email protected]>
1380813812

arch/mn10300/kernel/mn10300-serial.c

Lines changed: 5 additions & 2 deletions
@@ -550,7 +550,7 @@ static void mn10300_serial_receive_interrupt(struct mn10300_serial_port *port)
550550
return;
551551
}
552552

553-
smp_read_barrier_depends();
553+
/* READ_ONCE() enforces dependency, but dangerous through integer!!! */
554554
ch = port->rx_buffer[ix++];
555555
st = port->rx_buffer[ix++];
556556
smp_mb();
@@ -1728,7 +1728,10 @@ static int mn10300_serial_poll_get_char(struct uart_port *_port)
17281728
if (CIRC_CNT(port->rx_inp, ix, MNSC_BUFFER_SIZE) == 0)
17291729
return NO_POLL_CHAR;
17301730

1731-
smp_read_barrier_depends();
1731+
/*
1732+
* READ_ONCE() enforces dependency, but dangerous
1733+
* through integer!!!
1734+
*/
17321735
ch = port->rx_buffer[ix++];
17331736
st = port->rx_buffer[ix++];
17341737
smp_mb();

drivers/dma/ioat/dma.c

Lines changed: 0 additions & 2 deletions
@@ -597,7 +597,6 @@ static void __cleanup(struct ioatdma_chan *ioat_chan, dma_addr_t phys_complete)
597597
for (i = 0; i < active && !seen_current; i++) {
598598
struct dma_async_tx_descriptor *tx;
599599

600-
smp_read_barrier_depends();
601600
prefetch(ioat_get_ring_ent(ioat_chan, idx + i + 1));
602601
desc = ioat_get_ring_ent(ioat_chan, idx + i);
603602
dump_desc_dbg(ioat_chan, desc);
@@ -715,7 +714,6 @@ static void ioat_abort_descs(struct ioatdma_chan *ioat_chan)
715714
for (i = 1; i < active; i++) {
716715
struct dma_async_tx_descriptor *tx;
717716

718-
smp_read_barrier_depends();
719717
prefetch(ioat_get_ring_ent(ioat_chan, idx + i + 1));
720718
desc = ioat_get_ring_ent(ioat_chan, idx + i);
721719

drivers/infiniband/Kconfig

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ menuconfig INFINIBAND
44
depends on NET
55
depends on INET
66
depends on m || IPV6 != m
7+
depends on !ALPHA
78
select IRQ_POLL
89
---help---
910
Core support for InfiniBand (IB). Make sure to also select

drivers/infiniband/hw/hfi1/rc.c

Lines changed: 0 additions & 4 deletions
@@ -302,7 +302,6 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
302302
if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
303303
goto bail;
304304
/* We are in the error state, flush the work request. */
305-
smp_read_barrier_depends(); /* see post_one_send() */
306305
if (qp->s_last == READ_ONCE(qp->s_head))
307306
goto bail;
308307
/* If DMAs are in progress, we can't flush immediately. */
@@ -346,7 +345,6 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
346345
newreq = 0;
347346
if (qp->s_cur == qp->s_tail) {
348347
/* Check if send work queue is empty. */
349-
smp_read_barrier_depends(); /* see post_one_send() */
350348
if (qp->s_tail == READ_ONCE(qp->s_head)) {
351349
clear_ahg(qp);
352350
goto bail;
@@ -900,7 +898,6 @@ void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd,
900898
}
901899

902900
/* Ensure s_rdma_ack_cnt changes are committed */
903-
smp_read_barrier_depends();
904901
if (qp->s_rdma_ack_cnt) {
905902
hfi1_queue_rc_ack(qp, is_fecn);
906903
return;
@@ -1562,7 +1559,6 @@ static void rc_rcv_resp(struct hfi1_packet *packet)
15621559
trace_hfi1_ack(qp, psn);
15631560

15641561
/* Ignore invalid responses. */
1565-
smp_read_barrier_depends(); /* see post_one_send */
15661562
if (cmp_psn(psn, READ_ONCE(qp->s_next_psn)) >= 0)
15671563
goto ack_done;
15681564

drivers/infiniband/hw/hfi1/ruc.c

Lines changed: 0 additions & 1 deletion
@@ -362,7 +362,6 @@ static void ruc_loopback(struct rvt_qp *sqp)
362362
sqp->s_flags |= RVT_S_BUSY;
363363

364364
again:
365-
smp_read_barrier_depends(); /* see post_one_send() */
366365
if (sqp->s_last == READ_ONCE(sqp->s_head))
367366
goto clr_busy;
368367
wqe = rvt_get_swqe_ptr(sqp, sqp->s_last);

drivers/infiniband/hw/hfi1/sdma.c

Lines changed: 0 additions & 1 deletion
@@ -553,7 +553,6 @@ static void sdma_hw_clean_up_task(unsigned long opaque)
553553

554554
static inline struct sdma_txreq *get_txhead(struct sdma_engine *sde)
555555
{
556-
smp_read_barrier_depends(); /* see sdma_update_tail() */
557556
return sde->tx_ring[sde->tx_head & sde->sdma_mask];
558557
}
559558
