
Commit 8112152

Boqun Feng (fbq) authored and Peter Zijlstra committed
locking/rwbase: Take care of ordering guarantee for fastpath reader
Readers of rwbase can lock and unlock without taking any inner lock. If
that happens, we need the ordering provided by atomic operations to
satisfy the ordering semantics of lock/unlock. Without that, consider the
following case:

	{ X = 0 initially }

	CPU 0				CPU 1
	=====				=====
					rt_write_lock();
					X = 1
					rt_write_unlock():
					  atomic_add(READER_BIAS - WRITER_BIAS, ->readers);
					  // ->readers is READER_BIAS.
	rt_read_lock():
	  if ((r = atomic_read(->readers)) < 0) // True
	    atomic_try_cmpxchg(->readers, r, r + 1); // succeed.
	  <acquire the read lock via fast path>

	r1 = X;	// r1 may be 0, because nothing prevents the reordering
		// of "X = 1" and atomic_add() on CPU 1.

Therefore audit every use of atomic operations that may happen in a fast
path, and add the necessary barriers.

Signed-off-by: Boqun Feng <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
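To make the forbidden outcome concrete, here is a minimal user-space sketch of the same release/acquire pairing using C11 <stdatomic.h> and pthreads. It is an illustration only, not the kernel code; the names readers, X, writer() and reader() are made up for the example. readers == 0 models "write locked", the writer's unlock bumps it with a RELEASE increment, and the reader takes the lock with an ACQUIRE cmpxchg, so a reader that obtains the lock is guaranteed to observe X == 1.

	/* Illustration only: user-space model of the fast-path pairing. */
	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdio.h>

	static atomic_int readers;	/* stands in for rwb->readers      */
	static int X;			/* the data protected by the lock  */

	static void *writer(void *arg)	/* models rt_write_lock()/unlock() */
	{
		(void)arg;
		X = 1;
		/* RELEASE: the store to X cannot be reordered past this. */
		atomic_fetch_add_explicit(&readers, 1, memory_order_release);
		return NULL;
	}

	static void *reader(void *arg)	/* models the rt_read_lock() fast path */
	{
		int r;

		(void)arg;
		/* Spin until the writer's unlock is visible, then take the
		 * read lock with an ACQUIRE cmpxchg; this pairs with the
		 * RELEASE increment in writer(). */
		do {
			r = atomic_load_explicit(&readers, memory_order_relaxed);
		} while (r < 1 ||
			 !atomic_compare_exchange_weak_explicit(&readers, &r, r + 1,
								memory_order_acquire,
								memory_order_relaxed));

		printf("r1 = %d\n", X);	/* always 1: r1 == 0 is forbidden */
		return NULL;
	}

	int main(void)
	{
		pthread_t w, rd;

		pthread_create(&w, NULL, writer, NULL);
		pthread_create(&rd, NULL, reader, NULL);
		pthread_join(w, NULL);
		pthread_join(rd, NULL);
		return 0;
	}

Built with -pthread, the reader can only print r1 = 1. With relaxed ordering on both sides instead, nothing would forbid the reader from observing the counter update without the store to X, which is exactly the reordering described in the commit message.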
1 parent 616be87 commit 8112152

File tree: 1 file changed (+19, -2 lines)

kernel/locking/rwbase_rt.c

Lines changed: 19 additions & 2 deletions
@@ -41,6 +41,12 @@
  * The risk of writer starvation is there, but the pathological use cases
  * which trigger it are not necessarily the typical RT workloads.
  *
+ * Fast-path orderings:
+ * The lock/unlock of readers can run in fast paths: lock and unlock are only
+ * atomic ops, and there is no inner lock to provide ACQUIRE and RELEASE
+ * semantics of rwbase_rt. Atomic ops should thus provide _acquire()
+ * and _release() (or stronger).
+ *
  * Common code shared between RT rw_semaphore and rwlock
  */
 
@@ -53,6 +59,7 @@ static __always_inline int rwbase_read_trylock(struct rwbase_rt *rwb)
 	 * set.
 	 */
 	for (r = atomic_read(&rwb->readers); r < 0;) {
+		/* Fully-ordered if cmpxchg() succeeds, provides ACQUIRE */
 		if (likely(atomic_try_cmpxchg(&rwb->readers, &r, r + 1)))
 			return 1;
 	}
@@ -162,6 +169,8 @@ static __always_inline void rwbase_read_unlock(struct rwbase_rt *rwb,
 	/*
 	 * rwb->readers can only hit 0 when a writer is waiting for the
 	 * active readers to leave the critical section.
+	 *
+	 * dec_and_test() is fully ordered, provides RELEASE.
 	 */
 	if (unlikely(atomic_dec_and_test(&rwb->readers)))
 		__rwbase_read_unlock(rwb, state);
@@ -172,7 +181,11 @@ static inline void __rwbase_write_unlock(struct rwbase_rt *rwb, int bias,
 {
 	struct rt_mutex_base *rtm = &rwb->rtmutex;
 
-	atomic_add(READER_BIAS - bias, &rwb->readers);
+	/*
+	 * _release() is needed in case that reader is in fast path, pairing
+	 * with atomic_try_cmpxchg() in rwbase_read_trylock(), provides RELEASE
+	 */
+	(void)atomic_add_return_release(READER_BIAS - bias, &rwb->readers);
 	raw_spin_unlock_irqrestore(&rtm->wait_lock, flags);
 	rwbase_rtmutex_unlock(rtm);
 }
@@ -201,7 +214,11 @@ static inline bool __rwbase_write_trylock(struct rwbase_rt *rwb)
 	/* Can do without CAS because we're serialized by wait_lock. */
 	lockdep_assert_held(&rwb->rtmutex.wait_lock);
 
-	if (!atomic_read(&rwb->readers)) {
+	/*
+	 * _acquire is needed in case the reader is in the fast path, pairing
+	 * with rwbase_read_unlock(), provides ACQUIRE.
+	 */
+	if (!atomic_read_acquire(&rwb->readers)) {
 		atomic_set(&rwb->readers, WRITER_BIAS);
 		return 1;
 	}
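Taken together, the hunks establish two pairings: the RELEASE in __rwbase_write_unlock() pairs with the ACQUIRE cmpxchg in rwbase_read_trylock() (the scenario sketched above), and the fully-ordered atomic_dec_and_test() in rwbase_read_unlock() pairs with the new atomic_read_acquire() in __rwbase_write_trylock(). Below is a second user-space sketch of that reader-unlock versus writer-trylock pairing, again C11 <stdatomic.h> rather than kernel code; the names active_readers, data, read_side() and write_side() are made up for the example.

	/* Illustration only: user-space model of the unlock/trylock pairing. */
	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdio.h>

	static atomic_int active_readers = 1;	/* one reader inside its section   */
	static int data;			/* accessed in the read-side section */

	static void *read_side(void *arg)	/* models rwbase_read_unlock() */
	{
		(void)arg;
		data = 42;	/* last access of the read-side critical section */
		/* RELEASE on the way out; the kernel's dec_and_test() is
		 * fully ordered, i.e. at least this strong. */
		atomic_fetch_sub_explicit(&active_readers, 1, memory_order_release);
		return NULL;
	}

	static void *write_side(void *arg)	/* models __rwbase_write_trylock() */
	{
		(void)arg;
		/* ACQUIRE: once 0 is observed, the reader's critical section
		 * is guaranteed to be complete. */
		while (atomic_load_explicit(&active_readers, memory_order_acquire))
			;
		printf("data = %d\n", data);	/* always 42, never 0 */
		return NULL;
	}

	int main(void)
	{
		pthread_t r, w;

		pthread_create(&r, NULL, read_side, NULL);
		pthread_create(&w, NULL, write_side, NULL);
		pthread_join(r, NULL);
		pthread_join(w, NULL);
		return 0;
	}

In the sketch the writer may only proceed after the acquire load observes zero active readers, mirroring why a plain atomic_read() in __rwbase_write_trylock() would not be enough when readers unlock via the fast path.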
