Replace hand-rolled SharedMutex with C++17 std::shared_mutex (NFCI) #7120

adrian-prantl · 2023-08-01T00:35:40Z

We are getting reports where LLDB is hanging in try_lock(). This patch makes the code easier to reason about by using standard primitives, and avoids the pitfall of the copy constructor that the old code was still allowing. While this may not fix the hangs, it should certainly make them easier to diagnose.

rdar://112025620

adrian-prantl · 2023-08-01T00:36:34Z

@swift-ci test

JDevlieghere · 2023-08-01T04:05:46Z

LLVM's RWMutex uses shared_mutex if the deployment target allows it. Either this is a NOOP or it has the potential to break bots with a lower deployment target.

adrian-prantl · 2023-08-01T15:55:36Z

> LLVM's RWMutex uses shared_mutex if the deployment target allows it. Either this is a NOOP or it has the potential to break bots with a lower deployment target.

According to the comment in RWMutex.h std::shared_timed_mutex is only available on macOS 10.12 and later. Green dragon is running on 10.15.5.

Note that llvm::RWMutex does not have a try_lock method. It is my hand-rolled implementation of try_lock that I would like to remove with this patch.

JDevlieghere · 2023-08-01T16:41:41Z

> According to the comment in RWMutex.h std::shared_timed_mutex is only available on macOS 10.12 and later. Green dragon is running on 10.15.5.

Guess who wrote that comment. 🙃 Also, I assume you mean Swift CI? If both GreenDragon and Swift CI are using 10.15 as the minimum deployment target I'll remove that workaround upstream.

> Note that llvm::RWMutex does not have a try_lock method. It is my hand-rolled implementation of try_lock that I would like to remove with this patch.

👍

adrian-prantl · 2023-08-01T17:03:30Z

ci.swift.org claims to use Host OS: macOS 12.6.

adrian-prantl · 2023-08-01T17:10:44Z

I just realized that the "host os" != "deployment target". Green dragon uses x86_64-apple-darwin20.6.0 (macOS 11) and ci.swift.org seems to build for x86_64-apple-macosx10.13.
So we should be good here.

JDevlieghere · 2023-08-01T17:11:15Z

Actually, my memory was correct, but it seems someone hoisted shared_mutex out of the availability check. Seems like Google is still running a macOS bot that targets 10.12 (discussed in https://reviews.llvm.org/D130689) so let's keep the workaround in place for shared_timed_mutex upstream. That doesn't affect this patch.

kastiglione · 2023-08-01T17:27:06Z

lldb/source/Target/Target.cpp

-              [this] { GetSwiftScratchContextLock().unlock(); });
+        auto &lock = GetSwiftScratchContextLock();
+        if (lock.try_lock()) {
+          auto unlock = llvm::make_scope_exit([&lock] { lock.unlock(); });


Can this use std::lock_guard + std::adopt_lock?

Suggested change

-          auto unlock = llvm::make_scope_exit([&lock] { lock.unlock(); });
+          std::lock_guard<std::shared_mutex> unlock(lock, std::adopt_lock);
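
For readers unfamiliar with the idiom in the suggestion, here is a minimal standalone sketch of it. It is not the LLDB code: `Cache`, `generation`, and `try_bump_generation` are hypothetical names invented for illustration. The point is that a successful `try_lock()` can be handed to a `std::lock_guard` with `std::adopt_lock`, so the unlock happens on every exit path without a hand-written scope-exit lambda.

```cpp
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for the state guarded by the scratch context lock.
struct Cache {
  std::shared_mutex mutex;
  int generation = 0;
};

// Tries to take the exclusive lock without blocking. Returns false if the
// lock could not be taken. Note that try_lock() is permitted to fail
// spuriously, so callers have to tolerate a false result.
bool try_bump_generation(Cache &cache) {
  if (!cache.mutex.try_lock())
    return false;
  // The mutex is already locked by this thread; adopt_lock tells the guard
  // to take ownership without locking again and to unlock on destruction.
  std::lock_guard<std::shared_mutex> guard(cache.mutex, std::adopt_lock);
  ++cache.generation;
  return true;
}
```

Because the guard releases the mutex when it goes out of scope, a second call on the same thread can take the lock again.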

kastiglione · 2023-08-01T17:28:29Z

lldb/include/lldb/Core/SwiftScratchContextReader.h

  TypeSystemSwiftTypeRefForExpressions *operator->() { return get(); }
  TypeSystemSwiftTypeRefForExpressions &operator*() { return *get(); }
 };

 /// An RAII object that just acquires the reader lock.
-struct SwiftScratchContextLock : ScopedSharedMutexReader {
+struct SwiftScratchContextLock {
+  std::shared_lock<std::shared_mutex> lock;


maybe m_lock for consistency?

IIUC the lldb coding style uses m_ only in classes, not in structs.

The reader lock must be acquired before retrieving a scratch context, otherwise another thread could still tear it down between the time it was created and the creation of the reader object. There are other shortcomings in GetSwiftScratchContext() that are documented, but not addressed, in this patch to make it easier to review.
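
The ordering described above can be sketched in isolation as follows. This is an illustration under assumptions, not the patch itself: `Context`, `Owner`, `ContextReader`, and `get_reader` are hypothetical names, and the real reader class wraps a Swift type system rather than a plain struct. What the sketch shows is that the shared lock is taken first and the pointer is read only while that lock is held, so there is no window in which the pointer exists without protection.

```cpp
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <utility>

// Hypothetical stand-in for the scratch context.
struct Context {
  int id;
};

// Hypothetical stand-in for the object that owns the context and its lock.
struct Owner {
  std::shared_mutex mutex;
  std::unique_ptr<Context> context;
};

// RAII reader: keeps the shared lock for as long as the pointer is in use.
class ContextReader {
  std::shared_lock<std::shared_mutex> m_lock;
  Context *m_ptr;

public:
  ContextReader(std::shared_lock<std::shared_mutex> &&lock, Context *ptr)
      : m_lock(std::move(lock)), m_ptr(ptr) {}
  explicit operator bool() const { return m_ptr != nullptr; }
  Context *operator->() { return m_ptr; }
  bool owns_lock() const { return m_lock.owns_lock(); }
};

ContextReader get_reader(Owner &owner) {
  // Acquire the reader lock first, then retrieve the pointer under it.
  std::shared_lock<std::shared_mutex> lock(owner.mutex);
  return ContextReader(std::move(lock), owner.context.get());
}
```

A caller keeps the returned reader alive for exactly as long as it dereferences the context. Once the reader is destroyed, the exclusive lock becomes available again.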

adrian-prantl · 2023-08-01T18:57:44Z

@kastiglione @JDevlieghere I added another commit on top that actually fixes a concurrency issue.

adrian-prantl · 2023-08-01T18:57:49Z

@swift-ci test

augusto2112 · 2023-08-01T23:34:27Z

lldb/source/Target/Target.cpp

    auto type_system_or_err =
        GetScratchTypeSystemForLanguage(eLanguageTypeSwift, false);
    if (!type_system_or_err) {
      llvm::consumeError(type_system_or_err.takeError());
-      return nullptr;
+      return;


Nit: we could log this error instead of silently consuming it.

augusto2112 · 2023-08-02T00:00:51Z

lldb/source/Target/Target.cpp

-      log->Printf("returned project-wide scratch context\n");
+  llvm::Optional<SwiftScratchContextReader> reader;
+  if (lldb_module && m_use_scratch_typesystem_per_module) {
+    maybe_create_fallback_context();


Could you explain one more time why this can't potentially delete the scratch context while it's being used?

In my mind the following events could happen:

1. Thread 1 successfully calls GetSwiftScratchContext and creates a new scratch type system.
2. Thread 2 calls GetSwiftScratchContext.
3. Thread 2 calls maybe_create_fallback_context (this call is not guarded by any locks as far as I can tell).
4. maybe_create_fallback_context deletes the scratch type system.
5. Thread 1 is now using a dangling reference to the scratch type system.

You are right.

I addressed this by only deleting when we can acquire the write lock.
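
A rough standalone sketch of that rule, with hypothetical names (`Owner`, `try_erase`, `erase_while_reading`) that do not appear in the patch: deletion is attempted only through a non-blocking exclusive lock, so it is skipped whenever a reader still holds the shared lock. The sketch uses `std::unique_lock` with `std::try_to_lock`, which behaves like the `try_lock()` plus `std::adopt_lock` combination used in the patch. The second thread is needed because calling `try_lock()` on a `std::shared_mutex` from a thread that already owns it is undefined behavior.

```cpp
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical stand-in for the scratch context.
struct Context {
  int id;
};

// Hypothetical stand-in for the object that owns the context and its lock.
struct Owner {
  std::shared_mutex mutex;
  std::unique_ptr<Context> context;
};

// Deletes the context only if the exclusive lock can be taken right now,
// that is, only if no reader or writer currently holds the mutex.
bool try_erase(Owner &owner) {
  std::unique_lock<std::shared_mutex> lock(owner.mutex, std::try_to_lock);
  if (!lock.owns_lock())
    return false;
  owner.context.reset();
  return true;
}

// Holds a reader lock on the calling thread and attempts the deletion from
// a second thread. Returns whether the deletion went through.
bool erase_while_reading(Owner &owner) {
  std::shared_lock<std::shared_mutex> reader(owner.mutex);
  bool erased = true;
  std::thread deleter([&] { erased = try_erase(owner); });
  deleter.join();
  return erased;
}
```

In terms of the sequence above, step 4 no longer happens while thread 1 holds its reader: the deletion is skipped and can be retried on a later call.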

JDevlieghere · 2023-08-02T00:06:17Z

lldb/source/Target/Target.cpp

+        if (log)
+          log->Printf("erased module-wide scratch context with errors\n");


JDevlieghere · 2023-08-02T00:09:47Z

lldb/source/Target/Target.cpp

+  llvm::Optional<SwiftScratchContextReader> reader;
+  if (lldb_module && m_use_scratch_typesystem_per_module) {
+    maybe_create_fallback_context();
+    std::shared_lock<std::shared_mutex> lock(GetSwiftScratchContextLock());


Would it be possible to create a reader object from the get_or_create object and make the whole operation atomic?

Unfortunately, no. AFAIK, there is no way to "downgrade" a unique lock into a shared lock, so we need to release it and reacquire the lock in shared mode. That means we also need to do the cache lookup again, because another thread could have swapped it out in the meantime by calling the same function.
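
The two-phase pattern described in this reply might look roughly like the sketch below. All names (`Cache`, `Reader`, `get_or_create`) are hypothetical and the cache is reduced to a `std::map`. Only the locking structure is meant to mirror the discussion: create under the exclusive lock, release it, reacquire in shared mode, and look the entry up a second time.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a per-module scratch context.
struct Context {
  std::string name;
};

// Hypothetical cache guarded by a single shared mutex.
struct Cache {
  std::shared_mutex mutex;
  std::map<std::string, std::unique_ptr<Context>> entries;
};

// What the caller gets back: the held reader lock plus the pointer it guards.
struct Reader {
  std::shared_lock<std::shared_mutex> lock;
  Context *context;
};

Reader get_or_create(Cache &cache, const std::string &key) {
  {
    // Phase 1: create the entry under the exclusive lock if it is missing.
    std::unique_lock<std::shared_mutex> writer(cache.mutex);
    std::unique_ptr<Context> &slot = cache.entries[key];
    if (!slot)
      slot = std::make_unique<Context>(Context{key});
  } // The exclusive lock is released here; it cannot be turned into a
    // shared lock, so there is a gap before the next acquisition.

  // Phase 2: reacquire in shared mode and repeat the lookup, because
  // another thread may have replaced or erased the entry during the gap.
  std::shared_lock<std::shared_mutex> reader(cache.mutex);
  auto it = cache.entries.find(key);
  Context *ctx = it == cache.entries.end() ? nullptr : it->second.get();
  return Reader{std::move(reader), ctx};
}
```

The caller has to handle a null `context`, since nothing guarantees the entry survived the gap between the two phases. A thread must also drop its `Reader` before calling `get_or_create` again, otherwise it would wait for the exclusive lock while still holding the shared one.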

JDevlieghere · 2023-08-02T00:11:07Z

lldb/source/Target/Target.cpp

@@ -2431,7 +2431,7 @@ Target::GetScratchTypeSystemForLanguage(lldb::LanguageType language,
        // replacing it could cause a use-after-free later on.
        auto &lock = GetSwiftScratchContextLock();
        if (lock.try_lock()) {
-          auto unlock = llvm::make_scope_exit([&lock] { lock.unlock(); });
+          std::lock_guard<std::shared_mutex> unlock(lock, std::adopt_lock);


augusto2112 · 2023-08-02T00:14:31Z

lldb/include/lldb/Core/SwiftScratchContextReader.h

-class SwiftScratchContextReader : ScopedSharedMutexReader {
-  TypeSystemSwiftTypeRefForExpressions *m_ptr;
+class SwiftScratchContextReader {
+  std::shared_lock<std::shared_mutex> m_lock;


All the locks of the shared mutex are locked by std::shared_lock (none of them use std::unique_lock). Doesn't this mean that all of them can use the type system concurrently freely?

Correct. The lock is not meant to ensure exclusive access, but to guard against a context being destroyed while it is still in use.

Fair enough, you would have to release the unique_lock and acquire the shared_lock, which has the same drawback as the current approach.

JDevlieghere

LGTM

adrian-prantl · 2023-08-02T15:29:15Z

@swift-ci test windows

adrian-prantl · 2023-08-02T15:38:19Z

@swift-ci test

adrian-prantl · 2023-08-02T15:38:35Z

@swift-ci test windows

adrian-prantl requested review from kastiglione, JDevlieghere, augusto2112 and jimingham August 1, 2023 00:36

JDevlieghere approved these changes Aug 1, 2023

kastiglione reviewed Aug 1, 2023

adrian-prantl added 3 commits August 1, 2023 10:44

Hoist check outside of lambda (NFC)

87ac257

Use std::adopt_lock instead of lambda

6b89a2a

adrian-prantl requested a review from JDevlieghere August 1, 2023 20:55

augusto2112 reviewed Aug 1, 2023

augusto2112 reviewed Aug 2, 2023

JDevlieghere reviewed Aug 2, 2023

augusto2112 reviewed Aug 2, 2023

JDevlieghere approved these changes Aug 2, 2023

Only delete the scratch context if we can acquire the unique lock

e90fe50

adrian-prantl merged commit fbc2eaa into swiftlang:swift/release/5.9 Aug 2, 2023

