Skip to content

[MLIR, Python] Make it easy to run tests with ASan on mac #115524

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Nov 23, 2024

Conversation

kasper0406
Copy link
Contributor

There are two things that make running MLIR tests with ASan on a mac tedious:

  1. The DYLD_INSERT_LIBRARIES environment variable needs to be set to point to libclang_rt.asan_osx_dynamic.dylib
  2. Mac is wrapping Python, which means that the DYLD_INSERT_LIBRARIES environment variable is not being respected in the Python tests. The solution is to find and use a non-wrapped Python binary.

With the above two changes, ASan works out of the box on mac's by setting the -DLLVM_USE_SANITIZER=Address cmake flag.

I have stolen most of the code in this PR from other LLVM projects. It may be a good idea to reconcile it somewhere.

@llvmbot llvmbot added the mlir label Nov 8, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 8, 2024

@llvm/pr-subscribers-mlir

Author: Kasper Nielsen (kasper0406)

Changes

There are two things that make running MLIR tests with ASan on a mac tedious:

  1. The DYLD_INSERT_LIBRARIES environment variable needs to be set to point to libclang_rt.asan_osx_dynamic.dylib
  2. Mac is wrapping Python, which means that the DYLD_INSERT_LIBRARIES environment variable is not being respected in the Python tests. The solution is to find and use a non-wrapped Python binary.

With the above two changes, ASan works out of the box on mac's by setting the -DLLVM_USE_SANITIZER=Address cmake flag.

I have stolen most of the code in this PR from other LLVM projects. It may be a good idea to reconcile it somewhere.


Full diff: https://github.com/llvm/llvm-project/pull/115524.diff

2 Files Affected:

  • (added) mlir/test/get_darwin_real_python.py (+16)
  • (modified) mlir/test/lit.cfg.py (+98-2)
diff --git a/mlir/test/get_darwin_real_python.py b/mlir/test/get_darwin_real_python.py
new file mode 100644
index 00000000000000..63bd08bcff89e1
--- /dev/null
+++ b/mlir/test/get_darwin_real_python.py
@@ -0,0 +1,16 @@
+# On macOS, system python binaries like /usr/bin/python and $(xcrun -f python3)
+# are shims. They do some light validation work and then spawn the "real" python
+# binary. Find the "real" python by asking dyld -- sys.executable reports the
+# wrong thing more often than not. This is also useful when we're running under
+# a Homebrew python3 binary, which also appears to be some kind of shim.
+def getDarwinRealPythonExecutable():
+    import ctypes
+
+    dyld = ctypes.cdll.LoadLibrary("/usr/lib/system/libdyld.dylib")
+    namelen = ctypes.c_ulong(1024)
+    name = ctypes.create_string_buffer(b"\000", namelen.value)
+    dyld._NSGetExecutablePath(ctypes.byref(name), ctypes.byref(namelen))
+    return name.value.decode("utf-8").strip()
+
+
+print(getDarwinRealPythonExecutable())
diff --git a/mlir/test/lit.cfg.py b/mlir/test/lit.cfg.py
index 9b429b424d3575..0daa9f3a151fed 100644
--- a/mlir/test/lit.cfg.py
+++ b/mlir/test/lit.cfg.py
@@ -3,6 +3,7 @@
 import os
 import platform
 import re
+import shutil
 import subprocess
 import tempfile
 
@@ -77,6 +78,85 @@ def add_runtime(name):
     return ToolSubst(f"%{name}", find_runtime(name))
 
 
+# Provide the path to asan runtime lib 'libclang_rt.asan_osx_dynamic.dylib' if
+# available. This is darwin specific since it's currently only needed on darwin.
+# Stolen from llvm/test/lit.cfg.py with a few modifications
+def get_asan_rtlib():
+    if (
+        not "asan" in config.available_features
+        or not "Darwin" in config.host_os
+    ):
+        return ""
+    try:
+        import glob
+    except:
+        print("glob module not found, skipping get_asan_rtlib() lookup")
+        return ""
+    # The libclang_rt.asan_osx_dynamic.dylib path is obtained using the relative
+    # path from the host cc.
+    host_lib_dir = os.path.join(os.path.dirname(config.host_cc), "../lib")
+    asan_dylib_dir_pattern = (
+        host_lib_dir + "/clang/*/lib/darwin/libclang_rt.asan_osx_dynamic.dylib"
+    )
+    found_dylibs = glob.glob(asan_dylib_dir_pattern)
+    found_dylibs = set([os.path.realpath(dylib_file) for dylib_file in found_dylibs])
+    if len(found_dylibs) != 1:
+        return ""
+    return next(iter(found_dylibs))
+
+
+# On macOS, we can't do the DYLD_INSERT_LIBRARIES trick with a shim python
+# binary as the ASan interceptors get loaded too late. Also, when SIP is
+# enabled, we can't inject libraries into system binaries at all, so we need a
+# copy of the "real" python to work with.
+# Stolen from lldb/test/API/lit.cfg.py with a few modifications
+def find_real_python_interpreter():
+    # If we're running in a virtual environment, we have to copy Python into
+    # the virtual environment for it to work.
+    if sys.prefix != sys.base_prefix:
+        copied_python = os.path.join(sys.prefix, "bin", "copied-python")
+    else:
+        copied_python = os.path.join(config.lldb_build_directory, "copied-python")
+
+    # Avoid doing any work if we already copied the binary.
+    if os.path.isfile(copied_python):
+        return copied_python
+
+    # Find the "real" python binary.
+    real_python = (
+        subprocess.check_output(
+            [
+                config.python_executable,
+                os.path.join(
+                    os.path.dirname(os.path.realpath(__file__)),
+                    "get_darwin_real_python.py",
+                ),
+            ]
+        )
+        .decode("utf-8")
+        .strip()
+    )
+
+    shutil.copy(real_python, copied_python)
+
+    # Now make sure the copied Python works. The Python in Xcode has a relative
+    # RPATH and cannot be copied.
+    try:
+        # We don't care about the output, just make sure it runs.
+        subprocess.check_call([copied_python, "-V"])
+    except subprocess.CalledProcessError:
+        # The copied Python didn't work. Assume we're dealing with the Python
+        # interpreter in Xcode. Given that this is not a system binary SIP
+        # won't prevent us form injecting the interceptors, but when running in
+        # a virtual environment, we can't use it directly. Create a symlink
+        # instead.
+        os.remove(copied_python)
+        os.symlink(real_python, copied_python)
+
+    # The copied Python works.
+    return copied_python
+
+
 llvm_config.with_system_environment(["HOME", "INCLUDE", "LIB", "TMP", "TEMP"])
 
 llvm_config.use_default_substitutions()
@@ -91,6 +171,7 @@ def add_runtime(name):
     "LICENSE.txt",
     "lit.cfg.py",
     "lit.site.cfg.py",
+    "get_darwin_real_python.py",
 ]
 
 # Tweak the PATH to include the tools dir.
@@ -174,8 +255,23 @@ def add_runtime(name):
 python_executable = config.python_executable
 # Python configuration with sanitizer requires some magic preloading. This will only work on clang/linux.
 # TODO: detect Darwin/Windows situation (or mark these tests as unsupported on these platforms).
-if "asan" in config.available_features and "Linux" in config.host_os:
-    python_executable = f"LD_PRELOAD=$({config.host_cxx} -print-file-name=libclang_rt.asan-{config.host_arch}.so) {config.python_executable}"
+if "asan" in config.available_features:
+    if "Linux" in config.host_os:
+        python_executable = f"LD_PRELOAD=$({config.host_cxx} -print-file-name=libclang_rt.asan-{config.host_arch}.so) {config.python_executable}"
+    if "Darwin" in config.host_os:
+        real_python_executable = find_real_python_interpreter()
+        if real_python_executable:
+            python_executable = real_python_executable
+            # Ensure Python is not wrapped, for DYLD_INSERT_LIBRARIES to take effect
+            lit_config.note(
+                "Using {} instead of {}".format(python_executable, config.python_executable)
+            )
+
+        asan_rtlib = get_asan_rtlib()
+        lit_config.note("Using ASan rtlib {}".format(asan_rtlib))
+        config.environment["DYLD_INSERT_LIBRARIES"] = asan_rtlib
+
+
 # On Windows the path to python could contains spaces in which case it needs to be provided in quotes.
 # This is the equivalent of how %python is setup in llvm/utils/lit/lit/llvm/config.py.
 elif "Windows" in config.host_os:

Copy link

github-actions bot commented Nov 8, 2024

✅ With the latest revision this PR passed the Python code formatter.

real_python_executable = find_real_python_interpreter()
if real_python_executable:
python_executable = real_python_executable
# Ensure Python is not wrapped, for DYLD_INSERT_LIBRARIES to take effect
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment in the right place? I'm not sure how it applies to the note.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I moved the comment up, and made it a bit more clear.
In addition I improved the code finding the ASan RT lib, so it also works for custom llvm installations (such as through brew), and not just the bundled one with osx Xcode tools.

@kasper0406
Copy link
Contributor Author

@joker-eph thanks for the review!
Notice that I do not have merge permissions on the repo, so if you are still happy with the PR after my recent commits, I'd appreciate if you can click the merge button for me 🙇

@kasper0406 kasper0406 requested a review from joker-eph November 22, 2024 20:31
@joker-eph joker-eph merged commit beff2ba into llvm:main Nov 23, 2024
8 checks passed
@kasper0406 kasper0406 deleted the kn/fix-asan-mac branch November 23, 2024 00:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants