Commit 74c9750

Add path_finder.SUPPORTED_LIBNAMES (#558)
* Revert "Reapply "Revert debug changes under .github/workflows"" This reverts commit 8f69f83.
* Add names of all CTK 12.8.1 x86_64-linux libraries (.so) as `path_finder.SUPPORTED_LIBNAMES` https://chatgpt.com/share/67f98d0b-148c-8008-9951-9995cf5d860c
* Add `SUPPORTED_WINDOWS_DLLS`
* Add copyright notice
* Move SUPPORTED_LIBNAMES, SUPPORTED_WINDOWS_DLLS to _path_finder/supported_libs.py
* Use SUPPORTED_WINDOWS_DLLS in _windows_load_with_dll_basename()
* Change "Set up mini CTK" to use `method: local`, remove `sub-packages` line.
* Use Jimver/[email protected] also under Linux, `method: local`, no `sub-packages`.
* Add more `nvidia-*-cu12` wheels to get as many of the supported shared libraries as possible.
* Revert "Use Jimver/[email protected] also under Linux, `method: local`, no `sub-packages`." This reverts commit d499806. Problem observed:

```
/usr/bin/docker exec 1b42cd4ea3149ac3f2448eae830190ee62289b7304a73f8001e90cead5005102 sh -c "cat /etc/*release | grep ^ID"
Warning: Failed to restore: Cache service responded with 422
/usr/bin/tar --posix -cf cache.tgz --exclude cache.tgz -P -C /__w/cuda-python/cuda-python --files-from manifest.txt -z
Failed to save: Unable to reserve cache with key cuda_installer-linux-5.15.0-135-generic-x64-12.8.0, another job may be creating this cache. More details: This legacy service is shutting down, effective April 15, 2025. Migrate to the new service ASAP. For more information: https://gh.io/gha-cache-sunset
Warning: Error during installation: Error: Unable to locate executable file: sudo. Please verify either the file path exists or the file can be found within a directory specified by the PATH environment variable. Also check the file mode to verify the file is executable.
Error: Error: Unable to locate executable file: sudo. Please verify either the file path exists or the file can be found within a directory specified by the PATH environment variable. Also check the file mode to verify the file is executable.
```

* Change test_path_finder::test_find_and_load() to skip cufile on Windows, and report exceptions as failures, except for cudart
* Add nvidia-cuda-runtime-cu12 to pyproject.toml (for libname cudart)
* test_path_finder.py: before loading cusolver, load nvJitLink, cusparse, cublas (experiment to see if that resolves the only Windows failure)

Test (win-64, Python 3.12, CUDA 12.8.0, Runner default, CTK wheels) / test

```
================================== FAILURES ===================================
________________________ test_find_and_load[cusolver] _________________________

libname = 'cusolver'

    @pytest.mark.parametrize("libname", path_finder.SUPPORTED_LIBNAMES)
    def test_find_and_load(libname):
        if sys.platform == "win32" and libname == "cufile":
            pytest.skip(f'test_find_and_load("{libname}") not supported on this platform')
        print(f'\ntest_find_and_load("{libname}")')
        failures = []
        for algo, func in (
            ("find", path_finder.find_nvidia_dynamic_library),
            ("load", path_finder.load_nvidia_dynamic_library),
        ):
            try:
                out = func(libname)
            except Exception as e:
                out = f"EXCEPTION: {type(e)} {str(e)}"
                failures.append(algo)
            print(out)
        print()
>       assert not failures
E       AssertionError: assert not ['load']

tests\test_path_finder.py:29: AssertionError
```

* test_path_finder.py: load *only* nvJitLink before loading cusolver
* Run each test_find_or_load_nvidia_dynamic_library() subtest in a subprocess
* Add cublasLt to supported_libs.py and load deps for cusolver, cusolverMg, cusparse in test_path_finder.py. Also restrict test_path_finder.py to test load only for now.
* Add supported_libs.DIRECT_DEPENDENCIES
* Remove cufile_rdma from supported libs (comment out). https://chatgpt.com/share/68033a33-385c-8008-a293-4c8cc3ea23ae
* Split out `PARTIALLY_SUPPORTED_LIBNAMES`. Fix up test code.
* Reduce public API to only load_nvidia_dynamic_library, SUPPORTED_LIBNAMES
* Set CUDA_BINDINGS_PATH_FINDER_TEST_ALL_LIBNAMES=1 to match expected availability of nvidia shared libraries.
* Refactor as `class _find_nvidia_dynamic_library`
* Strict wheel, conda, system rule: try using the platform-specific dynamic loader search mechanisms only last
* Introduce _load_and_report_path_linux(), add supported_libs.EXPECTED_LIB_SYMBOLS
* Plug in ctypes.windll.kernel32.GetModuleFileNameW()
* Keep track of nvrtc-related GitHub comment
* Factor out `_find_dll_under_dir(dirpath, file_wild)` and reuse from `_find_dll_using_nvidia_bin_dirs()`, `_find_dll_using_cudalib_dir()` (to fix loading nvrtc64_120_0.dll from local CTK)
* Minimal "is already loaded" code.
* Add THIS FILE NEEDS TO BE REVIEWED/UPDATED FOR EACH CTK RELEASE comment in _path_finder/supported_libs.py
* Add SUPPORTED_LINUX_SONAMES in _path_finder/supported_libs.py
* Update SUPPORTED_WINDOWS_DLLS in _path_finder/supported_libs.py based on DLLs found in cuda_*win*.exe files.
* Remove `os.add_dll_directory()` and `os.environ["PATH"]` manipulations from find_nvidia_dynamic_library.py. Add `supported_libs.LIBNAMES_REQUIRING_OS_ADD_DLL_DIRECTORY` and use from `load_nvidia_dynamic_library()`.
* Move nvrtc-specific code from find_nvidia_dynamic_library.py to `supported_libs.is_suppressed_dll_file()`
* Introduce dataclass LoadedDL as return type for load_nvidia_dynamic_library()
* Factor out _abs_path_for_dynamic_library_* and use on handle obtained through "is already loaded" checks
* Factor out _load_nvidia_dynamic_library_no_cache() and use for exercising LoadedDL.was_already_loaded_from_elsewhere
* _check_nvjitlink_usable() in test_path_finder.py
* Undo changes in .github/workflows/ and cuda_bindings/pyproject.toml
* Move cuda_bindings/tests/path_finder.py -> toolshed/run_cuda_bindings_path_finder.py
* Add bandit suppressions in test_path_finder.py
* Add pytest info_summary_append fixture and use from test_path_finder.py to report the absolute paths of the loaded libraries.
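The "strict wheel, conda, system rule" item can be pictured as an ordered fallback chain in which the platform loader's own search is consulted only after the explicit locations fail. The following is a minimal sketch of that idea, not the code from this commit: `find_in_order`, the three stub finders, and the paths they return are all hypothetical names and values chosen for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical finder type: takes a libname, returns a path or None.
Finder = Callable[[str], Optional[str]]


def find_in_order(libname: str, finders: Sequence[Tuple[str, Finder]]) -> Tuple[str, str]:
    """Try each (label, finder) pair in order and return the first hit."""
    tried = []
    for label, finder in finders:
        path = finder(libname)
        if path is not None:
            return label, path
        tried.append(label)
    raise RuntimeError(f'Failure finding "{libname}": tried {", ".join(tried)}')


# Stub finders standing in for the real search steps (assumptions, not real APIs).
def from_wheels(libname: str) -> Optional[str]:
    return None  # pretend no nvidia-*-cu12 wheel provides the library


def from_conda(libname: str) -> Optional[str]:
    # Made-up location; only "nvvm" is "installed" in this demo.
    return f"/opt/conda/lib/lib{libname}.so" if libname == "nvvm" else None


def from_system_loader(libname: str) -> Optional[str]:
    return f"lib{libname}.so"  # bare name: let the OS loader resolve it


# Explicit locations first, the platform loader's own search last.
ORDER = (("wheel", from_wheels), ("conda", from_conda), ("system", from_system_loader))
```

With these stubs, `find_in_order("nvvm", ORDER)` stops at the conda step, while `find_in_order("cudart", ORDER)` falls through to the system loader.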
1 parent 32f6c76 commit 74c9750

17 files changed, +868 -146 lines changed

cuda_bindings/cuda/bindings/_bindings/cynvrtc.pyx.in

Lines changed: 2 additions & 2 deletions
@@ -56,7 +56,7 @@ cdef int cuPythonInit() except -1 nogil:

     {{if 'Windows' == platform.system()}}
     with gil:
-        handle = path_finder.load_nvidia_dynamic_library("nvrtc")
+        handle = path_finder.load_nvidia_dynamic_library("nvrtc").handle
     {{if 'nvrtcGetErrorString' in found_functions}}
     try:
         global __nvrtcGetErrorString
@@ -242,7 +242,7 @@ cdef int cuPythonInit() except -1 nogil:

     {{else}}
     with gil:
-        handle = <void*><uintptr_t>path_finder.load_nvidia_dynamic_library("nvrtc")
+        handle = <void*><uintptr_t>path_finder.load_nvidia_dynamic_library("nvrtc").handle
     {{if 'nvrtcGetErrorString' in found_functions}}
     global __nvrtcGetErrorString
     __nvrtcGetErrorString = dlfcn.dlsym(handle, 'nvrtcGetErrorString')

cuda_bindings/cuda/bindings/_internal/nvjitlink_linux.pyx

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ cdef void* __nvJitLinkVersion = NULL


 cdef void* load_library(int driver_ver) except* with gil:
-    cdef uintptr_t handle = path_finder.load_nvidia_dynamic_library("nvJitLink")
+    cdef uintptr_t handle = path_finder.load_nvidia_dynamic_library("nvJitLink").handle
     return <void*>handle

cuda_bindings/cuda/bindings/_internal/nvjitlink_windows.pyx

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ cdef void* __nvJitLinkVersion = NULL


 cdef void* load_library(int driver_ver) except* with gil:
-    cdef intptr_t handle = path_finder.load_nvidia_dynamic_library("nvJitLink")
+    cdef intptr_t handle = path_finder.load_nvidia_dynamic_library("nvJitLink").handle
     return <void*>handle

cuda_bindings/cuda/bindings/_internal/nvvm_linux.pyx

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ cdef void* __nvvmGetProgramLog = NULL


 cdef void* load_library(const int driver_ver) except* with gil:
-    cdef uintptr_t handle = path_finder.load_nvidia_dynamic_library("nvvm")
+    cdef uintptr_t handle = path_finder.load_nvidia_dynamic_library("nvvm").handle
     return <void*>handle

cuda_bindings/cuda/bindings/_internal/nvvm_windows.pyx

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ cdef void* __nvvmGetProgramLog = NULL


 cdef void* load_library(int driver_ver) except* with gil:
-    cdef intptr_t handle = path_finder.load_nvidia_dynamic_library("nvvm")
+    cdef intptr_t handle = path_finder.load_nvidia_dynamic_library("nvvm").handle
     return <void*>handle
4444

cuda_bindings/cuda/bindings/_path_finder/find_nvidia_dynamic_library.py

Lines changed: 50 additions & 56 deletions
@@ -7,6 +7,7 @@
 import os

 from .cuda_paths import IS_WIN32, get_cuda_paths
+from .supported_libs import is_suppressed_dll_file
 from .sys_path_find_sub_dirs import sys_path_find_sub_dirs

@@ -38,9 +39,13 @@ def _find_so_using_nvidia_lib_dirs(libname, so_basename, error_messages, attachm
     return None


-def _append_to_os_environ_path(dirpath):
-    curr_path = os.environ.get("PATH")
-    os.environ["PATH"] = dirpath if curr_path is None else os.pathsep.join((curr_path, dirpath))
+def _find_dll_under_dir(dirpath, file_wild):
+    for path in sorted(glob.glob(os.path.join(dirpath, file_wild))):
+        if not os.path.isfile(path):
+            continue
+        if not is_suppressed_dll_file(os.path.basename(path)):
+            return path
+    return None


 def _find_dll_using_nvidia_bin_dirs(libname, error_messages, attachments):
@@ -50,30 +55,8 @@ def _find_dll_using_nvidia_bin_dirs(libname, error_messages, attachments):
     nvidia_sub_dirs = ("nvidia", "*", "bin")
     file_wild = libname + "*.dll"
     for bin_dir in sys_path_find_sub_dirs(nvidia_sub_dirs):
-        dll_name = None
-        have_builtins = False
-        for path in sorted(glob.glob(os.path.join(bin_dir, file_wild))):
-            # nvidia_cuda_nvrtc_cu12-12.8.93-py3-none-win_amd64.whl:
-            # nvidia\cuda_nvrtc\bin\
-            # nvrtc-builtins64_128.dll
-            # nvrtc64_120_0.alt.dll
-            # nvrtc64_120_0.dll
-            node = os.path.basename(path)
-            if node.endswith(".alt.dll"):
-                continue
-            if "-builtins" in node:
-                have_builtins = True
-                continue
-            if dll_name is not None:
-                continue
-            if os.path.isfile(path):
-                dll_name = path
+        dll_name = _find_dll_under_dir(bin_dir, file_wild)
         if dll_name is not None:
-            if have_builtins:
-                # Add the DLL directory to the search path
-                os.add_dll_directory(bin_dir)
-                # Update PATH as a fallback for dependent DLL resolution
-                _append_to_os_environ_path(bin_dir)
             return dll_name
     _no_such_file_in_sub_dirs(nvidia_sub_dirs, file_wild, error_messages, attachments)
     return None
@@ -122,41 +105,52 @@ def _find_dll_using_cudalib_dir(libname, error_messages, attachments):
     if cudalib_dir is None:
         return None
     file_wild = libname + "*.dll"
-    for dll_name in sorted(glob.glob(os.path.join(cudalib_dir, file_wild))):
-        if os.path.isfile(dll_name):
-            return dll_name
+    dll_name = _find_dll_under_dir(cudalib_dir, file_wild)
+    if dll_name is not None:
+        return dll_name
     error_messages.append(f"No such file: {file_wild}")
     attachments.append(f' listdir("{cudalib_dir}"):')
     for node in sorted(os.listdir(cudalib_dir)):
         attachments.append(f" {node}")
     return None


-@functools.cache
-def find_nvidia_dynamic_library(name: str) -> str:
-    error_messages = []
-    attachments = []
-
-    if IS_WIN32:
-        dll_name = _find_dll_using_nvidia_bin_dirs(name, error_messages, attachments)
-        if dll_name is None:
-            if name == "nvvm":
-                dll_name = _get_cuda_paths_info("nvvm", error_messages)
-            else:
-                dll_name = _find_dll_using_cudalib_dir(name, error_messages, attachments)
-        if dll_name is None:
-            attachments = "\n".join(attachments)
-            raise RuntimeError(f'Failure finding "{name}*.dll": {", ".join(error_messages)}\n{attachments}')
-        return dll_name
-
-    so_basename = f"lib{name}.so"
-    so_name = _find_so_using_nvidia_lib_dirs(name, so_basename, error_messages, attachments)
-    if so_name is None:
-        if name == "nvvm":
-            so_name = _get_cuda_paths_info("nvvm", error_messages)
+class _find_nvidia_dynamic_library:
+    def __init__(self, libname: str):
+        self.libname = libname
+        self.error_messages = []
+        self.attachments = []
+        self.abs_path = None
+
+        if IS_WIN32:
+            self.abs_path = _find_dll_using_nvidia_bin_dirs(libname, self.error_messages, self.attachments)
+            if self.abs_path is None:
+                if libname == "nvvm":
+                    self.abs_path = _get_cuda_paths_info("nvvm", self.error_messages)
+                else:
+                    self.abs_path = _find_dll_using_cudalib_dir(libname, self.error_messages, self.attachments)
+            self.lib_searched_for = f"{libname}*.dll"
         else:
-            so_name = _find_so_using_cudalib_dir(so_basename, error_messages, attachments)
-    if so_name is None:
-        attachments = "\n".join(attachments)
-        raise RuntimeError(f'Failure finding "{so_basename}": {", ".join(error_messages)}\n{attachments}')
-    return so_name
+            self.lib_searched_for = f"lib{libname}.so"
+            self.abs_path = _find_so_using_nvidia_lib_dirs(
+                libname, self.lib_searched_for, self.error_messages, self.attachments
+            )
+            if self.abs_path is None:
+                if libname == "nvvm":
+                    self.abs_path = _get_cuda_paths_info("nvvm", self.error_messages)
+                else:
+                    self.abs_path = _find_so_using_cudalib_dir(
+                        self.lib_searched_for, self.error_messages, self.attachments
+                    )
+
+    def raise_if_abs_path_is_None(self):
+        if self.abs_path:
+            return self.abs_path
+        err = ", ".join(self.error_messages)
+        att = "\n".join(self.attachments)
+        raise RuntimeError(f'Failure finding "{self.lib_searched_for}": {err}\n{att}')
+
+
+@functools.cache
+def find_nvidia_dynamic_library(libname: str) -> str:
+    return _find_nvidia_dynamic_library(libname).raise_if_abs_path_is_None()
