[path_finder_dev] path_finder Search Priority v2 #604

Merged on May 6, 2025 (23 commits)
Commits
8c9a2de
WIP (search priority updated in README.md but not in code)
rwgk May 4, 2025
8479511
Merge branch 'path_finder_dev' into path_finder_search_priority_v2
rwgk May 4, 2025
2cf3fa2
Completely replace cuda_paths.py to achieve the desired Search Priori…
rwgk May 4, 2025
2b74022
Define `IS_WINDOWS = sys.platform == "win32"` in supported_libs.py
rwgk May 5, 2025
27db0a7
Use os.path.samefile() to resolve issues with doubled backslashes.
rwgk May 5, 2025
e0a0143
Merge branch 'path_finder_dev' into path_finder_search_priority_v2
rwgk May 5, 2025
1f728c0
`load_in_subprocess()`: Pass current environment
rwgk May 5, 2025
0d23bb6
Add run_python_code_safely.py as generated by perplexity, plus ruff f…
rwgk May 5, 2025
b1a5e9d
Replace subprocess.run with run_python_code_safely
rwgk May 5, 2025
8e9c7b1
Factor out `class Worker` to fix pickle issue.
rwgk May 5, 2025
5977b9d
ChatGPT revisions based on Deep research:
rwgk May 5, 2025
9b474bc
Fix race condition in result queue handling by using timeout-based get()
rwgk May 5, 2025
ab00a87
Resolve SIM108
rwgk May 5, 2025
2a039d2
Change to "nppc" as ANCHOR_LIBNAME
rwgk May 5, 2025
f978e67
Implement CUDA_PYTHON_CUDA_HOME_PRIORITY first, last, with default first
rwgk May 6, 2025
782fcf6
Remove retry_with_anchor_abs_path() and make retry_with_cuda_home_pri…
rwgk May 6, 2025
676ecb2
Update README.md to reflect new search priority
rwgk May 6, 2025
73498c0
SUPPORTED_LINUX_SONAMES does not need updates for CTK 12.9.0
rwgk May 6, 2025
7661c13
The only addition to SUPPORTED_WINDOWS_DLLS for CTK 12.9.0 is nvvm70.dll
rwgk May 6, 2025
ddea021
Make OSError in load_dl_windows.py abs_path_for_dynamic_library() mor…
rwgk May 6, 2025
55583d9
run_cuda_bindings_path_finder.py: optionally use args as libnames (to…
rwgk May 6, 2025
a576327
Bug fix in load_dl_windows.py: ctypes.windll.kernel32.LoadLibraryW() …
rwgk May 6, 2025
5fb2d1f
Remove _find_nvidia_dynamic_library.retry_with_anchor_abs_path() meth…
rwgk May 6, 2025

51 changes: 13 additions & 38 deletions cuda_bindings/cuda/bindings/_path_finder/README.md
@@ -24,52 +24,27 @@ strategy for locating NVIDIA shared libraries:
The absolute path of the already loaded library will be returned, along
with the handle to the library.

1. **Python Package Ecosystem**
- Scans `sys.path` to find libraries installed via NVIDIA Python wheels.
1. **NVIDIA Python wheels**
- Scans all site-packages to find libraries installed via NVIDIA Python wheels.

2. **Conda Environments**
- Leverages Conda-specific paths through our fork of `get_cuda_paths()`
from numba-cuda.

3. **Environment variables**
- Relies on `CUDA_HOME`/`CUDA_PATH` environment variables if set.

4. **System Installations**
- Checks traditional system locations through these paths:
- Linux: `/usr/local/cuda/lib64`
- Windows: `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\vX.Y\bin`
(where X.Y is the CTK version)
- **Notably does NOT search**:
- Versioned CUDA directories like `/usr/local/cuda-12.3`
- Distribution-specific packages (RPM/DEB)
EXCEPT Debian's `nvidia-cuda-toolkit`

5. **OS Default Mechanisms**
2. **OS default mechanisms / Conda environments**
- Falls back to native loader:
- `dlopen()` on Linux
- `LoadLibraryW()` on Windows
- CTK installations with system config updates are expected to be discovered:
- Linux: Via `/etc/ld.so.conf.d/*cuda*.conf`
- Windows: Via `C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\vX.Y\bin` on system `PATH`
- Conda installations are expected to be discovered:
- Linux: Via `$ORIGIN/../lib` on `RPATH` (of the `python` binary)
- Windows: Via `%CONDA_PREFIX%\Library\bin` on system `PATH`

3. **Environment variables**
- Relies on `CUDA_HOME` or `CUDA_PATH` environment variables if set
(in that order).

Note that the search is done on a per-library basis. There is no centralized
mechanism that ensures all libraries are found in the same way.
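
The per-library search order described above (NVIDIA Python wheels, then the OS default loader, then `CUDA_HOME`/`CUDA_PATH`) can be illustrated with a minimal sketch. This is not the code in `_path_finder`; the function names, the `nvidia/` subdirectory under site-packages, and the `lib64`/`lib`/`bin` subdirectories are assumptions made for illustration, and the initial "already loaded" check is omitted.

```python
import ctypes
import os
import site
from pathlib import Path


def find_in_wheels(filename, roots=None):
    """Step 1: look for `filename` below nvidia/ in each site-packages dir.

    The nvidia/ layout is an assumption of this sketch.
    """
    if roots is None:
        roots = site.getsitepackages() + [site.getusersitepackages()]
    for root in roots:
        nvidia_dir = Path(root) / "nvidia"
        if nvidia_dir.is_dir():
            for hit in sorted(nvidia_dir.rglob(filename)):
                return str(hit)
    return None


def try_os_default(filename):
    """Step 2: let the native loader resolve the bare file name."""
    try:
        ctypes.CDLL(filename)  # dlopen() on Linux, LoadLibrary on Windows
    except OSError:
        return None
    return filename


def cuda_home(environ=None):
    """Return CUDA_HOME if set, otherwise CUDA_PATH, otherwise None."""
    environ = os.environ if environ is None else environ
    return environ.get("CUDA_HOME") or environ.get("CUDA_PATH")


def find_in_cuda_home(filename, environ=None):
    """Step 3: check a few hypothetical subdirectories of the CUDA home."""
    home = cuda_home(environ)
    if home is None:
        return None
    for sub in ("lib64", "lib", "bin"):
        candidate = Path(home) / sub / filename
        if candidate.is_file():
            return str(candidate)
    return None


def locate(filename, roots=None, environ=None, os_default=try_os_default):
    """Run the three steps in order for one library.

    Returns (source, path_or_name) for the first step that succeeds,
    or None if every step fails.
    """
    steps = (
        ("wheel", lambda: find_in_wheels(filename, roots)),
        ("os-default", lambda: os_default(filename)),
        ("CUDA_HOME/CUDA_PATH", lambda: find_in_cuda_home(filename, environ)),
    )
    for source, step in steps:
        found = step()
        if found is not None:
            return source, found
    return None
```

Because `locate()` is called once per library name, two libraries may be resolved by different steps, which is the behavior the note above describes. The `roots`, `environ`, and `os_default` parameters exist only so the sketch can be exercised without touching the real environment.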

## Implementation Philosophy

The current implementation balances stability and evolution:

- **Baseline Foundation:** Uses a fork of numba-cuda's `cuda_paths.py` that has been
battle-tested in production environments.

- **Validation Infrastructure:** Comprehensive CI testing matrix being developed to cover:
- Various Linux/Windows environments
- Python packaging formats (wheels, conda)
- CUDA Toolkit versions

- **Roadmap:** Planned refactoring to:
- Unify library discovery logic
- Improve maintainability
- Better enforce search priority
- Expand platform support

## Maintenance Requirements

These key components must be updated for new CUDA Toolkit releases: