Commit c87f807

docs: build cuda update
1 parent ecb81a4 commit c87f807

1 file changed

+53
-8
lines changed

docs/build.md

Lines changed: 53 additions & 8 deletions
@@ -125,21 +125,66 @@ For detailed info, please refer to [llama.cpp for SYCL](./backend/SYCL.md).
125125
126126
## CUDA
127127
128-
This provides GPU acceleration using an NVIDIA GPU. Make sure to have the CUDA toolkit installed. You can download it from your Linux distro's package manager (e.g. `apt install nvidia-cuda-toolkit`) or from the [NVIDIA developer site](https://developer.nvidia.com/cuda-downloads).
128+
This provides GPU acceleration using an NVIDIA GPU. Make sure to have the [CUDA toolkit](https://developer.nvidia.com/cuda-toolkit) installed.
129129
130-
If you are using Fedora (using Fedora Workstation, or an 'Atomic' variant such as Silverblue), or would like to set up CUDA in a toolbox, please consider our [Fedora CUDA guide](./cuda-fedora.md). Unfortunately, the process is not as simple as one might expect.
130+
#### Download directly from NVIDIA
131+
You may find the official downloads here: [NVIDIA developer site](https://developer.nvidia.com/cuda-downloads).
131132
132-
- Using `CMake`:
133133
134-
```bash
135-
cmake -B build -DGGML_CUDA=ON
136-
cmake --build build --config Release
137-
```
134+
#### Compile and run inside a Fedora Toolbox Container
135+
We also have a [guide](./cuda-fedora.md) for setting up the CUDA toolkit in a Fedora [toolbox container](https://containertoolbx.org/).
136+
137+
**Recommended for:**
138+
139+
- ***Particularly*** *convenient* for users of [Atomic Desktops for Fedora](https://fedoraproject.org/atomic-desktops/), such as [Silverblue](https://fedoraproject.org/atomic-desktops/silverblue/) and [Kinoite](https://fedoraproject.org/atomic-desktops/kinoite/).
140+
- Toolbox is installed by default on [Fedora Workstation](https://fedoraproject.org/workstation/) and [Fedora KDE Plasma Desktop](https://fedoraproject.org/spins/kde).
141+
- *Optionally*, toolbox packages are available for [Arch Linux](https://archlinux.org/), [Red Hat Enterprise Linux >= 8.5](https://www.redhat.com/en/technologies/linux-platforms/enterprise-linux), and [Ubuntu](https://ubuntu.com/download).
142+
143+
144+
### Compilation
145+
```bash
146+
cmake -B build -DGGML_CUDA=ON
147+
cmake --build build --config Release
148+
```
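Before configuring, it can help to confirm that the build tools are actually on your `PATH`. The snippet below is a minimal sketch, not part of the official build steps; `require_tools` is a hypothetical helper name introduced here for illustration.

```bash
# Hypothetical helper: succeed only if every named tool is found on PATH,
# printing the name of each tool that is missing.
require_tools() {
    local tool missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# The CUDA build needs both CMake and the CUDA compiler driver.
if require_tools cmake nvcc; then
    echo "cmake and nvcc found; ready to configure with -DGGML_CUDA=ON"
else
    echo "install the missing tools before configuring" >&2
fi
```

If `nvcc` is reported missing even though the toolkit is installed, the toolkit's `bin` directory is probably not on `PATH` in the current shell.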
149+
150+
### Override Compute Capability Specifications
151+
152+
If `nvcc` cannot detect your GPU, you may get compile warnings such as:
153+
```text
154+
nvcc warning : Cannot find valid GPU for '-arch=native', default arch is used
155+
```
138156

139-
The environment variable [`CUDA_VISIBLE_DEVICES`](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars) can be used to specify which GPU(s) will be used.
157+
To override the `native` GPU detection:
158+
159+
#### 1. Take note of the `Compute Capability` of your NVIDIA devices: ["CUDA: Your GPU Compute Capability"](https://developer.nvidia.com/cuda-gpus).
160+
161+
```text
162+
GeForce RTX 4090 8.9
163+
GeForce RTX 3080 Ti 8.6
164+
GeForce RTX 3070 8.6
165+
```
166+
167+
#### 2. Manually list each distinct `Compute Capability`, written without the decimal point (`8.6` becomes `86`), in the `CMAKE_CUDA_ARCHITECTURES` list.
168+
169+
```bash
170+
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES="86;89"
171+
```
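The two steps above can also be scripted. The following is a sketch under stated assumptions: `caps_to_cmake_archs` is a hypothetical helper introduced here, and the `compute_cap` query field of `nvidia-smi` is assumed to be available, which depends on your driver version. If it is not, feed the values from the NVIDIA table into the helper by hand.

```bash
# Hypothetical helper: read compute capabilities from stdin, one per line
# (e.g. "8.6"), and print a deduplicated, semicolon-separated list suitable
# for CMAKE_CUDA_ARCHITECTURES (e.g. "86;89").
caps_to_cmake_archs() {
    tr -d '. \r' | grep -E '^[0-9]+$' | sort -n -u | paste -sd ';' -
}

# Assumption: this driver's nvidia-smi supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
    archs="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | caps_to_cmake_archs)"
    # Print the configure command rather than running it, so it can be reviewed first.
    echo "cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=\"$archs\""
fi
```

For the three example cards listed in step 1, `printf '8.9\n8.6\n8.6\n' | caps_to_cmake_archs` yields `86;89`, matching the value used in step 2.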
172+
173+
### Runtime CUDA environment variables
174+
175+
You may set the [CUDA environment variables](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars) at runtime.
176+
177+
```bash
178+
# Use `CUDA_VISIBLE_DEVICES` to hide the first compute device.
179+
CUDA_VISIBLE_DEVICES="-0" ./build/bin/llama-server --model /srv/models/llama.gguf
180+
```
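`CUDA_VISIBLE_DEVICES` also accepts a comma-separated list of device indices, which is a common way to pin a process to specific GPUs. The wrapper below is a minimal sketch; `run_on_gpus` is a hypothetical helper name, and the indices `1,2` are only an example for a machine with at least three GPUs.

```bash
# Hypothetical helper: run a command with only the listed CUDA device
# indices visible to it. The variable is set for that command only and
# does not leak into the surrounding shell.
# Usage: run_on_gpus "<comma-separated indices>" <command> [args...]
run_on_gpus() {
    local devices="$1"
    shift
    CUDA_VISIBLE_DEVICES="$devices" "$@"
}

# Demonstration with a harmless command that echoes what it was given.
run_on_gpus "1,2" sh -c 'echo "visible devices: $CUDA_VISIBLE_DEVICES"'
```

To use it with the server from the example above, replace the `sh -c ...` part with `./build/bin/llama-server --model /srv/models/llama.gguf`. Inside the process, the visible devices are renumbered starting from 0.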
181+
182+
### Unified Memory
140183

141184
The environment variable `GGML_CUDA_ENABLE_UNIFIED_MEMORY=1` can be used to enable unified memory in Linux. This allows swapping to system RAM instead of crashing when the GPU VRAM is exhausted. In Windows this setting is available in the NVIDIA control panel as `System Memory Fallback`.
142185

186+
### Performance Tuning
187+
143188
The following compilation options are also available to tweak performance:
144189

145190
| Option | Legal values | Default | Description |
