Commit aeb43a1

doc: comment on what the two cuda images are
1 parent fdad997 commit aeb43a1

File tree

1 file changed (+7, -2 lines)


README.md

Lines changed: 7 additions & 2 deletions
@@ -532,8 +532,8 @@ Assuming one has the [nvidia-container-toolkit](https://github.com/NVIDIA/nvidia
 #### Building Locally
 
 ```bash
-docker build -t local/llama.cpp:full -f .devops/full-cuda.Dockerfile .
-docker build -t local/llama.cpp:light -f .devops/main-cuda.Dockerfile .
+docker build -t local/llama.cpp:full-cuda -f .devops/full-cuda.Dockerfile .
+docker build -t local/llama.cpp:light-cuda -f .devops/main-cuda.Dockerfile .
 ```
 
 You may want to pass in some different `ARGS`, depending on the CUDA environment supported by your container host, as well as the GPU architecture.
@@ -543,6 +543,11 @@ The defaults are:
 - `CUDA_VERSION` set to `11.7.1`
 - `CUDA_DOCKER_ARCH` set to `all`
 
+The resulting images are essentially the same as the non-CUDA images:
+
+1. `local/llama.cpp:full-cuda`: This image includes both the main executable file and the tools to convert LLaMA models into ggml and convert into 4-bit quantization.
+2. `local/llama.cpp:light-cuda`: This image only includes the main executable file.
+
 #### Usage
 
 After building locally, usage is similar to the non-CUDA examples, but you'll need to add the `--gpus` flag. You will also want to use the `--n-gpu-layers` flag.
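As a sketch of the overrides the diff describes, the snippet below assembles a `docker build` invocation that passes the two build `ARGS` via `--build-arg`, followed by a `docker run` with the `--gpus` and `--n-gpu-layers` flags. The architecture value, model path, and layer count are illustrative assumptions, not values from this commit; the commands are printed rather than executed so the sketch has no side effects.

```shell
# Hypothetical ARG overrides; substitute values matching your container host
# and GPU generation. Printed, not executed.
build_cmd='docker build -t local/llama.cpp:light-cuda \
  --build-arg CUDA_VERSION=11.7.1 \
  --build-arg CUDA_DOCKER_ARCH=all \
  -f .devops/main-cuda.Dockerfile .'

# Running the image needs --gpus; --n-gpu-layers offloads layers to the GPU.
# The model path and layer count here are placeholders.
run_cmd='docker run --gpus all -v /path/to/models:/models \
  local/llama.cpp:light-cuda -m /models/model.bin -p "Hello" --n-gpu-layers 32'

printf '%s\n\n%s\n' "$build_cmd" "$run_cmd"
```

Overriding `CUDA_VERSION` matters when the host driver is older than the default `11.7.1` base image expects; narrowing `CUDA_DOCKER_ARCH` from `all` to a single architecture shortens the build.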
