Commit 3c29b91

update build.md
1 parent 886c153 commit 3c29b91

File tree

1 file changed: +27 -28 lines changed


docs/build.md

Lines changed: 27 additions & 28 deletions
@@ -7,7 +7,9 @@ git clone https://github.com/ggerganov/llama.cpp
 cd llama.cpp
 ```

-In order to build llama.cpp you have four different options.
+The following sections describe how to build with different backends and options.
+
+## CPU-only Build

 - Using `CMake`:

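The hunk ends before the actual `CMake` commands for the CPU-only build, so here is only a minimal sketch of the usual llama.cpp CMake invocation (assumed defaults: Release configuration, no extra backend flags):

```bash
# Configure and build (CPU-only by default); assumes a recent CMake
# and a working C/C++ toolchain are already installed.
cmake -B build
cmake --build build --config Release
```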
@@ -47,22 +49,10 @@ In order to build llama.cpp you have four different options.
 ```
 Note: Building for arm64 could also be done just with MSVC (with the build-arm64-windows-MSVC preset, or the standard CMake build instructions). But MSVC does not support inline ARM assembly-code, used e.g. for the accelerated Q4_0_4_8 CPU kernels.

-- Using `gmake` (FreeBSD):
-
-1. Install and activate [DRM in FreeBSD](https://wiki.freebsd.org/Graphics)
-2. Add your user to **video** group
-3. Install compilation dependencies.
-
-```bash
-sudo pkg install gmake automake autoconf pkgconf llvm15 openblas
-
-gmake CC=/usr/local/bin/clang15 CXX=/usr/local/bin/clang++15 -j4
-```
-
 ## Metal Build

 On MacOS, Metal is enabled by default. Using Metal makes the computation run on the GPU.
-To disable the Metal build at compile time use the `GGML_NO_METAL=1` flag or the `GGML_METAL=OFF` cmake option.
+To disable the Metal build at compile time use the `-DGGML_METAL=OFF` cmake option.

 When built with Metal support, you can explicitly disable GPU inference with the `--n-gpu-layers|-ngl 0` command-line
 argument.
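Both of these flags appear in the hunk above; a quick hedged sketch of how they might be used (the model path is a placeholder):

```bash
# Build with the Metal backend disabled at compile time.
cmake -B build -DGGML_METAL=OFF
cmake --build build --config Release

# Or keep Metal enabled but turn off GPU inference at run time
# by offloading zero layers to the GPU.
./build/bin/llama-cli -m path/to/model.gguf -ngl 0 -p "Hello"
```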
@@ -159,9 +149,9 @@ The environment variable `GGML_CUDA_ENABLE_UNIFIED_MEMORY=1` can be used to enab

 Most of the compilation options available for CUDA should also be available for MUSA, though they haven't been thoroughly tested yet.

-### hipBLAS
+### HIP

-This provides BLAS acceleration on HIP-supported AMD GPUs.
+This provides GPU acceleration on HIP-supported AMD GPUs.
 Make sure to have ROCm installed.
 You can download it from your Linux distro's package manager or from here: [ROCm Quick Start (Linux)](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/tutorial/quick-start.html#rocm-install-quick).

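The build command for the renamed HIP section lies outside this hunk, so the following is a rough, hedged sketch only: the CMake option name has varied between releases (`GGML_HIP` on newer trees, `GGML_HIPBLAS` on older ones), and the GPU architecture passed to `AMDGPU_TARGETS` is just an example.

```bash
# Rough sketch: build the HIP backend using ROCm's clang.
# GGML_HIP is assumed here; older releases used GGML_HIPBLAS.
# Replace gfx1100 with the architecture of your AMD GPU.
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
  cmake -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1100
cmake --build build --config Release
```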
@@ -227,6 +217,12 @@ EOF

 ```

+Switch into `llama.cpp` directory and build using CMake.
+```sh
+cmake -B build -DGGML_VULKAN=ON
+cmake --build build --config Release
+```
+
 #### Git Bash MINGW64

 Download and install [`Git-SCM`](https://git-scm.com/downloads/win) with the default settings
@@ -246,19 +242,20 @@ cmake --build build --config Release

 Now you can load the model in conversation mode using `Vulkan`

-```
-build/bin/release/llama-cli -m "[PATH TO MODEL]" -ngl 100 -c 16384 -t 10 -n -2 -cnv
+```sh
+build/bin/Release/llama-cli -m "[PATH TO MODEL]" -ngl 100 -c 16384 -t 10 -n -2 -cnv
 ```

 #### MSYS2
 Install [MSYS2](https://www.msys2.org/) and then run the following commands in a UCRT terminal to install dependencies.
-```sh
-pacman -S git \
-    mingw-w64-ucrt-x86_64-gcc \
-    mingw-w64-ucrt-x86_64-cmake \
-    mingw-w64-ucrt-x86_64-vulkan-devel \
-    mingw-w64-ucrt-x86_64-shaderc
-```
+```sh
+pacman -S git \
+    mingw-w64-ucrt-x86_64-gcc \
+    mingw-w64-ucrt-x86_64-cmake \
+    mingw-w64-ucrt-x86_64-vulkan-devel \
+    mingw-w64-ucrt-x86_64-shaderc
+```
+
 Switch into `llama.cpp` directory and build using CMake.
 ```sh
 cmake -B build -DGGML_VULKAN=ON
@@ -323,11 +320,13 @@ cmake --build build --config release

 You can test with:

-`./build/bin/llama-cli -m PATH_TO_MODEL -p "Building a website can be done in 10 steps:" -ngl 32`
+```bash
+./build/bin/llama-cli -m PATH_TO_MODEL -p "Building a website can be done in 10 steps:" -ngl 32
+```

-If the fllowing info is output on screen, you are using `llama.cpp by CANN backend`:
+If the following info is output on screen, you are using `llama.cpp` with the CANN backend:
 ```bash
-llm_load_tensors: CANN buffer size = 13313.00 MiB
+llm_load_tensors: CANN model buffer size = 13313.00 MiB
 llama_new_context_with_model: CANN compute buffer size = 1260.81 MiB
 ```
