docs/build.md: 63 lines changed (0 additions, 63 deletions)
````diff
@@ -9,30 +9,6 @@ cd llama.cpp
 
 In order to build llama.cpp you have four different options.
 
-- Using `make`:
-  - On Linux or MacOS:
-
-      ```bash
-      make
-      ```
-
-  - On Windows (x86/x64 only, arm64 requires cmake):
-
-    1. Download the latest fortran version of [w64devkit](https://github.com/skeeto/w64devkit/releases).
-    2. Extract `w64devkit` on your pc.
-    3. Run `w64devkit.exe`.
-    4. Use the `cd` command to reach the `llama.cpp` folder.
-    5. From here you can run:
-        ```bash
-        make
-        ```
-
-  - Notes:
-    - For the `Q4_0_4_4` quantization type build, add the `GGML_NO_LLAMAFILE=1` flag. For example, use `make GGML_NO_LLAMAFILE=1`.
-    - For faster compilation, add the `-j` argument to run multiple jobs in parallel. For example, `make -j 8` will run 8 jobs in parallel.
-    - For faster repeated compilation, install [ccache](https://ccache.dev/).
-    - For debug builds, run `make LLAMA_DEBUG=1`.
-
 - Using `CMake`:
 
   ```bash
````
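Since this hunk removes the bare `make` flow, the `CMake` option kept in the surrounding context is the remaining way to do a plain CPU build. A minimal sketch of that flow (command shape per the usual llama.cpp CMake instructions; the `-j 8` parallelism mirrors the removed `make -j 8` note):

```shell
# Configure once, then build; -j runs 8 compile jobs in parallel,
# matching the parallel-build advice from the deleted `make` notes.
cmake -B build
cmake --build build --config Release -j 8
```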
````diff
@@ -104,27 +80,6 @@ This is only available on Mac PCs and it's enabled by default. You can just buil
 
 This provides BLAS acceleration using only the CPU. Make sure to have OpenBLAS installed on your machine.
 
-- Using `make`:
-  - On Linux:
-      ```bash
-      make GGML_OPENBLAS=1
-      ```
-
-  - On Windows:
-
-    1. Download the latest fortran version of [w64devkit](https://github.com/skeeto/w64devkit/releases).
-    2. Download the latest version of [OpenBLAS for Windows](https://github.com/xianyi/OpenBLAS/releases).
-    3. Extract `w64devkit` on your pc.
-    4. From the OpenBLAS zip that you just downloaded, copy `libopenblas.a` (located inside the `lib` folder) into `w64devkit\x86_64-w64-mingw32\lib`.
-    5. From the same OpenBLAS zip, copy the content of the `include` folder into `w64devkit\x86_64-w64-mingw32\include`.
-    6. Run `w64devkit.exe`.
-    7. Use the `cd` command to reach the `llama.cpp` folder.
-    8. From here you can run:
-
-        ```bash
-        make GGML_OPENBLAS=1
-        ```
-
 - Using `CMake` on Linux:
 
   ```bash
````
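With `make GGML_OPENBLAS=1` gone, the surviving `CMake` path is the one left for OpenBLAS acceleration. A sketch of the equivalent configure step, assuming the `GGML_BLAS`/`GGML_BLAS_VENDOR` options used elsewhere in llama.cpp's CMake docs:

```shell
# OpenBLAS-accelerated CPU build via CMake (flag names assumed
# from llama.cpp's CMake BLAS options, not from this diff).
cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS
cmake --build build --config Release
```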
````diff
@@ -167,10 +122,6 @@ This provides GPU acceleration using the CUDA cores of your Nvidia GPU. Make sur
 
 For Jetson users, if you have a Jetson Orin you can try this: [Official Support](https://www.jetson-ai-lab.com/tutorial_text-generation.html). If you are using an older model (nano/TX2), some additional operations are required before compiling.
 
-- Using `make`:
-  ```bash
-  make GGML_CUDA=1
-  ```
 - Using `CMake`:
 
   ```bash
````
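The removed `make GGML_CUDA=1` shortcut maps onto the `CMake` route kept in context. A sketch, assuming the `GGML_CUDA` CMake option matches the deleted make flag of the same name:

```shell
# CUDA build via CMake; GGML_CUDA=ON is the CMake counterpart
# of the removed `make GGML_CUDA=1` invocation.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```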
````diff
@@ -196,10 +147,6 @@ The following compilation options are also available to tweak performance:
 
 This provides GPU acceleration using the MUSA cores of your Moore Threads MTT GPU. Make sure to have the MUSA SDK installed. You can download it from here: [MUSA SDK](https://developer.mthreads.com/sdk/download/musa).
 
-- Using `make`:
-  ```bash
-  make GGML_MUSA=1
-  ```
 - Using `CMake`:
 
   ```bash
````
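As with the CUDA hunk, the deleted `make GGML_MUSA=1` line leaves the `CMake` path as the remaining MUSA build route. A sketch, assuming the CMake option carries the same `GGML_MUSA` name as the removed make flag:

```shell
# MUSA build via CMake (option name assumed to mirror the
# removed `make GGML_MUSA=1` flag).
cmake -B build -DGGML_MUSA=ON
cmake --build build --config Release
```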
````diff
@@ -219,10 +166,6 @@ This provides BLAS acceleration on HIP-supported AMD GPUs.
 Make sure to have ROCm installed.
 You can download it from your Linux distro's package manager or from here: [ROCm Quick Start (Linux)](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/tutorial/quick-start.html#rocm-install-quick).
 
-- Using `make`:
-  ```bash
-  make GGML_HIP=1
-  ```
 - Using `CMake` for Linux (assuming a gfx1030-compatible AMD GPU):
````
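The context line above mentions the `CMake` route for a gfx1030-compatible AMD GPU; the removed `make GGML_HIP=1` shortcut would translate to something like the sketch below. The `HIPCXX`/`HIP_PATH` environment setup and `AMDGPU_TARGETS` value are assumptions based on how llama.cpp's HIP CMake builds are typically driven, not taken from this diff:

```shell
# HIP/ROCm build via CMake; gfx1030 target taken from the
# context line above, toolchain env vars are assumed.
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
    cmake -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
```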