Commit 9745ac3

Update sycl read-me for Nvidia target

Author: Aidan
1 parent acaf1ac, commit 9745ac3

1 file changed: +26 -0 lines changed

README-sycl.md

Lines changed: 26 additions & 0 deletions
@@ -73,6 +73,29 @@ For iGPU, please make sure the shared memory from host memory is enough. For lla

 For dGPU, please make sure the device memory is enough. For llama-2-7b.Q4_0, recommend the device memory is 4GB+.

+## Nvidia GPU
+
+### Verified
+
+|Nvidia GPU| Status | Verified Model|
+|-|-|-|
+|Ampere Series| Support| A100|
+
+### oneMKL
+
+The current oneMKL release does not contain the oneMKL cuBLAS backend.
+As a result, for Nvidia GPUs, oneMKL must be built from source.
+
+```
+git clone https://github.com/oneapi-src/oneMKL
+cd oneMKL
+mkdir build
+cd build
+cmake -G Ninja .. -DCMAKE_CXX_COMPILER=icpx -DCMAKE_C_COMPILER=icx -DENABLE_MKLGPU_BACKEND=OFF -DENABLE_MKLCPU_BACKEND=OFF -DENABLE_CUBLAS_BACKEND=ON
+ninja
+# Add paths as necessary
+```
+
 ## Docker

 Note:
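
Note on the "Add paths as necessary" step in the hunk above: the commit does not say which paths are meant. The sketch below shows one plausible reading, assuming that oneMKL was cloned into `$HOME/oneMKL`, that the build places its libraries under `build/lib`, and that the headers live in the source tree's `include` directory. The variable names `ONEMKL_SRC_DIR` and `ONEMKL_BUILD_DIR` are hypothetical and only used for illustration. Adjust all of these to the actual checkout.

```shell
# Assumed locations (hypothetical defaults); override them before running.
ONEMKL_SRC_DIR="${ONEMKL_SRC_DIR:-$HOME/oneMKL}"
ONEMKL_BUILD_DIR="${ONEMKL_BUILD_DIR:-$ONEMKL_SRC_DIR/build}"

# Prepend the built libraries to the run-time and link-time search paths.
# The ${VAR:+:$VAR} form avoids a trailing colon when VAR is unset or empty.
export LD_LIBRARY_PATH="$ONEMKL_BUILD_DIR/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LIBRARY_PATH="$ONEMKL_BUILD_DIR/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"

# Prepend the oneMKL headers to the C++ include search path (assumed layout).
export CPLUS_INCLUDE_PATH="$ONEMKL_SRC_DIR/include${CPLUS_INCLUDE_PATH:+:$CPLUS_INCLUDE_PATH}"
```

Prepending rather than appending means the source-built oneMKL should be found before any packaged release that lacks the cuBLAS backend.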
@@ -186,6 +209,9 @@ source /opt/intel/oneapi/setvars.sh
 # Or, for FP32:
 cmake .. -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx

+# For Nvidia GPUs
+cmake .. -DLLAMA_SYCL=ON -DLLAMA_SYCL_TARGET=NVIDIA -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
+
 # Build example/main only
 #cmake --build . --config Release --target main
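
The only difference between the FP32 configure line and the Nvidia one in this hunk is the extra `-DLLAMA_SYCL_TARGET=NVIDIA` flag. A minimal sketch of that relationship follows. It uses a hypothetical helper, `sycl_configure_cmd`, which is not part of llama.cpp and only prints the command instead of running it. The `INTEL` label is likewise an illustrative name for the default case, not a flag value taken from the commit.

```shell
# Hypothetical helper (not part of llama.cpp): print the configure command
# for a given target, using only the flags that appear in the hunk above.
sycl_configure_cmd() {
    target="${1:-INTEL}"
    flags="-DLLAMA_SYCL=ON"
    # The Nvidia build adds one flag on top of the default configuration.
    if [ "$target" = "NVIDIA" ]; then
        flags="$flags -DLLAMA_SYCL_TARGET=NVIDIA"
    fi
    echo "cmake .. $flags -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx"
}

sycl_configure_cmd NVIDIA
sycl_configure_cmd INTEL
```

Printing the command first makes it easy to check the flags before running the configure step inside the build directory.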
