Skip to content

Commit e067689

Browse files
committed
ROCm: use native CMake HIP support
Supercedes ggml-org#4024 and ggml-org#4813. CMake's native HIP support has become the recommended way to add HIP code into a project (see [here](https://rocm.docs.amd.com/en/docs-6.0.0/conceptual/cmake-packages.html#using-hip-in-cmake)). This PR makes the following changes: 1. The environment variable `HIPCXX` or CMake option `CMAKE_HIP_COMPILER` should be used to specify the HIP compiler. Notably this shouldn't be `hipcc`, but ROCm's clang, which usually resides in `$ROCM_PATH/llvm/bin/clang`. Previously this was control by `CMAKE_C_COMPILER` and `CMAKE_CXX_COMPILER`. 2. CMake option `CMAKE_HIP_ARCHITECTURES` is used to control the GPU architectures to build for. Previously this was controled by `GPU_TARGETS`. 3. Updated the Nix recipe to account for these new changes. 4. The GPU targets to build against in the Nix recipe is now consistent with the supported GPU targets in nixpkgs. The most important part about this PR is the separation of the HIP compiler and the C/C++ compiler. This allows users to choose a different C/C++ compiler if desired, compared to the current situation where when building for ROCm support, everything must be compiled with ROCm's clang. ~~Makefile is unchanged. Please let me know if we want to be consistent on variables' naming because Makefile still uses `GPU_TARGETS` to control architectures to build for, but I feel like setting `CMAKE_HIP_ARCHITECTURES` is a bit awkward when you're calling `make`.~~ Makefile used `GPU_TARGETS` but the README says to use `AMDGPU_TARGETS`. For consistency with CMake, all usage of `GPU_TARGETS` in Makefile has been updated to `AMDGPU_TARGETS`. Thanks to the suggestion of @jin-eld, to maintain backwards compatibility (and not break too many downstream users' builds), if `CMAKE_CXX_COMPILER` ends with `hipcc`, then we still compile using the original behavior and emit a warning that recommends switching to the new HIP support. Similarly, if `AMDGPU_TARGETS` is set but `CMAKE_HIP_ARCHITECTURES` is not, then we forward `AMDGPU_TARGETS` to `CMAKE_HIP_ARCHITECTURES` to ease the transition to the new HIP support. Signed-off-by: Gavin Zhao <[email protected]>
1 parent 5b7b0ac commit e067689

File tree

5 files changed

+108
-21
lines changed

5 files changed

+108
-21
lines changed

.devops/nix/package.nix

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -207,18 +207,18 @@ effectiveStdenv.mkDerivation (
207207
)
208208
]
209209
++ optionals useRocm [
210-
(cmakeFeature "CMAKE_C_COMPILER" "hipcc")
211-
(cmakeFeature "CMAKE_CXX_COMPILER" "hipcc")
212-
213-
# Build all targets supported by rocBLAS. When updating search for TARGET_LIST_ROCM
214-
# in https://github.com/ROCmSoftwarePlatform/rocBLAS/blob/develop/CMakeLists.txt
215-
# and select the line that matches the current nixpkgs version of rocBLAS.
216-
# Should likely use `rocmPackages.clr.gpuTargets`.
217-
"-DAMDGPU_TARGETS=gfx803;gfx900;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-;gfx940;gfx941;gfx942;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102"
210+
(cmakeFeature "CMAKE_HIP_COMPILER" "${rocmPackages.llvm.clang}/bin/clang")
211+
(cmakeFeature "CMAKE_HIP_ARCHITECTURES" (builtins.concatStringsSep ";" rocmPackages.clr.gpuTargets))
218212
]
219213
++ optionals useMetalKit [ (lib.cmakeFeature "CMAKE_C_FLAGS" "-D__ARM_FEATURE_DOTPROD=1") ]
220214
++ optionals useBlas [ (lib.cmakeFeature "LLAMA_BLAS_VENDOR" "OpenBLAS") ];
221215

216+
# Environment variables needed for ROCm
217+
env = optionals useRocm {
218+
ROCM_PATH = "${rocmPackages.clr}";
219+
HIP_DEVICE_LIB_PATH = "${rocmPackages.rocm-device-libs}/amdgcn/bitcode";
220+
};
221+
222222
# TODO(SomeoneSerge): It's better to add proper install targets at the CMake level,
223223
# if they haven't been added yet.
224224
postInstall = ''

.github/workflows/build.yml

Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -301,6 +301,33 @@ jobs:
301301
cmake -DLLAMA_VULKAN=ON ..
302302
cmake --build . --config Release -j $(nproc)
303303
304+
ubuntu-22-cmake-hip:
305+
runs-on: ubuntu-22.04
306+
container: rocm/dev-ubuntu-22.04:6.0.2
307+
308+
steps:
309+
- name: Clone
310+
id: checkout
311+
uses: actions/checkout@v3
312+
313+
- name: Dependencies
314+
id: depends
315+
run: |
316+
sudo apt-get update
317+
sudo apt-get install -y build-essential git cmake rocblas-dev hipblas-dev
318+
319+
- name: Build with native CMake HIP support
320+
id: cmake_build
321+
run: |
322+
cmake -B build -S . -DCMAKE_HIP_COMPILER="$(hipconfig -l)/clang" -DLLAMA_HIPBLAS=ON
323+
cmake --build build --config Release -j $(nproc)
324+
325+
- name: Build with legacy HIP support
326+
id: cmake_build_legacy_hip
327+
run: |
328+
cmake -B build2 -S . -DCMAKE_C_COMPILER=hipcc -DCMAKE_CXX_COMPILER=hipcc -DLLAMA_HIPBLAS=ON
329+
cmake --build build2 --config Release -j $(nproc)
330+
304331
ubuntu-22-cmake-sycl:
305332
runs-on: ubuntu-22.04
306333

@@ -797,6 +824,36 @@ jobs:
797824
id: cmake_build
798825
run: examples/sycl/win-build-sycl.bat
799826

827+
windows-latest-cmake-hip:
828+
runs-on: windows
829+
830+
steps:
831+
- name: Clone
832+
id: checkout
833+
uses: actions/checkout@v3
834+
835+
- name: Install
836+
id: depends
837+
run: |
838+
$ErrorActionPreference = "Stop"
839+
write-host "Downloading AMD HIP SDK Installer"
840+
Invoke-WebRequest -Uri "https://download.amd.com/developer/eula/rocm-hub/AMD-Software-PRO-Edition-23.Q4-WinSvr2022-For-HIP.exe" -OutFile "${env:RUNNER_TEMP}\rocm-install.exe"
841+
write-host "Installing AMD HIP SDK"
842+
Start-Process "${env:RUNNER_TEMP}\rocm-install.exe" -ArgumentList '-install' -NoNewWindow -Wait
843+
write-host "Completed AMD HIP SDK installation"
844+
845+
- name: Verify ROCm
846+
id: verify
847+
run: |
848+
& 'C:\Program Files\AMD\ROCm\*\bin\clang.exe' --version
849+
850+
- name: Build
851+
id: cmake_build
852+
run: |
853+
$env:HIP_PATH=$(Resolve-Path 'C:\Program Files\AMD\ROCm\*\bin\clang.exe' | split-path | split-path)
854+
cmake -B build -S . -DCMAKE_C_COMPILER="$HIP_PATH\bin\clang.exe" -DCMAKE_CXX_COMPILER="$HIP_PATH\bin\clang++.exe" -DLLAMA_HIPBLAS=ON
855+
cmake --build build --config Release
856+
800857
ios-xcode-build:
801858
runs-on: macos-latest
802859

CMakeLists.txt

Lines changed: 24 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -498,16 +498,26 @@ if (LLAMA_VULKAN)
498498
endif()
499499

500500
if (LLAMA_HIPBLAS)
501-
list(APPEND CMAKE_PREFIX_PATH /opt/rocm)
502-
503-
if (NOT ${CMAKE_C_COMPILER_ID} MATCHES "Clang")
504-
message(WARNING "Only LLVM is supported for HIP, hint: CC=/opt/rocm/llvm/bin/clang")
501+
if ($ENV{ROCM_PATH})
502+
set(ROCM_PATH $ENV{ROCM_PATH})
503+
else()
504+
set(ROCM_PATH /opt/rocm)
505505
endif()
506+
list(APPEND CMAKE_PREFIX_PATH ${ROCM_PATH})
506507

507-
if (NOT ${CMAKE_CXX_COMPILER_ID} MATCHES "Clang")
508-
message(WARNING "Only LLVM is supported for HIP, hint: CXX=/opt/rocm/llvm/bin/clang++")
509-
endif()
508+
string(REGEX MATCH "hipcc(\.bat)?$" CXX_IS_HIPCC "${CMAKE_CXX_COMPILER}")
510509

510+
if (CXX_IS_HIPCC AND UNIX)
511+
message(WARNING "Setting hipcc as the C++ compiler is legacy behavior."
512+
" Prefer setting the HIP compiler directly. See README for details.")
513+
else()
514+
# Forward AMDGPU_TARGETS to CMAKE_HIP_ARCHITECTURES.
515+
if (AMDGPU_TARGETS AND NOT CMAKE_HIP_ARCHITECTURES)
516+
set(CMAKE_HIP_ARCHITECTURES ${AMDGPU_ARGETS})
517+
endif()
518+
cmake_minimum_required(VERSION 3.21)
519+
enable_language(HIP)
520+
endif()
511521
find_package(hip REQUIRED)
512522
find_package(hipblas REQUIRED)
513523
find_package(rocblas REQUIRED)
@@ -535,13 +545,18 @@ if (LLAMA_HIPBLAS)
535545
add_compile_definitions(GGML_CUDA_MMV_Y=${LLAMA_CUDA_MMV_Y})
536546
add_compile_definitions(K_QUANTS_PER_ITERATION=${LLAMA_CUDA_KQUANTS_ITER})
537547

538-
set_source_files_properties(ggml-cuda.cu PROPERTIES LANGUAGE CXX)
548+
if (CXX_IS_HIPCC)
549+
set_source_files_properties(ggml-cuda.cu PROPERTIES LANGUAGE CXX)
550+
set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} hip::device)
551+
else()
552+
set_source_files_properties(ggml-cuda.cu PROPERTIES LANGUAGE HIP)
553+
endif()
539554

540555
if (LLAMA_STATIC)
541556
message(FATAL_ERROR "Static linking not supported for HIP/ROCm")
542557
endif()
543558

544-
set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} hip::device PUBLIC hip::host roc::rocblas roc::hipblas)
559+
set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} PUBLIC hip::host roc::rocblas roc::hipblas)
545560
endif()
546561

547562
if (LLAMA_SYCL)

Makefile

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -513,10 +513,10 @@ ifdef LLAMA_HIPBLAS
513513

514514
ifeq ($(wildcard /opt/rocm),)
515515
ROCM_PATH ?= /usr
516-
GPU_TARGETS ?= $(shell $(shell which amdgpu-arch))
516+
AMDGPU_TARGETS ?= $(shell $(shell which amdgpu-arch))
517517
else
518518
ROCM_PATH ?= /opt/rocm
519-
GPU_TARGETS ?= $(shell $(ROCM_PATH)/llvm/bin/amdgpu-arch)
519+
AMDGPU_TARGETS ?= $(shell $(ROCM_PATH)/llvm/bin/amdgpu-arch)
520520
endif
521521
HIPCC ?= $(CCACHE) $(ROCM_PATH)/bin/hipcc
522522
LLAMA_CUDA_DMMV_X ?= 32
@@ -528,7 +528,7 @@ ifdef LLAMA_HIP_UMA
528528
endif # LLAMA_HIP_UMA
529529
MK_LDFLAGS += -L$(ROCM_PATH)/lib -Wl,-rpath=$(ROCM_PATH)/lib
530530
MK_LDFLAGS += -lhipblas -lamdhip64 -lrocblas
531-
HIPFLAGS += $(addprefix --offload-arch=,$(GPU_TARGETS))
531+
HIPFLAGS += $(addprefix --offload-arch=,$(AMDGPU_TARGETS))
532532
HIPFLAGS += -DGGML_CUDA_DMMV_X=$(LLAMA_CUDA_DMMV_X)
533533
HIPFLAGS += -DGGML_CUDA_MMV_Y=$(LLAMA_CUDA_MMV_Y)
534534
HIPFLAGS += -DK_QUANTS_PER_ITERATION=$(LLAMA_CUDA_KQUANTS_ITER)

README.md

Lines changed: 16 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -490,13 +490,28 @@ Building the program with BLAS support may lead to some performance improvements
490490
```
491491
- Using `CMake` for Linux (assuming a gfx1030-compatible AMD GPU):
492492
```bash
493-
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
493+
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
494494
cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release \
495495
&& cmake --build build -- -j 16
496496
```
497497
On Linux it is also possible to use unified memory architecture (UMA) to share main memory between the CPU and integrated GPU by setting `-DLLAMA_HIP_UMA=ON"`.
498498
However, this hurts performance for non-integrated GPUs (but enables working with integrated GPUs).
499499
500+
Note that if you get the following error:
501+
```
502+
clang: error: cannot find ROCm device library; provide its path via '--rocm-path' or '--rocm-device-lib-path', or pass '-nogpulib' to build without ROCm device library
503+
```
504+
Try searching for a directory under `HIP_PATH` that contains the file
505+
`oclc_abi_version_400.bc`. Then, add the following to the start of the
506+
command: `HIP_DEVICE_LIB_PATH=<directory-you-just-found>`, so something
507+
like:
508+
```bash
509+
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -p)" \
510+
HIP_DEVICE_LIB_PATH=<directory-you-just-found> \
511+
cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release \
512+
&& cmake --build build -- -j 16
513+
```
514+
500515
- Using `make` (example for target gfx1030, build with 16 CPU threads):
501516
```bash
502517
make -j16 LLAMA_HIPBLAS=1 LLAMA_HIP_UMA=1 AMDGPU_TARGETS=gxf1030

0 commit comments

Comments
 (0)