Commit 4ab41e9

fxzjshm authored and mglambda committed
HIP: force max threads per block to be 1024 (ggml-org#11621)
Some old or vendor-forked versions of LLVM still default to 256 threads per block. Explicitly set the limit to 1024 to align with upstream LLVM.

Signed-off-by: fxzjshm <[email protected]>
1 parent f7c065f commit 4ab41e9

1 file changed: +3 -0 lines changed


ggml/src/ggml-hip/CMakeLists.txt

Lines changed: 3 additions & 0 deletions
@@ -46,6 +46,9 @@ endif()
 
 message(STATUS "HIP and hipBLAS found")
 
+# Workaround old compilers
+set(CMAKE_HIP_FLAGS "${CMAKE_HIP_FLAGS} --gpu-max-threads-per-block=1024")
+
 file(GLOB GGML_HEADERS_ROCM "../ggml-cuda/*.cuh")
 list(APPEND GGML_HEADERS_ROCM "../../include/ggml-cuda.h")
 
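
The change above appends the flag unconditionally. As a hedged sketch only (not part of this commit), the same workaround could be guarded so that a value already present in `CMAKE_HIP_FLAGS`, for example one passed on the command line, is not followed by a second one. The guard is an assumption made for illustration; whether a repeated flag would cause any harm depends on how the compiler resolves duplicate options.

```cmake
# Sketch, assuming we only want the 1024 default when no value was given.
# The flag name and CMAKE_HIP_FLAGS come from the commit; the guard is hypothetical.
if (NOT CMAKE_HIP_FLAGS MATCHES "--gpu-max-threads-per-block=")
    # Old or vendor-forked compilers may default to 256; align with upstream (1024).
    set(CMAKE_HIP_FLAGS "${CMAKE_HIP_FLAGS} --gpu-max-threads-per-block=1024")
endif()
```

With this variant, configuring with `-DCMAKE_HIP_FLAGS="--gpu-max-threads-per-block=512"` would leave the user's value as the only one on the compile line.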
