[HLSL] Don't use CreateRuntimeFunction for intrinsics #145334

nikic · 2025-06-23T13:58:36Z

HLSL uses CreateRuntimeFunction for three intrinsics. This is a pretty unusual thing to do, and doesn't match what the rest of the file does.

I suspect this might be because these are convergent calls, but the intrinsics themselves are already marked convergent, so it's not necessary for clang to manually add the attribute.

This does lose the spir_func CC on the intrinsic declaration, but again, CC should not be relevant to intrinsics at all.

llvmbot · 2025-06-23T13:59:08Z

@llvm/pr-subscribers-hlsl

@llvm/pr-subscribers-clang-codegen

Author: Nikita Popov (nikic)

Changes

HLSL uses CreateRuntimeFunction for two intrinsics. This is a pretty unusual thing to do, and doesn't match what the rest of the file does.

I suspect this might be because these are convergent calls, but the intrinsics themselves are already marked convergent, so it's not necessary for clang to manually add the attribute.

This does lose the spir_func CC on the intrinsic declaration, but again, CC should not be relevant to intrinsics at all.

Full diff: https://github.com/llvm/llvm-project/pull/145334.diff

3 Files Affected:

(modified) clang/lib/CodeGen/CGHLSLBuiltins.cpp (+4-16)
(modified) clang/test/CodeGenHLSL/builtins/WaveActiveMax.hlsl (+3-3)
(modified) clang/test/CodeGenHLSL/builtins/WaveActiveSum.hlsl (+3-3)

diff --git a/clang/lib/CodeGen/CGHLSLBuiltins.cpp b/clang/lib/CodeGen/CGHLSLBuiltins.cpp
index 2a60a0909c93e..5074a915f2817 100644
--- a/clang/lib/CodeGen/CGHLSLBuiltins.cpp
+++ b/clang/lib/CodeGen/CGHLSLBuiltins.cpp
@@ -676,35 +676,23 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
   case Builtin::BI__builtin_hlsl_wave_active_sum: {
     // Due to the use of variadic arguments, explicitly retreive argument
     Value *OpExpr = EmitScalarExpr(E->getArg(0));
-    llvm::FunctionType *FT = llvm::FunctionType::get(
-        OpExpr->getType(), ArrayRef{OpExpr->getType()}, false);
     Intrinsic::ID IID = getWaveActiveSumIntrinsic(
         getTarget().getTriple().getArch(), CGM.getHLSLRuntime(),
         E->getArg(0)->getType());
 
-    // Get overloaded name
-    std::string Name =
-        Intrinsic::getName(IID, ArrayRef{OpExpr->getType()}, &CGM.getModule());
-    return EmitRuntimeCall(CGM.CreateRuntimeFunction(FT, Name, {},
-                                                     /*Local=*/false,
-                                                     /*AssumeConvergent=*/true),
+    return EmitRuntimeCall(Intrinsic::getOrInsertDeclaration(
+                               &CGM.getModule(), IID, {OpExpr->getType()}),
                            ArrayRef{OpExpr}, "hlsl.wave.active.sum");
   }
   case Builtin::BI__builtin_hlsl_wave_active_max: {
     // Due to the use of variadic arguments, explicitly retreive argument
     Value *OpExpr = EmitScalarExpr(E->getArg(0));
-    llvm::FunctionType *FT = llvm::FunctionType::get(
-        OpExpr->getType(), ArrayRef{OpExpr->getType()}, false);
     Intrinsic::ID IID = getWaveActiveMaxIntrinsic(
         getTarget().getTriple().getArch(), CGM.getHLSLRuntime(),
         E->getArg(0)->getType());
 
-    // Get overloaded name
-    std::string Name =
-        Intrinsic::getName(IID, ArrayRef{OpExpr->getType()}, &CGM.getModule());
-    return EmitRuntimeCall(CGM.CreateRuntimeFunction(FT, Name, {},
-                                                     /*Local=*/false,
-                                                     /*AssumeConvergent=*/true),
+    return EmitRuntimeCall(Intrinsic::getOrInsertDeclaration(
+                               &CGM.getModule(), IID, {OpExpr->getType()}),
                            ArrayRef{OpExpr}, "hlsl.wave.active.max");
   }
   case Builtin::BI__builtin_hlsl_wave_get_lane_index: {
diff --git a/clang/test/CodeGenHLSL/builtins/WaveActiveMax.hlsl b/clang/test/CodeGenHLSL/builtins/WaveActiveMax.hlsl
index 7891cfc1989af..be05a17cc3692 100644
--- a/clang/test/CodeGenHLSL/builtins/WaveActiveMax.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/WaveActiveMax.hlsl
@@ -16,7 +16,7 @@ int test_int(int expr) {
 }
 
 // CHECK-DXIL: declare [[TY]] @llvm.dx.wave.reduce.max.i32([[TY]]) #[[#attr:]]
-// CHECK-SPIRV: declare spir_func [[TY]] @llvm.spv.wave.reduce.max.i32([[TY]]) #[[#attr:]]
+// CHECK-SPIRV: declare [[TY]] @llvm.spv.wave.reduce.max.i32([[TY]]) #[[#attr:]]
 
 // CHECK-LABEL: test_uint64_t
 uint64_t test_uint64_t(uint64_t expr) {
@@ -27,7 +27,7 @@ uint64_t test_uint64_t(uint64_t expr) {
 }
 
 // CHECK-DXIL: declare [[TY]] @llvm.dx.wave.reduce.umax.i64([[TY]]) #[[#attr:]]
-// CHECK-SPIRV: declare spir_func [[TY]] @llvm.spv.wave.reduce.umax.i64([[TY]]) #[[#attr:]]
+// CHECK-SPIRV: declare [[TY]] @llvm.spv.wave.reduce.umax.i64([[TY]]) #[[#attr:]]
 
 // Test basic lowering to runtime function call with array and float value.
 
@@ -40,7 +40,7 @@ float4 test_floatv4(float4 expr) {
 }
 
 // CHECK-DXIL: declare [[TY1]] @llvm.dx.wave.reduce.max.v4f32([[TY1]]) #[[#attr]]
-// CHECK-SPIRV: declare spir_func [[TY1]] @llvm.spv.wave.reduce.max.v4f32([[TY1]]) #[[#attr]]
+// CHECK-SPIRV: declare [[TY1]] @llvm.spv.wave.reduce.max.v4f32([[TY1]]) #[[#attr]]
 
 // CHECK: attributes #[[#attr]] = {{{.*}} convergent {{.*}}}
 
diff --git a/clang/test/CodeGenHLSL/builtins/WaveActiveSum.hlsl b/clang/test/CodeGenHLSL/builtins/WaveActiveSum.hlsl
index 4bf423ccc1b82..1fc93c62c8db0 100644
--- a/clang/test/CodeGenHLSL/builtins/WaveActiveSum.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/WaveActiveSum.hlsl
@@ -16,7 +16,7 @@ int test_int(int expr) {
 }
 
 // CHECK-DXIL: declare [[TY]] @llvm.dx.wave.reduce.sum.i32([[TY]]) #[[#attr:]]
-// CHECK-SPIRV: declare spir_func [[TY]] @llvm.spv.wave.reduce.sum.i32([[TY]]) #[[#attr:]]
+// CHECK-SPIRV: declare [[TY]] @llvm.spv.wave.reduce.sum.i32([[TY]]) #[[#attr:]]
 
 // CHECK-LABEL: test_uint64_t
 uint64_t test_uint64_t(uint64_t expr) {
@@ -27,7 +27,7 @@ uint64_t test_uint64_t(uint64_t expr) {
 }
 
 // CHECK-DXIL: declare [[TY]] @llvm.dx.wave.reduce.usum.i64([[TY]]) #[[#attr:]]
-// CHECK-SPIRV: declare spir_func [[TY]] @llvm.spv.wave.reduce.sum.i64([[TY]]) #[[#attr:]]
+// CHECK-SPIRV: declare [[TY]] @llvm.spv.wave.reduce.sum.i64([[TY]]) #[[#attr:]]
 
 // Test basic lowering to runtime function call with array and float value.
 
@@ -40,6 +40,6 @@ float4 test_floatv4(float4 expr) {
 }
 
 // CHECK-DXIL: declare [[TY1]] @llvm.dx.wave.reduce.sum.v4f32([[TY1]]) #[[#attr]]
-// CHECK-SPIRV: declare spir_func [[TY1]] @llvm.spv.wave.reduce.sum.v4f32([[TY1]]) #[[#attr]]
+// CHECK-SPIRV: declare [[TY1]] @llvm.spv.wave.reduce.sum.v4f32([[TY1]]) #[[#attr]]
 
 // CHECK: attributes #[[#attr]] = {{{.*}} convergent {{.*}}}
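
The declarations checked above carry a type suffix (`.i32`, `.i64`, `.v4f32`) because these intrinsics are overloaded on the operand type. The removed code built that name by hand with `Intrinsic::getName`; `Intrinsic::getOrInsertDeclaration` derives it from the overload type list instead. As a rough illustration of the suffix scheme only, here is a toy model. It is not LLVM's implementation, and `ToyType`, `suffix`, and `overloadedName` are hypothetical names introduced for this sketch.

```cpp
#include <cassert>
#include <string>

// Toy description of an operand type (hypothetical, not an LLVM class):
// element kind, element bit width, and lane count (1 means scalar).
struct ToyType {
  bool IsFloat;
  unsigned Bits;
  unsigned Lanes;
};

// Build the type suffix seen in the CHECK lines: "i32", "i64", "v4f32".
std::string suffix(const ToyType &T) {
  assert(T.Lanes >= 1 && "a type has at least one lane");
  std::string S;
  if (T.Lanes > 1)
    S += "v" + std::to_string(T.Lanes); // vector prefix, e.g. "v4"
  S += (T.IsFloat ? "f" : "i") + std::to_string(T.Bits);
  return S;
}

// Append the suffix to the intrinsic's base name, e.g.
// "llvm.dx.wave.reduce.sum" + ".i32".
std::string overloadedName(const std::string &Base, const ToyType &T) {
  return Base + "." + suffix(T);
}
```

Under these assumptions, `overloadedName("llvm.spv.wave.reduce.max", ToyType{true, 32, 4})` yields the name that appears in the `test_floatv4` check. The point of the PR is that clang no longer needs to do this by hand: the intrinsic ID plus the overload types are enough.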

farzonl · 2025-06-23T14:20:22Z

clang/lib/CodeGen/CGHLSLBuiltins.cpp

-    return EmitRuntimeCall(CGM.CreateRuntimeFunction(FT, Name, {},
-                                                     /*Local=*/false,
-                                                     /*AssumeConvergent=*/true),
+    return EmitRuntimeCall(Intrinsic::getOrInsertDeclaration(


This is correct. @Keenuts used EmitRuntimeCall(Intrinsic::getOrInsertDeclaration(...)) in https://github.com/llvm/llvm-project/pull/143127/files. That's what we expect to use for these convergence intrinsics.

Looks like this was just an accident introduced in this PR: https://github.com/llvm/llvm-project/pull/118580/files#diff-202c36399fe94363f22692e534129341972137a24721c385b1f1f05fb239dd79R19523 and then the pattern may have been copied by others.

farzonl

LGTM

nikic requested review from bogner and farzonl June 23, 2025 13:58

llvmbot added the clang, clang:codegen, and HLSL labels Jun 23, 2025

nikic force-pushed the cghlsl-intrin branch from 4eb990b to 9ce18af Compare June 23, 2025 14:04

farzonl reviewed Jun 23, 2025

farzonl approved these changes Jun 23, 2025

Keenuts approved these changes Jun 23, 2025

s-perron approved these changes Jun 23, 2025

nikic merged commit 1128a4f into llvm:main Jun 23, 2025
7 of 8 checks passed

nikic deleted the cghlsl-intrin branch June 23, 2025 15:37
