[clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V #110695

Merged
merged 9 commits into llvm:main from AlexVlx:data_layout_with_int_sz on Nov 5, 2024

Conversation

AlexVlx
Contributor

@AlexVlx AlexVlx commented Oct 1, 2024

SPIR-V doesn't currently encode "native" integer bit-widths in its datalayout(s). This is problematic as it leads to optimisation passes, such as InstCombine, getting ideas and e.g. shrinking to non byte-multiple integer types, which is not desirable and can lead to breakage further down in the toolchain. This patch addresses that by encoding i8, i16, i32 and i64 as native types for vanilla SPIR-V (the spec natively supports them), and i32 and i64 for AMDGCNSPIRV (where the hardware targets are known). We also set the stack alignment on the latter, as it is overaligned (32-bit vs 8-bit).

As part of the update, we also delete a test that was fragile and depended on very specific CodeGenPrepare behaviour, around a capability (integer ops with overflow intrinsics) that already has coverage.
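
For readers less familiar with datalayout strings, the "n" component described above is what generic passes consult to learn which integer widths a target considers native. Below is a minimal sketch, not code from this PR, of how the two new layouts answer that query through the LLVM C++ API; the layout strings are abbreviated here for readability.

// Minimal sketch (not part of the patch): observing the effect of the new
// "n" specs via DataLayout::isLegalInteger(). Strings are abbreviated.
#include "llvm/IR/DataLayout.h"
#include <cassert>

void nativeWidthExample() {
  llvm::DataLayout Vanilla("e-i64:64-n8:16:32:64-G1");          // vanilla SPIR-V
  llvm::DataLayout AMDGCNSPIRV("e-i64:64-n32:64-S32-G1-P4-A0"); // AMDGCNSPIRV

  assert(Vanilla.isLegalInteger(16));      // i16 is declared native
  assert(!Vanilla.isLegalInteger(62));     // odd widths remain non-native
  assert(!AMDGCNSPIRV.isLegalInteger(16)); // hint: do not shrink to i16 here
}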

@llvmbot added the clang, clang:frontend, and backend:SPIR-V labels Oct 1, 2024
@llvmbot
Member

llvmbot commented Oct 1, 2024

@llvm/pr-subscribers-clang

Author: Alex Voicu (AlexVlx)

Changes

SPIR-V doesn't currently encode "native" integer bit-widths in its datalayout(s). This is problematic as it leads to optimisation passes, such as InstCombine, getting ideas and e.g. shrinking to non byte-multiple integer types, which is not desirable and can lead to breakage further down in the toolchain. This patch addresses that by encoding i8, i16, i32 and i64 as native types for vanilla SPIR-V (the spec natively supports them), and i32 and i64 for AMDGCNSPIRV (where the hardware targets are known). We also set the stack alignment on the latter, as it is overaligned (32-bit vs 8-bit).

As part of the update, we also delete a test that was fragile and depended on very specific CodeGenPrepare behaviour, around a capability (integer ops with overflow intrinsics) that already has coverage.


Full diff: https://github.com/llvm/llvm-project/pull/110695.diff

5 Files Affected:

  • (modified) clang/lib/Basic/Targets/SPIR.h (+8-8)
  • (modified) clang/test/CodeGen/target-data.c (+1-1)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn.cl (+1-1)
  • (modified) llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp (+6-6)
  • (removed) llvm/test/CodeGen/SPIRV/optimizations/add-check-overflow.ll (-56)
diff --git a/clang/lib/Basic/Targets/SPIR.h b/clang/lib/Basic/Targets/SPIR.h
index cc79562de2871e..09d4ad3c0ac620 100644
--- a/clang/lib/Basic/Targets/SPIR.h
+++ b/clang/lib/Basic/Targets/SPIR.h
@@ -314,8 +314,8 @@ class LLVM_LIBRARY_VISIBILITY SPIRVTargetInfo : public BaseSPIRVTargetInfo {
 
     // SPIR-V IDs are represented with a single 32-bit word.
     SizeType = TargetInfo::UnsignedInt;
-    resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-"
-                    "v96:128-v192:256-v256:256-v512:512-v1024:1024-G1");
+    resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-"
+                    "v256:256-v512:512-v1024:1024-n8:16:32:64-G1");
   }
 
   void getTargetDefines(const LangOptions &Opts,
@@ -338,8 +338,8 @@ class LLVM_LIBRARY_VISIBILITY SPIRV32TargetInfo : public BaseSPIRVTargetInfo {
     // SPIR-V has core support for atomic ops, and Int32 is always available;
     // we take the maximum because it's possible the Host supports wider types.
     MaxAtomicInlineWidth = std::max<unsigned char>(MaxAtomicInlineWidth, 32);
-    resetDataLayout("e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-"
-                    "v96:128-v192:256-v256:256-v512:512-v1024:1024-G1");
+    resetDataLayout("e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-"
+                    "v192:256-v256:256-v512:512-v1024:1024-n8:16:32:64-G1");
   }
 
   void getTargetDefines(const LangOptions &Opts,
@@ -362,8 +362,8 @@ class LLVM_LIBRARY_VISIBILITY SPIRV64TargetInfo : public BaseSPIRVTargetInfo {
     // SPIR-V has core support for atomic ops, and Int64 is always available;
     // we take the maximum because it's possible the Host supports wider types.
     MaxAtomicInlineWidth = std::max<unsigned char>(MaxAtomicInlineWidth, 64);
-    resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-"
-                    "v96:128-v192:256-v256:256-v512:512-v1024:1024-G1");
+    resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-"
+                    "v256:256-v512:512-v1024:1024-n8:16:32:64-G1");
   }
 
   void getTargetDefines(const LangOptions &Opts,
@@ -388,8 +388,8 @@ class LLVM_LIBRARY_VISIBILITY SPIRV64AMDGCNTargetInfo final
     PtrDiffType = IntPtrType = TargetInfo::SignedLong;
     AddrSpaceMap = &SPIRDefIsGenMap;
 
-    resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-"
-                    "v96:128-v192:256-v256:256-v512:512-v1024:1024-G1-P4-A0");
+    resetDataLayout("e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-"
+                    "v256:256-v512:512-v1024:1024-n32:64-S32-G1-P4-A0");
 
     BFloat16Width = BFloat16Align = 16;
     BFloat16Format = &llvm::APFloat::BFloat();
diff --git a/clang/test/CodeGen/target-data.c b/clang/test/CodeGen/target-data.c
index 8548aa00cfe877..fa875fe68b0c5b 100644
--- a/clang/test/CodeGen/target-data.c
+++ b/clang/test/CodeGen/target-data.c
@@ -271,4 +271,4 @@
 
 // RUN: %clang_cc1 -triple spirv64-amd-amdhsa -o - -emit-llvm %s | \
 // RUN: FileCheck %s -check-prefix=AMDGPUSPIRV64
-// AMDGPUSPIRV64: target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-G1-P4-A0"
+// AMDGPUSPIRV64: target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-n32:64-S32-G1-P4-A0"
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn.cl b/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
index bf5f2971cf118c..9132cc8a717e0f 100644
--- a/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn.cl
@@ -638,7 +638,7 @@ void test_get_workgroup_size(int d, global int *out)
 
 // CHECK-LABEL: @test_get_grid_size(
 // CHECK: {{.*}}call align 4 dereferenceable(64){{.*}} ptr addrspace(4) @llvm.amdgcn.dispatch.ptr()
-// CHECK: getelementptr inbounds i8, ptr addrspace(4) %{{.*}}, i64 %.sink
+// CHECK: getelementptr inbounds i8, ptr addrspace(4) %{{.*}}, i64 %{{.+}}
 // CHECK: load i32, ptr addrspace(4) %{{.*}}, align 4, !invariant.load
 void test_get_grid_size(int d, global int *out)
 {
diff --git a/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp b/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp
index e5384b2eb2c2c1..50c881a19cf58b 100644
--- a/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp
+++ b/llvm/lib/Target/SPIRV/SPIRVTargetMachine.cpp
@@ -54,14 +54,14 @@ static std::string computeDataLayout(const Triple &TT) {
   // memory model used for graphics: PhysicalStorageBuffer64. But it shouldn't
   // mean anything.
   if (Arch == Triple::spirv32)
-    return "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-"
-           "v96:128-v192:256-v256:256-v512:512-v1024:1024-G1";
+    return "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-"
+           "v256:256-v512:512-v1024:1024-n8:16:32:64-G1";
   if (TT.getVendor() == Triple::VendorType::AMD &&
       TT.getOS() == Triple::OSType::AMDHSA)
-    return "e-i64:64-v16:16-v24:32-v32:32-v48:64-"
-           "v96:128-v192:256-v256:256-v512:512-v1024:1024-G1-P4-A0";
-  return "e-i64:64-v16:16-v24:32-v32:32-v48:64-"
-         "v96:128-v192:256-v256:256-v512:512-v1024:1024-G1";
+    return "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-"
+           "v512:512-v1024:1024-n32:64-S32-G1-P4-A0";
+  return "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-"
+         "v512:512-v1024:1024-n8:16:32:64-G1";
 }
 
 static Reloc::Model getEffectiveRelocModel(std::optional<Reloc::Model> RM) {
diff --git a/llvm/test/CodeGen/SPIRV/optimizations/add-check-overflow.ll b/llvm/test/CodeGen/SPIRV/optimizations/add-check-overflow.ll
deleted file mode 100644
index 1a630f77a44c5d..00000000000000
--- a/llvm/test/CodeGen/SPIRV/optimizations/add-check-overflow.ll
+++ /dev/null
@@ -1,56 +0,0 @@
-; This test aims to check ability to support "Arithmetic with Overflow" intrinsics
-; in the special case when those intrinsics are being generated by the CodeGenPrepare;
-; pass during translations with optimization (note -O3 in llc arguments).
-
-; RUN: llc -O3 -mtriple=spirv32-unknown-unknown %s -o - | FileCheck %s
-; RUN: %if spirv-tools %{ llc -O3 -mtriple=spirv32-unknown-unknown %s -o - -filetype=obj | spirv-val %}
-
-; RUN: llc -O3 -mtriple=spirv64-unknown-unknown %s -o - | FileCheck %s
-; RUN: %if spirv-tools %{ llc -O3 -mtriple=spirv64-unknown-unknown %s -o - -filetype=obj | spirv-val %}
-
-; CHECK-DAG: OpName %[[Val:.*]] "math"
-; CHECK-DAG: OpName %[[IsOver:.*]] "ov"
-; CHECK-DAG: %[[Int:.*]] = OpTypeInt 32 0
-; CHECK-DAG: %[[Char:.*]] = OpTypeInt 8 0
-; CHECK-DAG: %[[PtrChar:.*]] = OpTypePointer Generic %[[Char]]
-; CHECK-DAG: %[[Bool:.*]] = OpTypeBool
-; CHECK-DAG: %[[Struct:.*]] = OpTypeStruct %[[Int]] %[[Int]]
-; CHECK-DAG: %[[Const1:.*]] = OpConstant %[[Int]] 1
-; CHECK-DAG: %[[Const42:.*]] = OpConstant %[[Char]] 42
-; CHECK-DAG: %[[Zero:.*]] = OpConstantNull %[[Int]]
-
-; CHECK: OpFunction
-; CHECK: %[[A:.*]] = OpFunctionParameter %[[Int]]
-; CHECK: %[[Ptr:.*]] = OpFunctionParameter %[[PtrChar]]
-; CHECK: %[[#]] = OpLabel
-; CHECK: OpBranch %[[#]]
-; CHECK: %[[#]] = OpLabel
-; CHECK: %[[PhiRes:.*]] = OpPhi %[[Int]] %[[A]] %[[#]] %[[Val]] %[[#]]
-; CHECK: %[[AggRes:.*]] = OpIAddCarry %[[Struct]] %[[PhiRes]] %[[Const1]]
-; CHECK: %[[Val]] = OpCompositeExtract %[[Int]] %[[AggRes]] 0
-; CHECK: %[[Over:.*]] = OpCompositeExtract %[[Int]] %[[AggRes]] 1
-; CHECK: %[[IsOver]] = OpINotEqual %[[Bool:.*]] %[[Over]] %[[Zero]]
-; CHECK: OpBranchConditional %[[IsOver]] %[[#]] %[[#]]
-; CHECK: OpStore %[[Ptr]] %[[Const42]] Aligned 1
-; CHECK: OpBranch %[[#]]
-; CHECK: %[[#]] = OpLabel
-; CHECK: OpReturnValue %[[Val]]
-; CHECK: OpFunctionEnd
-
-define spir_func i32 @foo(i32 %a, ptr addrspace(4) %p) {
-entry:
-  br label %l1
-
-body:
-  store i8 42, ptr addrspace(4) %p
-  br label %l1
-
-l1:
-  %e = phi i32 [ %a, %entry ], [ %i, %body ]
-  %i = add nsw i32 %e, 1
-  %fl = icmp eq i32 %i, 0
-  br i1 %fl, label %exit, label %body
-
-exit:
-  ret i32 %i
-}

@llvmbot
Member

llvmbot commented Oct 1, 2024

@llvm/pr-subscribers-backend-spir-v

@@ -1,56 +0,0 @@
; This test aims to check ability to support "Arithmetic with Overflow" intrinsics
Contributor

Not sure what the problem is with this test, but it's already covered by another?

Contributor Author

This relies on CodeGenPrepare::combineToUSubWithOverflow firing; however, once the native int sizes are encoded we can do loop strength reduction first, which is preferable, but it also means the overflow intrinsics no longer get inserted, so the test no longer exercises what it was written for. I think you mentioned elsewhere that it's risky to rely on specific optimisations firing, and since the correct lowering of the overflow intrinsics is already covered, the test seems redundant, hence the removal.

Contributor

This one is testing CodeGenPrepare as part of the normal codegen pipeline, so this one is fine. The other case was a full optimization pipeline + codegen, which is further removed.

Contributor Author

Right but it's relying on a non-guaranteed maybe-optimisation firing, as far as I can tell.

Contributor

@AlexVlx I'm strongly against deleting this test case.

Contributor

The main objection is that the code base should switch from one stable state to another without losing current coverage, stability, etc. When any of us adds a feature that alters translation behavior, it seems fair to expect the same contributor to be responsible for updating all impacted existing components. Applying this principle to the PR, I'd rather expect you to update the existing test case (if the change requires it) than remove it.

Member

@AlexVlx I am not sure I follow the reasoning here. The idea behind the various tests in /SPIRV/optimizations/ (including this one) is to ensure the backend can accept any IR, whether it comes from the standard -O3 pipeline or from some custom pipeline.

I do agree that we should not expect specific optimizations (or, even more so, sets of optimizations) to do specific things when writing a test for an independent component. Hence, after some thinking, I believe the test could be improved by already containing the specific pattern that the optimization would produce. However, by the same principle, we should not expect that the issue will be "optimized away" by another optimization.

I would also oppose removing existing test cases. Any improvements, strengthening, or necessary changes are (of course) desirable.

Contributor

@arsenm arsenm Oct 2, 2024

Right but it's relying on a non-guaranteed maybe-optimisation firing, as far as I can tell.

The point is to test that the optimization does work. The codegen pipeline is a bunch of intertwined IR passes on top of core codegen, and they need to cooperate.

Testing the actual behavior is also always important, regardless of whether the result is what you want it to be.

Contributor Author

@michalpaszkowski @VyacheslavLevytskyy I've restored the test and retained the apparent intention of triggering combining in CodeGenPrepare. I do think there's a bit of an impedance mismatch in the conversation, but that might be just me: on one hand, there's nothing special about the resulting pattern, and we already test for the correct generation of the intrinsics themselves. On the other, LSR does a better job here, so it's not quite a case of a problem going away, but rather of a more profitable optimisation becoming viable. Perhaps we could extend the test to cover this by, essentially, looking for the overflow intrinsics when LSR is off and checking their absence when it's on, although that might introduce some added brittleness.

Contributor

The CodeGenPrepare behavior is still backend code to be tested. You can also just run CodeGenPrepare as a standalone pass (such a test would usually have separate llc and opt RUN lines).

Contributor

@MrSidims MrSidims left a comment

Thanks, it should make LLVM IR after optimizations more translatable to SPIR-V! A few questions though:

  1. Usually (or at least AFAIK) optimization passes won't consider datalayout automatically, as LLVM defines datalayout not as a contract set by the frontend, but as a contract that the code generator expects. Do you plan to go over LLVM passes adding this check?
  2. Some existing and future extensions might allow extra bit widths for integers. For example, there is the SPV_INTEL_arbitrary_precision_integers extension, which allows any bit width for integers (it's actually a bad example, as it was developed only for the _BitInt C23 feature and FPGA hardware, so the datalayout wouldn't have an impact on it), or one of the extensions for ML being discussed internally within Khronos (which might be impacted by this change). Can we envision how we could change the datalayout information depending on the enabled extensions (or do you think it's not a big problem)?

@arsenm
Contributor

arsenm commented Oct 2, 2024

  1. Usually (or at least AFAIK) optimization passes won't consider datalayout automatically,

The datalayout is a widely used global constant. There's no option of "not considering it"

Do you plan to go over LLVM passes adding this check?

There's nothing new to do here. This has always existed

  1. Some existing and future extensions might allow extra bit widths for integers.

This does not mean arbitrary integer bitwidths do not work. The n field is weird, it's more of an optimization hint.
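
To make the "hint" reading concrete, the sketch below shows roughly the shape in which a narrowing transform can consume the "n" spec; it is illustrative only and not taken from any specific pass.

// Sketch only: with "n8:16:32:64" a 7-bit value maps to i8, with "n32:64" it
// maps to i32; a null result means no declared-native width is wide enough,
// which a transform can read as "don't bother narrowing".
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Type.h"

llvm::Type *pickNarrowedType(const llvm::DataLayout &DL,
                             llvm::LLVMContext &Ctx, unsigned NeededBits) {
  return DL.getSmallestLegalIntType(Ctx, NeededBits);
}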

@MrSidims
Contributor

MrSidims commented Oct 2, 2024

There's nothing new to do here. This has always existed

@arsenm here is a small experiment; I've compiled the following OpenCL code:

struct S {
    char i8_3[3];
};

kernel void test(global struct S *p, float3 v)
{
   int3 tmp;
   frexp(v, &tmp);
   tmp += 1;
   p->i8_3[0] = tmp.x;
   p->i8_3[1] = tmp.y;
   p->i8_3[2] = tmp.z;
}

with the PR pulled in (on top of LLVM's HEAD aadfba9), the compilation command is:
clang++ -cl-std=CL2.0 -emit-llvm -c -x cl -g0 --target=spir -Xclang -finclude-default-header -O2 test.cl
The output LLVM IR after the optimizations is:

; Function Attrs: convergent norecurse nounwind
define dso_local spir_kernel void @test(ptr addrspace(1) nocapture noundef writeonly align 1 %p, <3 x float> noundef %v) local_unnamed_addr #0 !kernel_arg_a>
entry:
  %tmp = alloca <3 x i32>, align 16
  call void @llvm.lifetime.start.p0(i64 16, ptr nonnull %tmp) #3
  %tmp.ascast = addrspacecast ptr %tmp to ptr addrspace(4)
  %call = call spir_func <3 x float> @_Z5frexpDv3_fPU3AS4Dv3_i(<3 x float> noundef %v, ptr addrspace(4) noundef %tmp.ascast) #4
  %loadVec42 = load <4 x i32>, ptr %tmp, align 16
  %extractVec4 = add <4 x i32> %loadVec42, <i32 1, i32 1, i32 1, i32 1>
  %0 = bitcast <4 x i32> %extractVec4 to i128
  %1 = trunc i128 %0 to i96
  %2 = bitcast i96 %1 to <12 x i8>
  %conv = trunc i128 %0 to i8
  store i8 %conv, ptr addrspace(1) %p, align 1, !tbaa !9
  %conv5 = extractelement <12 x i8> %2, i64 4
  %arrayidx7 = getelementptr inbounds i8, ptr addrspace(1) %p, i32 1
  store i8 %conv5, ptr addrspace(1) %arrayidx7, align 1, !tbaa !9
  %conv8 = extractelement <12 x i8> %2, i64 8
  %arrayidx10 = getelementptr inbounds i8, ptr addrspace(1) %p, i32 2
  store i8 %conv8, ptr addrspace(1) %arrayidx10, align 1, !tbaa !9
  call void @llvm.lifetime.end.p0(i64 16, ptr nonnull %tmp) #3
  ret void
}

Note the bitcast to i128 followed by a truncation to i96 - those types aren't part of the datalayout, yet some optimization generated them. So something has to be done about it, and changing the datalayout is not enough.

This does not mean arbitrary integer bitwidths do not work. The n field is weird, it's more of an optimization hint.

Let me clarify: _BitInt(N) will work with the change, I have no doubts. But I can imagine a SPIR-V extension appearing that would add support for 4-bit integers. And I can imagine that we would want to not only be able to emit 4-bit integers in the frontend, but also allow optimization passes to emit them. For this it would be nice to have a mechanism that would change the datalayout depending on --spirv-ext (or another option).
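
Purely to illustrate the kind of mechanism being asked about (nothing like this exists in the PR; the hook, its parameter, and the extension name are invented):

// Hypothetical sketch, shaped after computeDataLayout() in
// SPIRVTargetMachine.cpp; the Exts parameter and the extension name are
// invented for illustration and are not part of this PR.
#include "llvm/ADT/StringSet.h"
#include <string>

static std::string computeNativeIntSpec(const llvm::StringSet<> &Exts) {
  std::string N = "n8:16:32:64";          // default after this PR
  if (Exts.contains("SPV_EXAMPLE_int4"))  // invented extension name
    N = "n4:" + N.substr(1);              // would also declare 4-bit as native
  return N;
}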

@arsenm
Contributor

arsenm commented Oct 2, 2024

with the PR pulled in (on top of LLVM's HEAD aadfba9), the compilation command is: clang++ -cl-std=CL2.0 -emit-llvm -c -x cl -g0 --target=spir -Xclang -finclude-default-header -O2 test.cl The output LLVM IR after the optimizations is:

You want spirv, not spir

Note the bitcast to i128 followed by a truncation to i96 - those types aren't part of the datalayout, yet some optimization generated them. So something has to be done about it, and changing the datalayout is not enough.

Any pass is allowed to introduce any IR type. This field is a pure optimization hint. It is not required to do anything, and places no restrictions on any pass

This does not mean arbitrary integer bitwidths do not work. The n field is weird, it's more of an optimization hint.

And I can imagine that we would want to not only be able to emit 4-bit integers in the frontend, but also allow optimization passes to emit them.

Just because there's an extension doesn't mean it's desirable to use them. On real targets, they'll end up codegenning in wider types anyway
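
To illustrate the point above that the "n" field places no restrictions on passes, a minimal sketch (not from this PR):

// Sketch: a pass may still materialise widths absent from the "n" spec; the
// resulting IR is valid on any target, the spec only marks preferred widths.
#include "llvm/IR/IRBuilder.h"

llvm::Value *truncTo96(llvm::IRBuilder<> &B, llvm::Value *V) {
  return B.CreateTrunc(V, B.getIntNTy(96)); // i96 is legal IR regardless of "n"
}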

@MrSidims
Contributor

MrSidims commented Oct 2, 2024

You want spirv, not spir

Thanks! Yet the result is the same (for spirv64; compilation for the plain spirv target just crashes).

Do you plan to go over LLVM passes adding this check?

So I guess the answer to my question would be: "no" :)

@AlexVlx
Contributor Author

AlexVlx commented Oct 2, 2024

Let me clarify: _BitInt(N) will work with the change, I have no doubts. But I can imagine a SPIR-V extension appearing that would add support for 4-bit integers. And I can imagine that we would want to not only be able to emit 4-bit integers in the frontend, but also allow optimization passes to emit them. For this it would be nice to have a mechanism that would change the datalayout depending on --spirv-ext (or another option).

IMHO, whilst it is mechanically possible, we should not make the DataLayout mutable subject to things like the presence or absence of an extension; that seems like a recipe for subtle pain. It seems preferable (to me) to just have another triple for targets where arbitrary (and possibly weird) bitwidths are native, i.e. desirable / optimal.

I will note that 4-bit ints are still somewhat natural (we've had nibbles, after all). What motivated this change was the weirder types that arise without conveying native int widths (i7 or i62, for example), which are quite unnatural and poorly handled (last time I checked, the Translator didn't like them at all), and which might never be optimal native widths, even on exotic FPGAs.

As regards the example itself, unfortunately (fortunately?), that's an actual bug which is relatively orthogonal to this particular PR; it fails today with vanilla upstream, please see: https://godbolt.org/z/7aeP797PP. The wide integer types come from GVN, as part of its load handling; see VNCoercion::getStoreValueForLoadHelper. At a glance, it seems like an issue around handling vec3s, which are odd, but the BE should probably handle this gracefully rather than erroring out, since it's not a case of doing arithmetic on wide ints, just ephemeral manipulation of bits - with emphasis on "at a glance".

@MrSidims
Contributor

MrSidims commented Oct 7, 2024

with vanilla upstream, please see

You mean the translator, right? I don't think the SPIR-V backend should follow its practices, especially when we could do better.

At a glance, it seems like an issue around handling vec3s, which are odd, but the BE should probably handle this gracefully rather than erroring out

So we both agree that the compiler must compile the OpenCL code above to SPIR-V without erroring out. We can do it in two ways:

  1. Regularize types in the backend (since the backend is based on top of GlobalISel, we should have better luck here than in the translator);
  2. Or, since this patch modifies the datalayout, adjust the LLVM pipeline to consider the datalayout in optimization passes. The reasoning you have provided in the PR description - "This is problematic as it leads to optimisation passes, such as InstCombine, getting ideas and e.g. shrinking to non byte-multiple integer types, which is not desirable and can lead to breakage further down in the toolchain." - totally makes sense to me. So I'm asking whether you have plans to go through the passes and modify them, or whether you only intend to modify InstCombine and/or AMD-specific passes?

@AlexVlx
Contributor Author

AlexVlx commented Oct 7, 2024

So we both agree that the compiler must compile the OpenCL code above to SPIR-V without erroring out. We can do it in two ways:

  1. Regularize types in the backend (since the backend is based on top of GlobalISel, we should have better luck here than in the translator);
  2. Or, since this patch modifies the datalayout, adjust the LLVM pipeline to consider the datalayout in optimization passes. The reasoning you have provided in the PR description - "This is problematic as it leads to optimisation passes, such as InstCombine, getting ideas and e.g. shrinking to non byte-multiple integer types, which is not desirable and can lead to breakage further down in the toolchain." - totally makes sense to me. So I'm asking whether you have plans to go through the passes and modify them, or whether you only intend to modify InstCombine and/or AMD-specific passes?

Well, the issue the example illustrates is pre-existing and not influenced / triggered by this patch; stepping back, if the question is "are you going to try / would you like to fix this other issue", the answer is yes, but sadly I don't have an immediate intuition as to the right solution (1 is approximately what we do on the AMDGPU side, where we e.g. handle accesses to i128 as the corresponding vec4 access).

@MrSidims
Contributor

MrSidims commented Oct 7, 2024

Don't get me wrong, what I'm saying is not an objection against the patch, but rather an attempt to test the waters and gather what plans/ideas you have :)

I can say that in https://github.com/intel/llvm/ we do have some customization that prevents certain optimizations for the SPIR target, and I was wondering whether that is a 'community standard' and whether we can align on how InstCombine and other passes should behave when we compile to SPIR-V (if they should at all; if I read the comments above correctly, Matt has a different opinion).

@arsenm
Contributor

arsenm commented Oct 10, 2024

I can say that in https://github.com/intel/llvm/ we do have some customization that prevents certain optimizations for the SPIR target, and I was wondering whether that is a 'community standard' and whether we can align on how InstCombine and other passes should behave when we compile to SPIR-V (if they should at all; if I read the comments above correctly, Matt has a different opinion).

InstCombine is primarily a canonicalization pass. You shouldn't be modifying it for SPIRV-specific optimizations (with the exception of SPIRV intrinsic support). SPIRV-specific transforms belong in later backend IR passes.

@MrSidims
Contributor

InstCombine is primarily a canonicalization pass. You shouldn't be modifying it for SPIRV-specific optimizations (with the exception of SPIRV intrinsic support). SPIRV-specific transforms belong in later backend IR passes.

Does it mean that the reasoning behind this very PR is not legit?

This is problematic as it leads to optimisation passes, such as InstCombine, getting ideas and e.g. shrinking to non byte-multiple integer types, which is not desirable and can lead to breakage further down in the toolchain.

@AlexVlx
Contributor Author

AlexVlx commented Oct 11, 2024

InstCombine is primarily a canonicalization pass. You shouldn't be modifying it for SPIRV-specific optimizations (with the exception of SPIRV intrinsic support). SPIRV-specific transforms belong in later backend IR passes.

Does it mean that the reasoning behind this very PR is not legit?

This is problematic as it leads to optimisation passes, such as InstCombine, getting ideas and e.g. shrinking to non byte-multiple integer types, which is not desirable and can lead to breakage further down in the toolchain.

This is not a SPIR-V specific optimisation being added to InstCombine. It addresses a SPIR-V issue (lack of information) on the SPIR-V side, not in InstCombine.

@arsenm
Contributor

arsenm commented Oct 11, 2024

Does it mean that the reasoning behind this very PR is not legit?

No. This is providing, in the datalayout, the generic property that InstCombine and others use as a hint of what to do without directly knowing what the target is.

Contributor

@VyacheslavLevytskyy VyacheslavLevytskyy left a comment

Thanks, sorry for the delay

@AlexVlx AlexVlx merged commit 2c13dec into llvm:main Nov 5, 2024
9 checks passed
@AlexVlx AlexVlx deleted the data_layout_with_int_sz branch November 5, 2024 15:26
PhilippRados pushed a commit to PhilippRados/llvm-project that referenced this pull request Nov 6, 2024
…R-V (llvm#110695)

SPIR-V doesn't currently encode "native" integer bit-widths in its
datalayout(s). This is problematic as it leads to optimisation passes,
such as InstCombine, getting ideas and e.g. shrinking to non
byte-multiple integer types, which is not desirable and can lead to
breakage further down in the toolchain. This patch addresses that by
encoding `i8`, `i16`, `i32` and `i64` as native types for vanilla SPIR-V
(the spec natively supports them), and `i32` and `i64` for AMDGCNSPIRV
(where the hardware targets are known). We also set the stack alignment
on the latter, as it is overaligned (32-bit vs 8-bit).
searlmc1 added a commit to ROCm/llvm-project that referenced this pull request Dec 5, 2024
Adds the following patches
AMDGPU: Remove wavefrontsize64 feature from dummy target llvm#117410
[LLVM][NFC] Use used's element type if available llvm#116804
[llvm][AMDGPU] Fold llvm.amdgcn.wavefrontsize early llvm#114481
[clang][Driver][HIP] Add support for mixing AMDGCNSPIRV & concrete offload-archs. llvm#113509
[clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V llvm#110695
[llvm][opt][Transforms] Replacement calloc should match replaced malloc llvm#110524
[clang][HIP] Don't use the OpenCLKernel CC when targeting AMDGCNSPIRV llvm#110447
[cuda][HIP] constant should imply constant llvm#110182
[llvm][SPIRV] Expose fast popcnt support for SPIR-V targets llvm#109845
[clang][CodeGen][SPIR-V] Fix incorrect SYCL usage, implement missing interface llvm#109415
[SPIRV][RFC] Rework / extend support for memory scopes llvm#106429
[clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. llvm#102776

Change-Id: I2b9ab54aba1c9345b9b0eb84409e6ed6c3cdb6cd