[DirectX] Clean up extra vectors when lowering to buffer store #116721

bogner · 2024-11-19T01:26:45Z

DXILOpLowering runs after scalarization but @llvm.dx.typedbuffer.store takes a vector, so the argument is usually an artifact. Avoid creating a vector just to extract elements from it immediately.

DXILOpLowering runs after scalarization but `@llvm.dx.typedbuffer.store` takes a vector, so the argument is usually an artifact. Avoid creating a vector just to extract elements from it immediately.

llvmbot · 2024-11-19T01:27:21Z

@llvm/pr-subscribers-backend-directx

Author: Justin Bogner (bogner)

Changes

DXILOpLowering runs after scalarization but @llvm.dx.typedbuffer.store takes a vector, so the argument is usually an artifact. Avoid creating a vector just to extract elements from it immediately.

Full diff: https://github.com/llvm/llvm-project/pull/116721.diff

2 Files Affected:

(modified) llvm/lib/Target/DirectX/DXILOpLowering.cpp (+34-11)
(modified) llvm/test/CodeGen/DirectX/BufferStore.ll (+24)

diff --git a/llvm/lib/Target/DirectX/DXILOpLowering.cpp b/llvm/lib/Target/DirectX/DXILOpLowering.cpp
index 9f124394363a38..00daa87a53aa13 100644
--- a/llvm/lib/Target/DirectX/DXILOpLowering.cpp
+++ b/llvm/lib/Target/DirectX/DXILOpLowering.cpp
@@ -531,23 +531,46 @@ class OpLowerer {
         return make_error<StringError>(
             "typedBufferStore data must be a vector of 4 elements",
             inconvertibleErrorCode());
-      Value *Data0 =
-          IRB.CreateExtractElement(Data, ConstantInt::get(Int32Ty, 0));
-      Value *Data1 =
-          IRB.CreateExtractElement(Data, ConstantInt::get(Int32Ty, 1));
-      Value *Data2 =
-          IRB.CreateExtractElement(Data, ConstantInt::get(Int32Ty, 2));
-      Value *Data3 =
-          IRB.CreateExtractElement(Data, ConstantInt::get(Int32Ty, 3));
-
-      std::array<Value *, 8> Args{Handle, Index0, Index1, Data0,
-                                  Data1,  Data2,  Data3,  Mask};
+
+      // Since we're post-scalarizer, we likely have a vector that's constructed
+      // solely for the argument of the store.
+      std::array<Value *, 4> DataElements{nullptr, nullptr, nullptr, nullptr};
+      auto *IEI = dyn_cast<InsertElementInst>(Data);
+      while (IEI) {
+        auto *IndexOp = dyn_cast<ConstantInt>(IEI->getOperand(2));
+        if (!IndexOp)
+          break;
+        size_t IndexVal = IndexOp->getZExtValue();
+        assert(IndexVal < 4 && "Too many elements for buffer store");
+        DataElements[IndexVal] = IEI->getOperand(1);
+        IEI = dyn_cast<InsertElementInst>(IEI->getOperand(0));
+      }
+
+      // If for some reason we weren't able to forward the arguments from the
+      // scalarizer artifact, then we need to actually extract elements from the
+      // vector.
+      for (int I = 0, E = 4; I != E; ++I)
+        if (DataElements[I] == nullptr)
+          DataElements[I] =
+              IRB.CreateExtractElement(Data, ConstantInt::get(Int32Ty, I));
+
+      std::array<Value *, 8> Args{
+          Handle,          Index0,          Index1,          DataElements[0],
+          DataElements[1], DataElements[2], DataElements[3], Mask};
       Expected<CallInst *> OpCall =
           OpBuilder.tryCreateOp(OpCode::BufferStore, Args, CI->getName());
       if (Error E = OpCall.takeError())
         return E;
 
       CI->eraseFromParent();
+      // Clean up any leftover `insertelement`s
+      IEI = dyn_cast<InsertElementInst>(Data);
+      while (IEI && IEI->use_empty()) {
+        InsertElementInst *Tmp = IEI;
+        IEI = dyn_cast<InsertElementInst>(IEI->getOperand(0));
+        Tmp->eraseFromParent();
+      }
+
       return Error::success();
     });
   }
diff --git a/llvm/test/CodeGen/DirectX/BufferStore.ll b/llvm/test/CodeGen/DirectX/BufferStore.ll
index 9ea7735be59c81..81cc5fd328e0a7 100644
--- a/llvm/test/CodeGen/DirectX/BufferStore.ll
+++ b/llvm/test/CodeGen/DirectX/BufferStore.ll
@@ -90,3 +90,27 @@ define void @storei16(<4 x i16> %data, i32 %index) {
 
   ret void
 }
+
+define void @store_scalarized_floats(float %data0, float %data1, float %data2, float %data3, i32 %index) {
+
+  ; CHECK: [[BIND:%.*]] = call %dx.types.Handle @dx.op.createHandleFromBinding(i32 217,
+  ; CHECK: [[HANDLE:%.*]] = call %dx.types.Handle @dx.op.annotateHandle(i32 216, %dx.types.Handle [[BIND]]
+  %buffer = call target("dx.TypedBuffer", <4 x float>, 1, 0, 0)
+      @llvm.dx.handle.fromBinding.tdx.TypedBuffer_v4f32_1_0_0(
+          i32 0, i32 0, i32 1, i32 0, i1 false)
+
+  ; We shouldn't end up with any inserts/extracts.
+  ; CHECK-NOT: insertelement
+  ; CHECK-NOT: extractelement
+
+  ; CHECK: call void @dx.op.bufferStore.f32(i32 69, %dx.types.Handle [[HANDLE]], i32 %index, i32 undef, float %data0, float %data1, float %data2, float %data3, i8 15)
+  %vec.upto0 = insertelement <4 x float> poison, float %data0, i64 0
+  %vec.upto1 = insertelement <4 x float> %vec.upto0, float %data1, i64 1
+  %vec.upto2 = insertelement <4 x float> %vec.upto1, float %data2, i64 2
+  %vec = insertelement <4 x float> %vec.upto2, float %data3, i64 3
+  call void @llvm.dx.typedBufferStore(
+      target("dx.TypedBuffer", <4 x float>, 1, 0, 0) %buffer,
+      i32 %index, <4 x float> %vec)
+
+  ret void
+}

damyanp

Looks Good To My Untrained Eye

damyanp · 2024-11-19T01:41:37Z

llvm/lib/Target/DirectX/DXILOpLowering.cpp

+      // Since we're post-scalarizer, we likely have a vector that's constructed
+      // solely for the argument of the store.


FWIW, I feel that there's a "...so we can (do whatever this loop does)" missing from this comment, but that might just be my lack of familiarity with this sort of code speaking.

Added "If so, just use the scalar values from before they're inserted into the temporary." to the comment.

damyanp · 2024-11-19T01:43:51Z

llvm/lib/Target/DirectX/DXILOpLowering.cpp

+        if (!IndexOp)
+          break;
+        size_t IndexVal = IndexOp->getZExtValue();
+        assert(IndexVal < 4 && "Too many elements for buffer store");


If we start support > 4 element vectors, are asserts like this going to bite us (ie hard to debug errors that don't show up in production builds?)

I don't think so, this is fairly likely to crash in a non-asserts build if it comes up anyways, and this is a fairly obvious place we'd need to update if we were to make such a change. Note that we'd have to expand into multiple calls to dx.op.BufferStore here if we were to support larger vectors in HLSL's typed buffers, so this isn't just some implementation detail we'd be likely to forget about.

[DirectX] Clean up extra vectors when lowering to buffer store

ec3c4f4

DXILOpLowering runs after scalarization but `@llvm.dx.typedbuffer.store` takes a vector, so the argument is usually an artifact. Avoid creating a vector just to extract elements from it immediately.

llvmbot added the backend:DirectX label Nov 19, 2024

damyanp reviewed Nov 19, 2024

View reviewed changes

Improve comment

3ad9863

llvm-beanz approved these changes Dec 2, 2024

View reviewed changes

bogner merged commit 2c88ac9 into llvm:main Dec 2, 2024
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[DirectX] Clean up extra vectors when lowering to buffer store #116721

[DirectX] Clean up extra vectors when lowering to buffer store #116721

Uh oh!

bogner commented Nov 19, 2024

Uh oh!

llvmbot commented Nov 19, 2024

Uh oh!

damyanp left a comment

Uh oh!

damyanp Nov 19, 2024 •

edited

Loading

Uh oh!

bogner Nov 20, 2024

Uh oh!

damyanp Nov 19, 2024

Uh oh!

bogner Nov 20, 2024 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

		// Since we're post-scalarizer, we likely have a vector that's constructed
		// solely for the argument of the store.

[DirectX] Clean up extra vectors when lowering to buffer store #116721

[DirectX] Clean up extra vectors when lowering to buffer store #116721

Uh oh!

Conversation

bogner commented Nov 19, 2024

Uh oh!

llvmbot commented Nov 19, 2024

Uh oh!

damyanp left a comment

Choose a reason for hiding this comment

Uh oh!

damyanp Nov 19, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bogner Nov 20, 2024

Choose a reason for hiding this comment

Uh oh!

damyanp Nov 19, 2024

Choose a reason for hiding this comment

Uh oh!

bogner Nov 20, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

damyanp Nov 19, 2024 •

edited

Loading

bogner Nov 20, 2024 •

edited

Loading