Revert "[GlobalISel] prevent G_UNMERGE_VALUES for vectors with different elements" #144650

Merged
ro-i merged 1 commit into main from revert-133335-unmerge-values-vector on Jun 18, 2025

ro-i
Contributor

@ro-i ro-i commented Jun 18, 2025

Reverts #133335

@ro-i ro-i merged commit a38932a into main Jun 18, 2025
8 of 10 checks passed
@ro-i ro-i deleted the revert-133335-unmerge-values-vector branch June 18, 2025 07:49
@llvmbot
Member

llvmbot commented Jun 18, 2025

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-amdgpu

Author: Robert Imschweiler (ro-i)

Changes

Reverts llvm/llvm-project#133335


Full diff: https://github.com/llvm/llvm-project/pull/144650.diff

2 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h (+1-4)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll (-55)
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h b/llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h
index 8f560c42082f9..22f6a5fde546a 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/LegalizationArtifactCombiner.h
@@ -997,7 +997,6 @@ class LegalizationArtifactCombiner {
 
       // Recognize UnmergeSrc that can be unmerged to DstTy directly.
       // Types have to be either both vector or both non-vector types.
-      // In case of vector types, the scalar elements need to match.
       // Merge-like opcodes are combined one at the time. First one creates new
       // unmerge, following should use the same unmerge (builder performs CSE).
       //
@@ -1006,9 +1005,7 @@ class LegalizationArtifactCombiner {
       // %AnotherDst:_(DstTy) = G_merge_like_opcode %2:_(EltTy), %3
       //
       // %Dst:_(DstTy), %AnotherDst = G_UNMERGE_VALUES %UnmergeSrc
-      if (((!DstTy.isVector() && !UnmergeSrcTy.isVector()) ||
-           (DstTy.isVector() && UnmergeSrcTy.isVector() &&
-            DstTy.getScalarType() == UnmergeSrcTy.getScalarType())) &&
+      if ((DstTy.isVector() == UnmergeSrcTy.isVector()) &&
           (Elt0UnmergeIdx % NumMIElts == 0) &&
           getCoverTy(UnmergeSrcTy, DstTy) == UnmergeSrcTy) {
         if (!isSequenceFromUnmerge(MI, 0, Unmerge, Elt0UnmergeIdx, NumMIElts,
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll
index 132a89478c5fd..8134eb3ca2afc 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.ll
@@ -6506,58 +6506,3 @@ entry:
   %insert = insertelement <5 x double> %vec, double %val, i32 %idx
   ret <5 x double> %insert
 }
-
-; Found by fuzzer, reduced with llvm-reduce.
-define amdgpu_kernel void @insert_very_small_from_very_large(<32 x i16> %L3, ptr %ptr) {
-; GPRIDX-LABEL: insert_very_small_from_very_large:
-; GPRIDX:       ; %bb.0: ; %bb
-; GPRIDX-NEXT:    s_load_dwordx16 s[12:27], s[8:9], 0x0
-; GPRIDX-NEXT:    s_load_dwordx2 s[0:1], s[8:9], 0x40
-; GPRIDX-NEXT:    s_waitcnt lgkmcnt(0)
-; GPRIDX-NEXT:    s_lshr_b32 s2, s12, 1
-; GPRIDX-NEXT:    s_and_b32 s2, s2, 1
-; GPRIDX-NEXT:    s_lshl_b32 s2, s2, 1
-; GPRIDX-NEXT:    v_mov_b32_e32 v0, s0
-; GPRIDX-NEXT:    v_mov_b32_e32 v2, s2
-; GPRIDX-NEXT:    v_mov_b32_e32 v1, s1
-; GPRIDX-NEXT:    flat_store_byte v[0:1], v2
-; GPRIDX-NEXT:    s_endpgm
-;
-; GFX10-LABEL: insert_very_small_from_very_large:
-; GFX10:       ; %bb.0: ; %bb
-; GFX10-NEXT:    s_clause 0x1
-; GFX10-NEXT:    s_load_dwordx16 s[12:27], s[8:9], 0x0
-; GFX10-NEXT:    s_load_dwordx2 s[0:1], s[8:9], 0x40
-; GFX10-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX10-NEXT:    s_lshr_b32 s2, s12, 1
-; GFX10-NEXT:    v_mov_b32_e32 v0, s0
-; GFX10-NEXT:    s_and_b32 s2, s2, 1
-; GFX10-NEXT:    v_mov_b32_e32 v1, s1
-; GFX10-NEXT:    s_lshl_b32 s2, s2, 1
-; GFX10-NEXT:    v_mov_b32_e32 v2, s2
-; GFX10-NEXT:    flat_store_byte v[0:1], v2
-; GFX10-NEXT:    s_endpgm
-;
-; GFX11-LABEL: insert_very_small_from_very_large:
-; GFX11:       ; %bb.0: ; %bb
-; GFX11-NEXT:    s_clause 0x1
-; GFX11-NEXT:    s_load_b512 s[8:23], s[4:5], 0x0
-; GFX11-NEXT:    s_load_b64 s[0:1], s[4:5], 0x40
-; GFX11-NEXT:    s_waitcnt lgkmcnt(0)
-; GFX11-NEXT:    s_lshr_b32 s2, s8, 1
-; GFX11-NEXT:    v_mov_b32_e32 v0, s0
-; GFX11-NEXT:    s_and_b32 s2, s2, 1
-; GFX11-NEXT:    v_mov_b32_e32 v1, s1
-; GFX11-NEXT:    s_lshl_b32 s2, s2, 1
-; GFX11-NEXT:    v_mov_b32_e32 v2, s2
-; GFX11-NEXT:    flat_store_b8 v[0:1], v2
-; GFX11-NEXT:    s_endpgm
-bb:
-  %a = bitcast <32 x i16> %L3 to i512
-  %b = trunc i512 %a to i8
-  %c = trunc i8 %b to i2
-  %d = bitcast i2 %c to <2 x i1>
-  %insert = insertelement <2 x i1> %d, i1 false, i32 0
-  store <2 x i1> %insert, ptr %ptr, align 1
-  ret void
-}
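
For readers comparing the two conditions in the `LegalizationArtifactCombiner.h` hunk above, here is a minimal, self-contained sketch of how the type check differs before and after this revert. It does not use the real `llvm::LLT` class: `ToyTy`, `kindsMatch`, and `kindsAndScalarsMatch` are hypothetical names invented for illustration, and comparing element widths is only an approximation of comparing `getScalarType()` results. The other two conjuncts of the real condition (`Elt0UnmergeIdx % NumMIElts == 0` and the `getCoverTy` check) are left out.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::LLT. Only the two properties that the
// type check in the hunk inspects are modeled (assumption: the scalar type
// is approximated by its bit width).
struct ToyTy {
  bool IsVector;
  unsigned ScalarBits; // element width for vectors, total width for scalars
};

// Type check restored by this revert: both types are vectors, or both are
// non-vectors. Element types are not compared.
bool kindsMatch(const ToyTy &Dst, const ToyTy &Src) {
  return Dst.IsVector == Src.IsVector;
}

// Type check introduced by #133335 and removed here: for vectors, the
// scalar element types additionally have to match.
bool kindsAndScalarsMatch(const ToyTy &Dst, const ToyTy &Src) {
  return (!Dst.IsVector && !Src.IsVector) ||
         (Dst.IsVector && Src.IsVector && Dst.ScalarBits == Src.ScalarBits);
}
```

Under this model the two predicates disagree only when both types are vectors with different element widths, for example `ToyTy{true, 16}` against `ToyTy{true, 32}`: `kindsMatch` accepts the pair, while `kindsAndScalarsMatch` rejects it. For scalar/scalar and mixed scalar/vector pairs they return the same result.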

fschlimb pushed a commit to fschlimb/llvm-project that referenced this pull request Jun 18, 2025
2 participants