AMDGPU: Add sgpr bit convert tests #136112
@llvm/pr-subscribers-backend-amdgpu
Author: None (Shoreshen)
Changes
Add inreg test for sgpr purpose
Patch is 45.87 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/136112.diff
24 Files Affected:
diff --git a/llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.inreg.1024bit.ll b/llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.inreg.1024bit.ll
new file mode 100644
index 0000000000000..2177b140f9957
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.inreg.1024bit.ll
@@ -0,0 +1,165047 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+
+; RUN: llc -mtriple=amdgcn -mcpu=tahiti < %s | FileCheck -check-prefix=GCN %s
+; RUN: llc -mtriple=amdgcn -mcpu=tonga < %s | FileCheck -check-prefixes=VI %s
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 < %s | FileCheck -check-prefixes=GFX9 %s
+; RUN: llc -mtriple=amdgcn -mcpu=gfx1100 < %s | FileCheck -check-prefixes=GFX11 %s
+
+define inreg <32 x float> @bitcast_v32i32_to_v32f32_inreg(<32 x i32> inreg %a, i32 inreg %b) {
+; GCN-LABEL: bitcast_v32i32_to_v32f32_inreg:
+; GCN: ; %bb.0:
+; GCN-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GCN-NEXT: buffer_store_dword v40, off, s[0:3], s32 offset:60 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v41, off, s[0:3], s32 offset:56 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v42, off, s[0:3], s32 offset:52 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v43, off, s[0:3], s32 offset:48 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v44, off, s[0:3], s32 offset:44 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v45, off, s[0:3], s32 offset:40 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v46, off, s[0:3], s32 offset:36 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v47, off, s[0:3], s32 offset:32 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v56, off, s[0:3], s32 offset:28 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v57, off, s[0:3], s32 offset:24 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v58, off, s[0:3], s32 offset:20 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v59, off, s[0:3], s32 offset:16 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v60, off, s[0:3], s32 offset:12 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v61, off, s[0:3], s32 offset:8 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v62, off, s[0:3], s32 offset:4 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v63, off, s[0:3], s32 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v16, off, s[0:3], s32 offset:68 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v15, off, s[0:3], s32 offset:72 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v14, off, s[0:3], s32 offset:76 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v13, off, s[0:3], s32 offset:80 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v12, off, s[0:3], s32 offset:84 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v11, off, s[0:3], s32 offset:88 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v10, off, s[0:3], s32 offset:92 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v9, off, s[0:3], s32 offset:96 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v8, off, s[0:3], s32 offset:100 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v7, off, s[0:3], s32 offset:104 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v6, off, s[0:3], s32 offset:108 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v5, off, s[0:3], s32 offset:112 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v4, off, s[0:3], s32 offset:64 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v3, off, s[0:3], s32 offset:116 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v2, off, s[0:3], s32 offset:120 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v1, off, s[0:3], s32 offset:124 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v0, off, s[0:3], s32 offset:128 ; 4-byte Folded Spill
+; GCN-NEXT: s_waitcnt expcnt(0)
+; GCN-NEXT: v_mov_b32_e32 v0, s16
+; GCN-NEXT: v_mov_b32_e32 v48, v17
+; GCN-NEXT: v_cmp_ne_u32_e32 vcc, 0, v18
+; GCN-NEXT: buffer_store_dword v0, off, s[0:3], s32 offset:1668 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v1, off, s[0:3], s32 offset:1672 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v2, off, s[0:3], s32 offset:1676 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v3, off, s[0:3], s32 offset:1680 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v4, off, s[0:3], s32 offset:1684 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v5, off, s[0:3], s32 offset:1688 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v6, off, s[0:3], s32 offset:1692 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v7, off, s[0:3], s32 offset:1696 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v8, off, s[0:3], s32 offset:1700 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v9, off, s[0:3], s32 offset:1704 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v10, off, s[0:3], s32 offset:1708 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v11, off, s[0:3], s32 offset:1712 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v12, off, s[0:3], s32 offset:1716 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v13, off, s[0:3], s32 offset:1720 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v14, off, s[0:3], s32 offset:1724 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v15, off, s[0:3], s32 offset:1728 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v16, off, s[0:3], s32 offset:1732 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v17, off, s[0:3], s32 offset:1736 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v18, off, s[0:3], s32 offset:1740 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v19, off, s[0:3], s32 offset:1744 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v20, off, s[0:3], s32 offset:1748 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v21, off, s[0:3], s32 offset:1752 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v22, off, s[0:3], s32 offset:1756 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v23, off, s[0:3], s32 offset:1760 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v24, off, s[0:3], s32 offset:1764 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v25, off, s[0:3], s32 offset:1768 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v26, off, s[0:3], s32 offset:1772 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v27, off, s[0:3], s32 offset:1776 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v28, off, s[0:3], s32 offset:1780 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v29, off, s[0:3], s32 offset:1784 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v30, off, s[0:3], s32 offset:1788 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v31, off, s[0:3], s32 offset:1792 ; 4-byte Folded Spill
+; GCN-NEXT: v_mov_b32_e32 v1, s17
+; GCN-NEXT: s_and_b64 s[4:5], vcc, exec
+; GCN-NEXT: buffer_store_dword v0, off, s[0:3], s32 offset:132 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v1, off, s[0:3], s32 offset:136 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v2, off, s[0:3], s32 offset:140 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v3, off, s[0:3], s32 offset:144 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v4, off, s[0:3], s32 offset:148 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v5, off, s[0:3], s32 offset:152 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v6, off, s[0:3], s32 offset:156 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v7, off, s[0:3], s32 offset:160 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v8, off, s[0:3], s32 offset:164 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v9, off, s[0:3], s32 offset:168 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v10, off, s[0:3], s32 offset:172 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v11, off, s[0:3], s32 offset:176 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v12, off, s[0:3], s32 offset:180 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v13, off, s[0:3], s32 offset:184 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v14, off, s[0:3], s32 offset:188 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v15, off, s[0:3], s32 offset:192 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v16, off, s[0:3], s32 offset:196 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v17, off, s[0:3], s32 offset:200 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v18, off, s[0:3], s32 offset:204 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v19, off, s[0:3], s32 offset:208 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v20, off, s[0:3], s32 offset:212 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v21, off, s[0:3], s32 offset:216 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v22, off, s[0:3], s32 offset:220 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v23, off, s[0:3], s32 offset:224 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v24, off, s[0:3], s32 offset:228 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v25, off, s[0:3], s32 offset:232 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v26, off, s[0:3], s32 offset:236 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v27, off, s[0:3], s32 offset:240 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v28, off, s[0:3], s32 offset:244 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v29, off, s[0:3], s32 offset:248 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v30, off, s[0:3], s32 offset:252 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v31, off, s[0:3], s32 offset:256 ; 4-byte Folded Spill
+; GCN-NEXT: v_mov_b32_e32 v2, s18
+; GCN-NEXT: v_mov_b32_e32 v3, s19
+; GCN-NEXT: buffer_store_dword v0, off, s[0:3], s32 offset:1156 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v1, off, s[0:3], s32 offset:1160 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v2, off, s[0:3], s32 offset:1164 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v3, off, s[0:3], s32 offset:1168 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v4, off, s[0:3], s32 offset:1172 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v5, off, s[0:3], s32 offset:1176 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v6, off, s[0:3], s32 offset:1180 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v7, off, s[0:3], s32 offset:1184 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v8, off, s[0:3], s32 offset:1188 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v9, off, s[0:3], s32 offset:1192 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v10, off, s[0:3], s32 offset:1196 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v11, off, s[0:3], s32 offset:1200 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v12, off, s[0:3], s32 offset:1204 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v13, off, s[0:3], s32 offset:1208 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v14, off, s[0:3], s32 offset:1212 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v15, off, s[0:3], s32 offset:1216 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v16, off, s[0:3], s32 offset:1220 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v17, off, s[0:3], s32 offset:1224 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v18, off, s[0:3], s32 offset:1228 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v19, off, s[0:3], s32 offset:1232 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v20, off, s[0:3], s32 offset:1236 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v21, off, s[0:3], s32 offset:1240 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v22, off, s[0:3], s32 offset:1244 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v23, off, s[0:3], s32 offset:1248 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v24, off, s[0:3], s32 offset:1252 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v25, off, s[0:3], s32 offset:1256 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v26, off, s[0:3], s32 offset:1260 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v27, off, s[0:3], s32 offset:1264 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v28, off, s[0:3], s32 offset:1268 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v29, off, s[0:3], s32 offset:1272 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v30, off, s[0:3], s32 offset:1276 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v31, off, s[0:3], s32 offset:1280 ; 4-byte Folded Spill
+; GCN-NEXT: v_mov_b32_e32 v4, s20
+; GCN-NEXT: buffer_store_dword v0, off, s[0:3], s32 offset:260 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v1, off, s[0:3], s32 offset:264 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v2, off, s[0:3], s32 offset:268 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v3, off, s[0:3], s32 offset:272 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v4, off, s[0:3], s32 offset:276 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v5, off, s[0:3], s32 offset:280 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v6, off, s[0:3], s32 offset:284 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v7, off, s[0:3], s32 offset:288 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v8, off, s[0:3], s32 offset:292 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v9, off, s[0:3], s32 offset:296 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v10, off, s[0:3], s32 offset:300 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v11, off, s[0:3], s32 offset:304 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v12, off, s[0:3], s32 offset:308 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v13, off, s[0:3], s32 offset:312 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v14, off, s[0:3], s32 offset:316 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v15, off, s[0:3], s32 offset:320 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v16, off, s[0:3], s32 offset:324 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v17, off, s[0:3], s32 offset:328 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v18, off, s[0:3], s32 offset:332 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v19, off, s[0:3], s32 offset:336 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v20, off, s[0:3], s32 offset:340 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v21, off, s[0:3], s32 offset:344 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v22, off, s[0:3], s32 offset:348 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v23, off, s[0:3], s32 offset:352 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v24, off, s[0:3], s32 offset:356 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v25, off, s[0:3], s32 offset:360 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v26, off, s[0:3], s32 offset:364 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v27, off, s[0:3], s32 offset:368 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v28, off, s[0:3], s32 offset:372 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v29, off, s[0:3], s32 offset:376 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v30, off, s[0:3], s32 offset:380 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v31, off, s[0:3], s32 offset:384 ; 4-byte Folded Spill
+; GCN-NEXT: v_mov_b32_e32 v5, s21
+; GCN-NEXT: buffer_store_dword v0, off, s[0:3], s32 offset:1028 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v1, off, s[0:3], s32 offset:1032 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v2, off, s[0:3], s32 offset:1036 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v3, off, s[0:3], s32 offset:1040 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v4, off, s[0:3], s32 offset:1044 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v5, off, s[0:3], s32 offset:1048 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v6, off, s[0:3], s32 offset:1052 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v7, off, s[0:3], s32 offset:1056 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v8, off, s[0:3], s32 offset:1060 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v9, off, s[0:3], s32 offset:1064 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v10, off, s[0:3], s32 offset:1068 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v11, off, s[0:3], s32 offset:1072 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v12, off, s[0:3], s32 offset:1076 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v13, off, s[0:3], s32 offset:1080 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v14, off, s[0:3], s32 offset:1084 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v15, off, s[0:3], s32 offset:1088 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v16, off, s[0:3], s32 offset:1092 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v17, off, s[0:3], s32 offset:1096 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v18, off, s[0:3], s32 offset:1100 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v19, off, s[0:3], s32 offset:1104 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v20, off, s[0:3], s32 offset:1108 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v21, off, s[0:3], s32 offset:1112 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v22, off, s[0:3], s32 offset:1116 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v23, off, s[0:3], s32 offset:1120 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v24, off, s[0:3], s32 offset:1124 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v25, off, s[0:3], s32 offset:1128 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v26, off, s[0:3], s32 offset:1132 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v27, off, s[0:3], s32 offset:1136 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v28, off, s[0:3], s32 offset:1140 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v29, off, s[0:3], s32 offset:1144 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v30, off, s[0:3], s32 offset:1148 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v31, off, s[0:3], s32 offset:1152 ; 4-byte Folded Spill
+; GCN-NEXT: v_mov_b32_e32 v6, s22
+; GCN-NEXT: v_mov_b32_e32 v8, s24
+; GCN-NEXT: buffer_store_dword v0, off, s[0:3], s32 offset:388 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v1, off, s[0:3], s32 offset:392 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v2, off, s[0:3], s32 offset:396 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v3, off, s[0:3], s32 offset:400 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v4, off, s[0:3], s32 offset:404 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v5, off, s[0:3], s32 offset:408 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v6, off, s[0:3], s32 offset:412 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v7, off, s[0:3], s32 offset:416 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v8, off, s[0:3], s32 offset:420 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v9, off, s[0:3], s32 offset:424 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v10, off, s[0:3], s32 offset:428 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v11, off, s[0:3], s32 offset:432 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v12, off, s[0:3], s32 offset:436 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v13, off, s[0:3], s32 offset:440 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v14, off, s[0:3], s32 offset:444 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v15, off, s[0:3], s32 offset:448 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v16, off, s[0:3], s32 offset:452 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v17, off, s[0:3], s32 offset:456 ; 4-byte Folded Spill
+; GCN-NEXT: buffer_store_dword v18, off, s[0:3], s32 offset:460 ; 4-byte Folded Spill
+; GCN-NEX...
[truncated]
; RUN: llc -mtriple=amdgcn -mcpu=gfx900 < %s | FileCheck -check-prefixes=GFX9 %s
; RUN: llc -mtriple=amdgcn -mcpu=gfx1100 < %s | FileCheck -check-prefixes=GFX11 %s

define inreg <4 x float> @bitcast_v4i32_to_v4f32_inreg(<4 x i32> inreg %a, i32 inreg %b) {
inreg on the return value is not implemented. The convention is to use s_ prefixes to indicate the SGPR tests. inreg is a test implementation detail.
For the SGPR sink, it's a bit annoying. You need to use a shader calling convention (i.e. amdgpu_ps), or inline asm with an SGPR constraint
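A minimal sketch of the two SGPR-sink options mentioned above. The function names (`@s_bitcast_i32_to_f32_asm`, `@s_bitcast_f32_to_i32_ps`) are hypothetical and not taken from the patch. The first keeps the default calling convention and consumes the result through inline asm with an `s` (SGPR) constraint; the second uses the `amdgpu_ps` shader calling convention. Which registers the second function's return value lands in is an assumption here and should be confirmed against the generated assembly.

```llvm
; Option 1: inline asm with an SGPR ("s") constraint acts as the sink.
; Hypothetical function name, following the s_ prefix convention.
define void @s_bitcast_i32_to_f32_asm(i32 inreg %a) {
  %r = bitcast i32 %a to float
  call void asm sideeffect "; use $0", "s"(float %r)
  ret void
}

; Option 2: shader calling convention with an inreg (scalar) argument.
; Hypothetical function name; return-register placement is not asserted here.
define amdgpu_ps i32 @s_bitcast_f32_to_i32_ps(float inreg %a) {
  %r = bitcast float %a to i32
  ret i32 %r
}
```

In both cases the `s_` prefix lives in the function name only, and `inreg` stays an implementation detail of the test.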
Hi @arsenm, do you mean like this?
; RUN: llc -mtriple=amdgcn -mcpu=gfx900 < %s | FileCheck -check-prefixes=S_GFX9 %s
No, no S_ in the check prefix. I mean in the test function names, e.g. @s_bitcast....
Shouldn't include inreg in the test file names, or function names. Can also merge into the same size tests
@@ -0,0 +1,6419 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
Test filenames still mention inreg; this is not about inreg. This is the scalar case, and these can merge with the existing type size tests.
File name and merge still not done?
Hi @arsenm, I have renamed the test cases to xxxx_scalar and merged them into the existing type size files. Sorry for the delay.
llvm/test/lit.cfg.py
@@ -466,7 +466,7 @@ def have_cxx_shared_library():
 print("could not exec llvm-readobj")
 return False

-readobj_out = readobj_cmd.stdout.read().decode("ascii")
+readobj_out = readobj_cmd.stdout.read().decode("utf-8")
Unrelated change
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/16600
Hello @Shoreshen, with this patch the testcase starts failing if you run the verifiers, e.g. add -verify-machineinstrs (or compile with EXPENSIVE_CHECKS).
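As a sketch, running the machine verifier in a lit test only needs the flag on the `llc` RUN line. The triple, CPU and check prefix below are copied from the RUN lines in this PR; adding the flag to this particular line is an illustration, not part of the patch.

```llvm
; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck -check-prefixes=GFX9 %s
```

A build configured with EXPENSIVE_CHECKS runs the same verification without the flag on each RUN line.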
Hi, really sorry, I'll put up a fix-up PR now.
Hi @mikaelholmen, the fix-up PR is here: #139868. Could you review and approve it? Thanks so much.
This is a fix-up PR for #136112. Some test cases fail the machine instruction verifier due to bundles (see this issue: #139102 (comment)).
Hi @mikaelholmen, the fix-up PR has been merged, please check. Sorry again, and thank you for notifying me!
Great! I'm a little surprised that there are no comments here in this patch from failing expensive-checks build bots due to this. Now I saw the failure in downstream non-public bots instead.
…#139868) This is a fix-up PR for llvm/llvm-project#136112. Some test cases fail the machine instruction verifier due to bundles (see this issue: llvm/llvm-project#139102 (comment)).
Hi @mikaelholmen, thanks, I'm surprised too. BTW, can you double-check whether the case has been resolved? I see the latest commits are still failing the expensive checks, but it looks like those cases are not from this PR.
The amdgcn.bitcast.1024bit.ll testcase passes for me even with EXPENSIVE_CHECKS after your fix. (Btw, that testcase is pretty bonkers... 250374 lines and takes a long time to execute.)
It exposes some bugs in the existing code.
Bit of history: it started here: #131775 (comment). Then, following the comments, I put up this PR: #131955. I'll discuss with the team and see if we can change it back to only check the function name.
Add inreg test for sgpr purpose
This is the second PR after #135729.
To test SGPR inputs and outputs, this PR uses inreg cases for bit conversions.