[AMDGPU][True16][MC] add fake16 error and promote test #135984

Merged

Conversation

broxigarchen
Contributor

@broxigarchen broxigarchen commented Apr 16, 2025

This is an NFC patch.

Added error and promote tests for the fake16 flow. This includes two parts:

  1. "*vop1_t16_err-fake16.s" is renamed to "*vop1_fake16_err.s"
  2. Added the missing "*fake16_promote.s" and other "*fake16_err.s" files

The promote tests check that the instruction encoding is promoted to 64 bits when the registers used are not encodable in the 32-bit form. The error tests check the diagnostics emitted when the 32-bit form is requested explicitly with such registers.
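
The rule these tests exercise can be sketched roughly as below. This is only an illustration, not the assembler's implementation: `needs_e64`, `select_encoding`, and `LO128_LIMIT` are hypothetical names, and the assumption that a 16-bit VGPR operand above v127 is what the 32-bit form cannot encode is inferred from the operands used in the test files (v128, v199, v255). Picking `_e32` whenever it fits is likewise an assumption.

```python
# Hypothetical sketch: decide whether a fake16 VOP1 instruction must use the
# 64-bit (_e64) encoding. Assumption: in the 32-bit form, a 16-bit VGPR
# operand can only name v0..v127.
LO128_LIMIT = 128  # assumed count of VGPRs a 16-bit operand can name in _e32


def needs_e64(vgpr_operands):
    """vgpr_operands: iterable of (vgpr_index, is_16bit) pairs."""
    return any(is_16bit and index >= LO128_LIMIT
               for index, is_16bit in vgpr_operands)


def select_encoding(mnemonic, vgpr_operands, forced_e32=False):
    """Return the mnemonic with an encoding suffix, or raise if _e32 was forced."""
    if not needs_e64(vgpr_operands):
        return mnemonic + "_e32"
    if forced_e32:
        # Mirrors the diagnostic text checked in the *_fake16_err.s tests.
        raise ValueError("operands are not valid for this GPU or mode")
    return mnemonic + "_e64"
```

Under these assumptions, `select_encoding("v_ceil_f16", [(255, True), (1, True)])` yields `v_ceil_f16_e64`, which matches the `v_ceil_f16 v255, v1` case in the promote test, while passing `forced_e32=True` for the same operands raises, which corresponds to the `v_ceil_f16_e32 v255, v1` case in the error test.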

@broxigarchen broxigarchen changed the title promote and err test AMDGPU][True16][MC] add fake16 error and promote test Apr 16, 2025
@broxigarchen broxigarchen changed the title AMDGPU][True16][MC] add fake16 error and promote test [AMDGPU][True16][MC] add fake16 error and promote test Apr 16, 2025
@broxigarchen broxigarchen marked this pull request as ready for review April 16, 2025 16:40
@broxigarchen broxigarchen requested review from Sisyph and kosarev April 16, 2025 16:40
@llvmbot llvmbot added backend:AMDGPU mc Machine (object) code labels Apr 16, 2025
@llvmbot
Member

llvmbot commented Apr 16, 2025

@llvm/pr-subscribers-mc

Author: Brox Chen (broxigarchen)

Changes

This is an NFC patch.

Added error and promote tests for the fake16 flow.


Patch is 761.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135984.diff

18 Files Affected:

  • (added) llvm/test/MC/AMDGPU/gfx11_asm_vop1_fake16_err.s (+90)
  • (added) llvm/test/MC/AMDGPU/gfx11_asm_vop1_fake16_promote.s (+1481)
  • (removed) llvm/test/MC/AMDGPU/gfx11_asm_vop1_t16_err-fake16.s (-89)
  • (added) llvm/test/MC/AMDGPU/gfx11_asm_vop2_fake16_err.s (+228)
  • (added) llvm/test/MC/AMDGPU/gfx11_asm_vop2_fake16_promote.s (+191)
  • (added) llvm/test/MC/AMDGPU/gfx11_asm_vopc_fake16_err.s (+1776)
  • (added) llvm/test/MC/AMDGPU/gfx11_asm_vopc_fake16_promote.s (+2369)
  • (added) llvm/test/MC/AMDGPU/gfx11_asm_vopcx_fake16_err.s (+489)
  • (added) llvm/test/MC/AMDGPU/gfx11_asm_vopcx_fake16_promote.s (+488)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_vop1_fake16_err.s (+506)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_vop1_fake16_promote.s (+1481)
  • (removed) llvm/test/MC/AMDGPU/gfx12_asm_vop1_t16_err-fake16.s (-505)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_vop2_fake16_err.s (+228)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_vop2_fake16_promote.s (+191)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_vopc_fake16_err.s (+1776)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_vopc_fake16_promote.s (+2369)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_vopcx_fake16_err.s (+489)
  • (added) llvm/test/MC/AMDGPU/gfx12_asm_vopcx_fake16_promote.s (+488)
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop1_fake16_err.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_fake16_err.s
new file mode 100644
index 0000000000000..ee089d1faeaaa
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_fake16_err.s
@@ -0,0 +1,90 @@
+// NOTE: Assertions have been autogenerated by utils/update_mc_test_checks.py UTC_ARGS: --unique --sort --version 5
+// RUN: not llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16,+wavefrontsize32 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=error: %s
+// RUN: not llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16,+wavefrontsize64 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=error: %s
+
+v_ceil_f16_e32 v128, 0xfe0b
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_ceil_f16_e32 v255, v1
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_ceil_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: :[[@LINE-1]]:25: error: invalid operand for instruction
+
+v_ceil_f16_e32 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: :[[@LINE-1]]:25: error: invalid operand for instruction
+
+v_ceil_f16_e32 v5, v199
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_ceil_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: :[[@LINE-1]]:25: error: invalid operand for instruction
+
+v_ceil_f16_e32 v5, v199 quad_perm:[3,2,1,0]
+// GFX11: :[[@LINE-1]]:25: error: invalid operand for instruction
+
+v_exp_f16_e32 v128, 0xfe0b
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_exp_f16_e32 v255, v1
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_exp_f16_e32 v5, v199
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_floor_f16_e32 v128, 0xfe0b
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_floor_f16_e32 v255, v1
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_floor_f16_e32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: :[[@LINE-1]]:26: error: invalid operand for instruction
+
+v_floor_f16_e32 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: :[[@LINE-1]]:26: error: invalid operand for instruction
+
+v_floor_f16_e32 v5, v199
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_floor_f16_e32 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: :[[@LINE-1]]:26: error: invalid operand for instruction
+
+v_floor_f16_e32 v5, v199 quad_perm:[3,2,1,0]
+// GFX11: :[[@LINE-1]]:26: error: invalid operand for instruction
+
+v_log_f16_e32 v128, 0xfe0b
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_log_f16_e32 v255, v1
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_log_f16_e32 v5, v199
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_rcp_f16_e32 v128, 0xfe0b
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_rcp_f16_e32 v255, v1
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_rcp_f16_e32 v5, v199
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_rsq_f16_e32 v128, 0xfe0b
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_rsq_f16_e32 v255, v1
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_rsq_f16_e32 v5, v199
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_sqrt_f16_e32 v128, 0xfe0b
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_sqrt_f16_e32 v255, v1
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
+
+v_sqrt_f16_e32 v5, v199
+// GFX11: :[[@LINE-1]]:1: error: operands are not valid for this GPU or mode
diff --git a/llvm/test/MC/AMDGPU/gfx11_asm_vop1_fake16_promote.s b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_fake16_promote.s
new file mode 100644
index 0000000000000..ee9f1be0410b6
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/gfx11_asm_vop1_fake16_promote.s
@@ -0,0 +1,1481 @@
+// NOTE: Assertions have been autogenerated by utils/update_mc_test_checks.py UTC_ARGS: --unique --sort --version 5
+// RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16 -show-encoding %s | FileCheck --check-prefix=GFX11 --implicit-check-not=_e32 %s
+
+v_ceil_f16 v128, 0xfe0b
+// GFX11: v_ceil_f16_e64 v128, 0xfe0b             ; encoding: [0x80,0x00,0xdc,0xd5,0xff,0x00,0x00,0x00,0x0b,0xfe,0x00,0x00]
+
+v_ceil_f16 v255, -1
+// GFX11: v_ceil_f16_e64 v255, -1                 ; encoding: [0xff,0x00,0xdc,0xd5,0xc1,0x00,0x00,0x00]
+
+v_ceil_f16 v255, 0.5
+// GFX11: v_ceil_f16_e64 v255, 0.5                ; encoding: [0xff,0x00,0xdc,0xd5,0xf0,0x00,0x00,0x00]
+
+v_ceil_f16 v255, exec_hi
+// GFX11: v_ceil_f16_e64 v255, exec_hi            ; encoding: [0xff,0x00,0xdc,0xd5,0x7f,0x00,0x00,0x00]
+
+v_ceil_f16 v255, exec_lo
+// GFX11: v_ceil_f16_e64 v255, exec_lo            ; encoding: [0xff,0x00,0xdc,0xd5,0x7e,0x00,0x00,0x00]
+
+v_ceil_f16 v255, m0
+// GFX11: v_ceil_f16_e64 v255, m0                 ; encoding: [0xff,0x00,0xdc,0xd5,0x7d,0x00,0x00,0x00]
+
+v_ceil_f16 v255, null
+// GFX11: v_ceil_f16_e64 v255, null               ; encoding: [0xff,0x00,0xdc,0xd5,0x7c,0x00,0x00,0x00]
+
+v_ceil_f16 v255, s1
+// GFX11: v_ceil_f16_e64 v255, s1                 ; encoding: [0xff,0x00,0xdc,0xd5,0x01,0x00,0x00,0x00]
+
+v_ceil_f16 v255, s105
+// GFX11: v_ceil_f16_e64 v255, s105               ; encoding: [0xff,0x00,0xdc,0xd5,0x69,0x00,0x00,0x00]
+
+v_ceil_f16 v255, src_scc
+// GFX11: v_ceil_f16_e64 v255, src_scc            ; encoding: [0xff,0x00,0xdc,0xd5,0xfd,0x00,0x00,0x00]
+
+v_ceil_f16 v255, ttmp15
+// GFX11: v_ceil_f16_e64 v255, ttmp15             ; encoding: [0xff,0x00,0xdc,0xd5,0x7b,0x00,0x00,0x00]
+
+v_ceil_f16 v255, v1
+// GFX11: v_ceil_f16_e64 v255, v1                 ; encoding: [0xff,0x00,0xdc,0xd5,0x01,0x01,0x00,0x00]
+
+v_ceil_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_ceil_f16_e64_dpp v255, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_ceil_f16 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: v_ceil_f16_e64_dpp v255, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_ceil_f16 v255, v127
+// GFX11: v_ceil_f16_e64 v255, v127               ; encoding: [0xff,0x00,0xdc,0xd5,0x7f,0x01,0x00,0x00]
+
+v_ceil_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_ceil_f16_e64_dpp v255, v127 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x00,0x7f,0x77,0x39,0x05]
+
+v_ceil_f16 v255, v127 quad_perm:[3,2,1,0]
+// GFX11: v_ceil_f16_e64_dpp v255, v127 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0x7f,0x1b,0x00,0xff]
+
+v_ceil_f16 v255, vcc_hi
+// GFX11: v_ceil_f16_e64 v255, vcc_hi             ; encoding: [0xff,0x00,0xdc,0xd5,0x6b,0x00,0x00,0x00]
+
+v_ceil_f16 v255, vcc_lo
+// GFX11: v_ceil_f16_e64 v255, vcc_lo             ; encoding: [0xff,0x00,0xdc,0xd5,0x6a,0x00,0x00,0x00]
+
+v_ceil_f16 v5, v199
+// GFX11: v_ceil_f16_e64 v5, v199                 ; encoding: [0x05,0x00,0xdc,0xd5,0xc7,0x01,0x00,0x00]
+
+v_ceil_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_ceil_f16_e64_dpp v5, v199 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xdc,0xd5,0xe9,0x00,0x00,0x00,0xc7,0x77,0x39,0x05]
+
+v_ceil_f16 v5, v199 quad_perm:[3,2,1,0]
+// GFX11: v_ceil_f16_e64_dpp v5, v199 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xdc,0xd5,0xfa,0x00,0x00,0x00,0xc7,0x1b,0x00,0xff]
+
+v_cos_f16 v128, 0xfe0b
+// GFX11: v_cos_f16_e64 v128, 0xfe0b              ; encoding: [0x80,0x00,0xe1,0xd5,0xff,0x00,0x00,0x00,0x0b,0xfe,0x00,0x00]
+
+v_cos_f16 v255, -1
+// GFX11: v_cos_f16_e64 v255, -1                  ; encoding: [0xff,0x00,0xe1,0xd5,0xc1,0x00,0x00,0x00]
+
+v_cos_f16 v255, 0.5
+// GFX11: v_cos_f16_e64 v255, 0.5                 ; encoding: [0xff,0x00,0xe1,0xd5,0xf0,0x00,0x00,0x00]
+
+v_cos_f16 v255, exec_hi
+// GFX11: v_cos_f16_e64 v255, exec_hi             ; encoding: [0xff,0x00,0xe1,0xd5,0x7f,0x00,0x00,0x00]
+
+v_cos_f16 v255, exec_lo
+// GFX11: v_cos_f16_e64 v255, exec_lo             ; encoding: [0xff,0x00,0xe1,0xd5,0x7e,0x00,0x00,0x00]
+
+v_cos_f16 v255, m0
+// GFX11: v_cos_f16_e64 v255, m0                  ; encoding: [0xff,0x00,0xe1,0xd5,0x7d,0x00,0x00,0x00]
+
+v_cos_f16 v255, null
+// GFX11: v_cos_f16_e64 v255, null                ; encoding: [0xff,0x00,0xe1,0xd5,0x7c,0x00,0x00,0x00]
+
+v_cos_f16 v255, s1
+// GFX11: v_cos_f16_e64 v255, s1                  ; encoding: [0xff,0x00,0xe1,0xd5,0x01,0x00,0x00,0x00]
+
+v_cos_f16 v255, s105
+// GFX11: v_cos_f16_e64 v255, s105                ; encoding: [0xff,0x00,0xe1,0xd5,0x69,0x00,0x00,0x00]
+
+v_cos_f16 v255, src_scc
+// GFX11: v_cos_f16_e64 v255, src_scc             ; encoding: [0xff,0x00,0xe1,0xd5,0xfd,0x00,0x00,0x00]
+
+v_cos_f16 v255, ttmp15
+// GFX11: v_cos_f16_e64 v255, ttmp15              ; encoding: [0xff,0x00,0xe1,0xd5,0x7b,0x00,0x00,0x00]
+
+v_cos_f16 v255, v1
+// GFX11: v_cos_f16_e64 v255, v1                  ; encoding: [0xff,0x00,0xe1,0xd5,0x01,0x01,0x00,0x00]
+
+v_cos_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v255, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xe1,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_cos_f16 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v255, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xe1,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_cos_f16 v255, v127
+// GFX11: v_cos_f16_e64 v255, v127                ; encoding: [0xff,0x00,0xe1,0xd5,0x7f,0x01,0x00,0x00]
+
+v_cos_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v255, v127 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xe1,0xd5,0xe9,0x00,0x00,0x00,0x7f,0x77,0x39,0x05]
+
+v_cos_f16 v255, v127 quad_perm:[3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v255, v127 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xe1,0xd5,0xfa,0x00,0x00,0x00,0x7f,0x1b,0x00,0xff]
+
+v_cos_f16 v255, vcc_hi
+// GFX11: v_cos_f16_e64 v255, vcc_hi              ; encoding: [0xff,0x00,0xe1,0xd5,0x6b,0x00,0x00,0x00]
+
+v_cos_f16 v255, vcc_lo
+// GFX11: v_cos_f16_e64 v255, vcc_lo              ; encoding: [0xff,0x00,0xe1,0xd5,0x6a,0x00,0x00,0x00]
+
+v_cos_f16 v5, v199
+// GFX11: v_cos_f16_e64 v5, v199                  ; encoding: [0x05,0x00,0xe1,0xd5,0xc7,0x01,0x00,0x00]
+
+v_cos_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v5, v199 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xe1,0xd5,0xe9,0x00,0x00,0x00,0xc7,0x77,0x39,0x05]
+
+v_cos_f16 v5, v199 quad_perm:[3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v5, v199 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe1,0xd5,0xfa,0x00,0x00,0x00,0xc7,0x1b,0x00,0xff]
+
+v_cvt_f16_f32 v128, 0xaf123456
+// GFX11: v_cvt_f16_f32_e64 v128, 0xaf123456      ; encoding: [0x80,0x00,0x8a,0xd5,0xff,0x00,0x00,0x00,0x56,0x34,0x12,0xaf]
+
+v_cvt_f16_f32 v255, -1
+// GFX11: v_cvt_f16_f32_e64 v255, -1              ; encoding: [0xff,0x00,0x8a,0xd5,0xc1,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, 0.5
+// GFX11: v_cvt_f16_f32_e64 v255, 0.5             ; encoding: [0xff,0x00,0x8a,0xd5,0xf0,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, exec_hi
+// GFX11: v_cvt_f16_f32_e64 v255, exec_hi         ; encoding: [0xff,0x00,0x8a,0xd5,0x7f,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, exec_lo
+// GFX11: v_cvt_f16_f32_e64 v255, exec_lo         ; encoding: [0xff,0x00,0x8a,0xd5,0x7e,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, m0
+// GFX11: v_cvt_f16_f32_e64 v255, m0              ; encoding: [0xff,0x00,0x8a,0xd5,0x7d,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, null
+// GFX11: v_cvt_f16_f32_e64 v255, null            ; encoding: [0xff,0x00,0x8a,0xd5,0x7c,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, s1
+// GFX11: v_cvt_f16_f32_e64 v255, s1              ; encoding: [0xff,0x00,0x8a,0xd5,0x01,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, s105
+// GFX11: v_cvt_f16_f32_e64 v255, s105            ; encoding: [0xff,0x00,0x8a,0xd5,0x69,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, src_scc
+// GFX11: v_cvt_f16_f32_e64 v255, src_scc         ; encoding: [0xff,0x00,0x8a,0xd5,0xfd,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, ttmp15
+// GFX11: v_cvt_f16_f32_e64 v255, ttmp15          ; encoding: [0xff,0x00,0x8a,0xd5,0x7b,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, v1
+// GFX11: v_cvt_f16_f32_e64 v255, v1              ; encoding: [0xff,0x00,0x8a,0xd5,0x01,0x01,0x00,0x00]
+
+v_cvt_f16_f32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_f32_e64_dpp v255, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0x8a,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_cvt_f16_f32 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_f32_e64_dpp v255, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0x8a,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_cvt_f16_f32 v255, v255
+// GFX11: v_cvt_f16_f32_e64 v255, v255            ; encoding: [0xff,0x00,0x8a,0xd5,0xff,0x01,0x00,0x00]
+
+v_cvt_f16_f32 v255, v255 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_f32_e64_dpp v255, v255 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0x8a,0xd5,0xe9,0x00,0x00,0x00,0xff,0x77,0x39,0x05]
+
+v_cvt_f16_f32 v255, v255 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_f32_e64_dpp v255, v255 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0x8a,0xd5,0xfa,0x00,0x00,0x00,0xff,0x1b,0x00,0xff]
+
+v_cvt_f16_f32 v255, vcc_hi
+// GFX11: v_cvt_f16_f32_e64 v255, vcc_hi          ; encoding: [0xff,0x00,0x8a,0xd5,0x6b,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, vcc_lo
+// GFX11: v_cvt_f16_f32_e64 v255, vcc_lo          ; encoding: [0xff,0x00,0x8a,0xd5,0x6a,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v128, 0xfe0b
+// GFX11: v_cvt_f16_i16_e64 v128, 0xfe0b          ; encoding: [0x80,0x00,0xd1,0xd5,0xff,0x00,0x00,0x00,0x0b,0xfe,0x00,0x00]
+
+v_cvt_f16_i16 v255, -1
+// GFX11: v_cvt_f16_i16_e64 v255, -1              ; encoding: [0xff,0x00,0xd1,0xd5,0xc1,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, 0.5
+// GFX11: v_cvt_f16_i16_e64 v255, 0.5             ; encoding: [0xff,0x00,0xd1,0xd5,0xf0,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, exec_hi
+// GFX11: v_cvt_f16_i16_e64 v255, exec_hi         ; encoding: [0xff,0x00,0xd1,0xd5,0x7f,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, exec_lo
+// GFX11: v_cvt_f16_i16_e64 v255, exec_lo         ; encoding: [0xff,0x00,0xd1,0xd5,0x7e,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, m0
+// GFX11: v_cvt_f16_i16_e64 v255, m0              ; encoding: [0xff,0x00,0xd1,0xd5,0x7d,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, null
+// GFX11: v_cvt_f16_i16_e64 v255, null            ; encoding: [0xff,0x00,0xd1,0xd5,0x7c,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, s1
+// GFX11: v_cvt_f16_i16_e64 v255, s1              ; encoding: [0xff,0x00,0xd1,0xd5,0x01,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, s105
+// GFX11: v_cvt_f16_i16_e64 v255, s105            ; encoding: [0xff,0x00,0xd1,0xd5,0x69,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, src_scc
+// GFX11: v_cvt_f16_i16_e64 v255, src_scc         ; encoding: [0xff,0x00,0xd1,0xd5,0xfd,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, ttmp15
+// GFX11: v_cvt_f16_i16_e64 v255, ttmp15          ; encoding: [0xff,0x00,0xd1,0xd5,0x7b,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, v1
+// GFX11: v_cvt_f16_i16_e64 v255, v1              ; encoding: [0xff,0x00,0xd1,0xd5,0x01,0x01,0x00,0x00]
+
+v_cvt_f16_i16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v255, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xd1,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_cvt_f16_i16 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v255, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xd1,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_cvt_f16_i16 v255, v127
+// GFX11: v_cvt_f16_i16_e64 v255, v127            ; encoding: [0xff,0x00,0xd1,0xd5,0x7f,0x01,0x00,0x00]
+
+v_cvt_f16_i16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v255, v127 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xd1,0xd5,0xe9,0x00,0x00,0x00,0x7f,0x77,0x39,0x05]
+
+v_cvt_f16_i16 v255, v127 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v255, v127 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xd1,0xd5,0xfa,0x00,0x00,0x00,0x7f,0x1b,0x00,0xff]
+
+v_cvt_f16_i16 v255, vcc_hi
+// GFX11: v_cvt_f16_i16_e64 v255, vcc_hi          ; encoding: [0xff,0x00,0xd1,0xd5,0x6b,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, vcc_lo
+// GFX11: v_cvt_f16_i16_e64 v255, vcc_lo          ; encoding: [0xff,0x00,0xd1,0xd5,0x6a,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v5, v199
+// GFX11: v_cvt_f16_i16_e64 v5, v199              ; encoding: [0x05,0x00,0xd1,0xd5,0xc7,0x01,0x00,0x00]
+
+v_cvt_f16_i16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v5, v199 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xd1,0xd5,0xe9,0x00,0x00,0x00,0xc7,0x77,0x39,0x05]
+
+v_cvt_f16_i16 v5, v199 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v5, v199 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xd1,0xd5,0xfa,0x00,0x00,0x00,0xc7,0x1b,0x00,0xff]
+
+v_cvt_f16_u16 v128, 0xfe0b
+// GFX11: v_cvt_f16_u16_e64 v128, 0xfe0b          ; encoding: [0x80,0x00,0xd0,0xd5,0xff,0x00,0x00,0x00,0x0b,0xfe,0x00,0x00]
+
+v_cvt_f16_u16 v255, -1
+// GFX11: v_cvt_f16_u16_e64 v255, -1              ; encoding: [0xff,0x00,0xd0,0xd5,0xc1,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, 0.5
+// GFX11: v_cvt_f16_u16_e64 v255, 0.5             ; encoding: [0xff,0x00,0xd0,0xd5,0xf0,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, exec_hi
+// GFX11: v_cvt_f16_u16_e64 v255, exec_hi         ; encoding: [0xff,0x00,0xd0,0xd5,0x7f,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, exec_lo
+// GFX11: v_cvt_f16_u16_e64 v255, exec_lo         ; encoding: [0xff,0x00,0xd0,0xd5,0x7e,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, m0
+// GFX11: v_cvt_f16_u16_e64 v255, m0              ; encoding: [0xff,0x00,0xd0,0xd5,0x7d,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, null
+// GFX11: v_cvt_f16_u16_e64 v255, null            ; encoding: [0xff,0x00,0xd0,0xd5,0x7c,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, s1
+// GFX11: v_cvt_f16_u16_e64 v255, s1              ; encoding: [0xff,0x00,0xd0,0xd5,0x01,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, s105
+// GFX11: v_cvt_f16_u16_e64 v255, s105            ; encoding: [0xff,0x00,0xd0,0xd5,0x69,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, src_scc
+// GFX11: v_cvt_f16_u16_e64 v255, src_scc         ; encoding: [0xff,0x00,0xd0,0xd5,0xfd,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, ttmp15
+// GFX11: v_cvt_f16_u16_e64 v255, ttmp15          ; encoding: [0xff,0x00,0xd0,0xd5,0x7b,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, v1
+// GFX11: v_cvt_f16_u16_e64 v255, v1              ; encoding: [0xff,0x00,0xd0,0xd5,0x01,0x01,0x00,0x00]
+
+v_cvt_f16_u16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v255, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xd0,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_cvt_f16_u16 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v255, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xd0,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_cvt_f16_u16 v255, v127
+// GFX11: v_cvt_f16_u16_e64 v255, v127            ; encoding: [0xff,0x00,0xd0,0xd5,0x7f,0x01,0x00,0x00]
+
+v_cvt_f16_u16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v255, v127 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xd0,0xd5,0xe9,0x00,0x00,0x00,0x7f,0x77,0x39,0x05]
+
+v_cvt_f16_u16 v255, v127 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v255, v127 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xd0,0xd5,0xfa,0x00,0x00,0x00,0x7f,0x1b,0x00,0xff]
+
+v_cvt_f16_u16 v255, vcc_hi
+// GFX11: v_cvt_f16_u16_e64 v255, vcc_hi          ; encoding: [0xff,0x00,0xd0,0xd5,0x6b,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, vcc_lo
+// GFX11: v_cvt_f16_u16_e64 v255, vcc_lo          ; encoding: [0xff,0x00,0xd0,0xd5,0x6a,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v5, v199
+// GFX11: v_cvt_f16_u16_e64 v5, v199              ; encoding: [0x05,0x00,0xd0,0xd5,0xc7,0x01,0x00,0x00]
+
+v_cvt_f16_u16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v5, v199 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xd0,0xd5,0xe9,0x00,0x00,0x00,0xc7,0x77,0x39,0x05]
+
+v_cvt_f16_u16 v5, v199 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v5, v199 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xd0,0xd5,0xfa,0x00,0x00,0x00,0xc7,0x1b,0x00,0xff]
+
+v_cvt_f32_f16 v5, v199
+// GFX11: ...
[truncated]
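
For readers comparing the `encoding:` bytes in the promote test above, the following sketch pulls a few fields out of an `_e64` encoding. The bit positions (destination in the low byte of the first dword, a 10-bit opcode starting at bit 16, a 9-bit src0 in the low bits of the second dword, with 256 added for VGPRs and 255 marking a trailing 32-bit literal) are assumptions inferred from the CHECK lines, not taken from an ISA manual, and `decode_e64` is a hypothetical helper name. DPP forms are not handled: their src0 field is returned as a raw value.

```python
# Sketch: decode vdst, opcode, and src0 from an _e64 encoding byte list as
# printed by llvm-mc -show-encoding. Field positions are assumptions inferred
# from the test's CHECK lines.
def decode_e64(encoding):
    dword0 = int.from_bytes(bytes(encoding[0:4]), "little")
    dword1 = int.from_bytes(bytes(encoding[4:8]), "little")

    vdst = dword0 & 0xFF            # destination VGPR index
    opcode = (dword0 >> 16) & 0x3FF  # assumed 10-bit opcode field
    src0 = dword1 & 0x1FF           # assumed 9-bit source operand field

    if src0 >= 256:
        src0_text = f"v{src0 - 256}"          # VGPRs are offset by 256
    elif src0 == 255 and len(encoding) >= 12:
        literal = int.from_bytes(bytes(encoding[8:12]), "little")
        src0_text = f"literal {literal:#x}"   # 32-bit literal follows
    else:
        src0_text = hex(src0)                 # raw value for everything else
    return vdst, opcode, src0_text
```

With these assumptions, the bytes checked for `v_ceil_f16 v5, v199` decode to destination 5 and source `v199`, and the bytes checked for `v_ceil_f16 v128, 0xfe0b` decode to destination 128 and a literal of `0xfe0b`, both consistent with the printed instructions.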

@llvmbot
Member

llvmbot commented Apr 16, 2025

@llvm/pr-subscribers-backend-amdgpu

+
+v_cos_f16 v255, ttmp15
+// GFX11: v_cos_f16_e64 v255, ttmp15              ; encoding: [0xff,0x00,0xe1,0xd5,0x7b,0x00,0x00,0x00]
+
+v_cos_f16 v255, v1
+// GFX11: v_cos_f16_e64 v255, v1                  ; encoding: [0xff,0x00,0xe1,0xd5,0x01,0x01,0x00,0x00]
+
+v_cos_f16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v255, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xe1,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_cos_f16 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v255, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xe1,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_cos_f16 v255, v127
+// GFX11: v_cos_f16_e64 v255, v127                ; encoding: [0xff,0x00,0xe1,0xd5,0x7f,0x01,0x00,0x00]
+
+v_cos_f16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v255, v127 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xe1,0xd5,0xe9,0x00,0x00,0x00,0x7f,0x77,0x39,0x05]
+
+v_cos_f16 v255, v127 quad_perm:[3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v255, v127 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xe1,0xd5,0xfa,0x00,0x00,0x00,0x7f,0x1b,0x00,0xff]
+
+v_cos_f16 v255, vcc_hi
+// GFX11: v_cos_f16_e64 v255, vcc_hi              ; encoding: [0xff,0x00,0xe1,0xd5,0x6b,0x00,0x00,0x00]
+
+v_cos_f16 v255, vcc_lo
+// GFX11: v_cos_f16_e64 v255, vcc_lo              ; encoding: [0xff,0x00,0xe1,0xd5,0x6a,0x00,0x00,0x00]
+
+v_cos_f16 v5, v199
+// GFX11: v_cos_f16_e64 v5, v199                  ; encoding: [0x05,0x00,0xe1,0xd5,0xc7,0x01,0x00,0x00]
+
+v_cos_f16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v5, v199 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xe1,0xd5,0xe9,0x00,0x00,0x00,0xc7,0x77,0x39,0x05]
+
+v_cos_f16 v5, v199 quad_perm:[3,2,1,0]
+// GFX11: v_cos_f16_e64_dpp v5, v199 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xe1,0xd5,0xfa,0x00,0x00,0x00,0xc7,0x1b,0x00,0xff]
+
+v_cvt_f16_f32 v128, 0xaf123456
+// GFX11: v_cvt_f16_f32_e64 v128, 0xaf123456      ; encoding: [0x80,0x00,0x8a,0xd5,0xff,0x00,0x00,0x00,0x56,0x34,0x12,0xaf]
+
+v_cvt_f16_f32 v255, -1
+// GFX11: v_cvt_f16_f32_e64 v255, -1              ; encoding: [0xff,0x00,0x8a,0xd5,0xc1,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, 0.5
+// GFX11: v_cvt_f16_f32_e64 v255, 0.5             ; encoding: [0xff,0x00,0x8a,0xd5,0xf0,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, exec_hi
+// GFX11: v_cvt_f16_f32_e64 v255, exec_hi         ; encoding: [0xff,0x00,0x8a,0xd5,0x7f,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, exec_lo
+// GFX11: v_cvt_f16_f32_e64 v255, exec_lo         ; encoding: [0xff,0x00,0x8a,0xd5,0x7e,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, m0
+// GFX11: v_cvt_f16_f32_e64 v255, m0              ; encoding: [0xff,0x00,0x8a,0xd5,0x7d,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, null
+// GFX11: v_cvt_f16_f32_e64 v255, null            ; encoding: [0xff,0x00,0x8a,0xd5,0x7c,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, s1
+// GFX11: v_cvt_f16_f32_e64 v255, s1              ; encoding: [0xff,0x00,0x8a,0xd5,0x01,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, s105
+// GFX11: v_cvt_f16_f32_e64 v255, s105            ; encoding: [0xff,0x00,0x8a,0xd5,0x69,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, src_scc
+// GFX11: v_cvt_f16_f32_e64 v255, src_scc         ; encoding: [0xff,0x00,0x8a,0xd5,0xfd,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, ttmp15
+// GFX11: v_cvt_f16_f32_e64 v255, ttmp15          ; encoding: [0xff,0x00,0x8a,0xd5,0x7b,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, v1
+// GFX11: v_cvt_f16_f32_e64 v255, v1              ; encoding: [0xff,0x00,0x8a,0xd5,0x01,0x01,0x00,0x00]
+
+v_cvt_f16_f32 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_f32_e64_dpp v255, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0x8a,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_cvt_f16_f32 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_f32_e64_dpp v255, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0x8a,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_cvt_f16_f32 v255, v255
+// GFX11: v_cvt_f16_f32_e64 v255, v255            ; encoding: [0xff,0x00,0x8a,0xd5,0xff,0x01,0x00,0x00]
+
+v_cvt_f16_f32 v255, v255 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_f32_e64_dpp v255, v255 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0x8a,0xd5,0xe9,0x00,0x00,0x00,0xff,0x77,0x39,0x05]
+
+v_cvt_f16_f32 v255, v255 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_f32_e64_dpp v255, v255 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0x8a,0xd5,0xfa,0x00,0x00,0x00,0xff,0x1b,0x00,0xff]
+
+v_cvt_f16_f32 v255, vcc_hi
+// GFX11: v_cvt_f16_f32_e64 v255, vcc_hi          ; encoding: [0xff,0x00,0x8a,0xd5,0x6b,0x00,0x00,0x00]
+
+v_cvt_f16_f32 v255, vcc_lo
+// GFX11: v_cvt_f16_f32_e64 v255, vcc_lo          ; encoding: [0xff,0x00,0x8a,0xd5,0x6a,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v128, 0xfe0b
+// GFX11: v_cvt_f16_i16_e64 v128, 0xfe0b          ; encoding: [0x80,0x00,0xd1,0xd5,0xff,0x00,0x00,0x00,0x0b,0xfe,0x00,0x00]
+
+v_cvt_f16_i16 v255, -1
+// GFX11: v_cvt_f16_i16_e64 v255, -1              ; encoding: [0xff,0x00,0xd1,0xd5,0xc1,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, 0.5
+// GFX11: v_cvt_f16_i16_e64 v255, 0.5             ; encoding: [0xff,0x00,0xd1,0xd5,0xf0,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, exec_hi
+// GFX11: v_cvt_f16_i16_e64 v255, exec_hi         ; encoding: [0xff,0x00,0xd1,0xd5,0x7f,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, exec_lo
+// GFX11: v_cvt_f16_i16_e64 v255, exec_lo         ; encoding: [0xff,0x00,0xd1,0xd5,0x7e,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, m0
+// GFX11: v_cvt_f16_i16_e64 v255, m0              ; encoding: [0xff,0x00,0xd1,0xd5,0x7d,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, null
+// GFX11: v_cvt_f16_i16_e64 v255, null            ; encoding: [0xff,0x00,0xd1,0xd5,0x7c,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, s1
+// GFX11: v_cvt_f16_i16_e64 v255, s1              ; encoding: [0xff,0x00,0xd1,0xd5,0x01,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, s105
+// GFX11: v_cvt_f16_i16_e64 v255, s105            ; encoding: [0xff,0x00,0xd1,0xd5,0x69,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, src_scc
+// GFX11: v_cvt_f16_i16_e64 v255, src_scc         ; encoding: [0xff,0x00,0xd1,0xd5,0xfd,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, ttmp15
+// GFX11: v_cvt_f16_i16_e64 v255, ttmp15          ; encoding: [0xff,0x00,0xd1,0xd5,0x7b,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, v1
+// GFX11: v_cvt_f16_i16_e64 v255, v1              ; encoding: [0xff,0x00,0xd1,0xd5,0x01,0x01,0x00,0x00]
+
+v_cvt_f16_i16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v255, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xd1,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_cvt_f16_i16 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v255, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xd1,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_cvt_f16_i16 v255, v127
+// GFX11: v_cvt_f16_i16_e64 v255, v127            ; encoding: [0xff,0x00,0xd1,0xd5,0x7f,0x01,0x00,0x00]
+
+v_cvt_f16_i16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v255, v127 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xd1,0xd5,0xe9,0x00,0x00,0x00,0x7f,0x77,0x39,0x05]
+
+v_cvt_f16_i16 v255, v127 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v255, v127 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xd1,0xd5,0xfa,0x00,0x00,0x00,0x7f,0x1b,0x00,0xff]
+
+v_cvt_f16_i16 v255, vcc_hi
+// GFX11: v_cvt_f16_i16_e64 v255, vcc_hi          ; encoding: [0xff,0x00,0xd1,0xd5,0x6b,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v255, vcc_lo
+// GFX11: v_cvt_f16_i16_e64 v255, vcc_lo          ; encoding: [0xff,0x00,0xd1,0xd5,0x6a,0x00,0x00,0x00]
+
+v_cvt_f16_i16 v5, v199
+// GFX11: v_cvt_f16_i16_e64 v5, v199              ; encoding: [0x05,0x00,0xd1,0xd5,0xc7,0x01,0x00,0x00]
+
+v_cvt_f16_i16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v5, v199 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xd1,0xd5,0xe9,0x00,0x00,0x00,0xc7,0x77,0x39,0x05]
+
+v_cvt_f16_i16 v5, v199 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_i16_e64_dpp v5, v199 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xd1,0xd5,0xfa,0x00,0x00,0x00,0xc7,0x1b,0x00,0xff]
+
+v_cvt_f16_u16 v128, 0xfe0b
+// GFX11: v_cvt_f16_u16_e64 v128, 0xfe0b          ; encoding: [0x80,0x00,0xd0,0xd5,0xff,0x00,0x00,0x00,0x0b,0xfe,0x00,0x00]
+
+v_cvt_f16_u16 v255, -1
+// GFX11: v_cvt_f16_u16_e64 v255, -1              ; encoding: [0xff,0x00,0xd0,0xd5,0xc1,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, 0.5
+// GFX11: v_cvt_f16_u16_e64 v255, 0.5             ; encoding: [0xff,0x00,0xd0,0xd5,0xf0,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, exec_hi
+// GFX11: v_cvt_f16_u16_e64 v255, exec_hi         ; encoding: [0xff,0x00,0xd0,0xd5,0x7f,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, exec_lo
+// GFX11: v_cvt_f16_u16_e64 v255, exec_lo         ; encoding: [0xff,0x00,0xd0,0xd5,0x7e,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, m0
+// GFX11: v_cvt_f16_u16_e64 v255, m0              ; encoding: [0xff,0x00,0xd0,0xd5,0x7d,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, null
+// GFX11: v_cvt_f16_u16_e64 v255, null            ; encoding: [0xff,0x00,0xd0,0xd5,0x7c,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, s1
+// GFX11: v_cvt_f16_u16_e64 v255, s1              ; encoding: [0xff,0x00,0xd0,0xd5,0x01,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, s105
+// GFX11: v_cvt_f16_u16_e64 v255, s105            ; encoding: [0xff,0x00,0xd0,0xd5,0x69,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, src_scc
+// GFX11: v_cvt_f16_u16_e64 v255, src_scc         ; encoding: [0xff,0x00,0xd0,0xd5,0xfd,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, ttmp15
+// GFX11: v_cvt_f16_u16_e64 v255, ttmp15          ; encoding: [0xff,0x00,0xd0,0xd5,0x7b,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, v1
+// GFX11: v_cvt_f16_u16_e64 v255, v1              ; encoding: [0xff,0x00,0xd0,0xd5,0x01,0x01,0x00,0x00]
+
+v_cvt_f16_u16 v255, v1 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v255, v1 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xd0,0xd5,0xe9,0x00,0x00,0x00,0x01,0x77,0x39,0x05]
+
+v_cvt_f16_u16 v255, v1 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v255, v1 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xd0,0xd5,0xfa,0x00,0x00,0x00,0x01,0x1b,0x00,0xff]
+
+v_cvt_f16_u16 v255, v127
+// GFX11: v_cvt_f16_u16_e64 v255, v127            ; encoding: [0xff,0x00,0xd0,0xd5,0x7f,0x01,0x00,0x00]
+
+v_cvt_f16_u16 v255, v127 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v255, v127 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0xff,0x00,0xd0,0xd5,0xe9,0x00,0x00,0x00,0x7f,0x77,0x39,0x05]
+
+v_cvt_f16_u16 v255, v127 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v255, v127 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0xff,0x00,0xd0,0xd5,0xfa,0x00,0x00,0x00,0x7f,0x1b,0x00,0xff]
+
+v_cvt_f16_u16 v255, vcc_hi
+// GFX11: v_cvt_f16_u16_e64 v255, vcc_hi          ; encoding: [0xff,0x00,0xd0,0xd5,0x6b,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v255, vcc_lo
+// GFX11: v_cvt_f16_u16_e64 v255, vcc_lo          ; encoding: [0xff,0x00,0xd0,0xd5,0x6a,0x00,0x00,0x00]
+
+v_cvt_f16_u16 v5, v199
+// GFX11: v_cvt_f16_u16_e64 v5, v199              ; encoding: [0x05,0x00,0xd0,0xd5,0xc7,0x01,0x00,0x00]
+
+v_cvt_f16_u16 v5, v199 dpp8:[7,6,5,4,3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v5, v199 dpp8:[7,6,5,4,3,2,1,0] ; encoding: [0x05,0x00,0xd0,0xd5,0xe9,0x00,0x00,0x00,0xc7,0x77,0x39,0x05]
+
+v_cvt_f16_u16 v5, v199 quad_perm:[3,2,1,0]
+// GFX11: v_cvt_f16_u16_e64_dpp v5, v199 quad_perm:[3,2,1,0] row_mask:0xf bank_mask:0xf ; encoding: [0x05,0x00,0xd0,0xd5,0xfa,0x00,0x00,0x00,0xc7,0x1b,0x00,0xff]
+
+v_cvt_f32_f16 v5, v199
+// GFX11: ...
[truncated]

@@ -1,89 +0,0 @@
// RUN: not llvm-mc -triple=amdgcn -mcpu=gfx1100 -mattr=-real-true16,+wavefrontsize32 -show-encoding %s 2>&1 | FileCheck --check-prefix=GFX11 --implicit-check-not=error: %s
Collaborator

Why rename this to gfx11_asm_vop1_fake16_err.s? The convention seems to be to add the -fake16 suffix to the name of the non-fake16 version of the test.

Contributor

broxigarchen and I discussed it briefly. These tests were part of the initial bring-up of GFX11, when we needed to disable registers over 128 on VOP12C, so the naming conventions were not well established at that time. The name is of course open to discussion. One factor is that having both t16 and fake16 in the test name is perhaps confusing.
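
To make the register restriction concrete, here is a minimal Python sketch of the rule these tests appear to exercise. It is a hypothetical model, not the assembler's actual logic: the name `pick_encoding` and the constant `MAX_16BIT_VGPR_E32` are made up for illustration, and the cutoff of v127 is an assumption inferred from the promote test, where `v_cos_f16 v128, 0xfe0b` is expected to assemble as `v_cos_f16_e64`.

```python
# Hypothetical model of the fake16 promotion rule; not LLVM code.
# Assumption: in the 32-bit form, a 16-bit VGPR operand can only name v0..v127.
MAX_16BIT_VGPR_E32 = 127


def pick_encoding(operands):
    """Return "e64" if any 16-bit VGPR operand is out of range for e32.

    operands is a list of (vgpr_index, is_16bit) pairs.
    """
    for index, is_16bit in operands:
        if is_16bit and index > MAX_16BIT_VGPR_E32:
            return "e64"
    return "e32"


# v_ceil_f16 v5, v199: the 16-bit source v199 is out of range.
print(pick_encoding([(5, True), (199, True)]))
# v_cvt_f16_f32 v255, v255: the 16-bit destination v255 is out of range.
print(pick_encoding([(255, True), (255, False)]))
```

Both calls print `e64` under this model, which matches the `_e64` mnemonics in the corresponding GFX11 check lines of the promote test.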

Contributor

@Sisyph Sisyph left a comment

Adding something to the commit message as a reminder that these tests are about promoting the instruction encoding to 64 bits if the used registers are not encodable in the 32-bit form would be helpful. LGTM.
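
As a rough illustration of what the promoted form carries, the following Python sketch decodes a few fields from one of the 8-byte encodings listed in the promote test. The field positions are an assumption based on the GFX11 VOP3 layout (vdst in bits 7:0, op in bits 25:16, src0 in bits 40:32, with src0 values 256 to 511 naming v0 to v255), and `decode_vop3` is a hypothetical helper, not part of LLVM.

```python
# Sketch: decode a 64-bit VOP3 encoding as printed by llvm-mc -show-encoding.
# Assumed layout: vdst = bits 7:0, op = bits 25:16, src0 = bits 40:32.
# Assumption: src0 values 256..511 name VGPRs v0..v255.


def decode_vop3(encoding):
    """Decode vdst, op and src0 from the first 8 bytes of an encoding list."""
    word = int.from_bytes(bytes(encoding[:8]), "little")
    vdst = word & 0xFF
    op = (word >> 16) & 0x3FF
    src0 = (word >> 32) & 0x1FF
    if src0 >= 256:
        src0_name = f"v{src0 - 256}"
    else:
        src0_name = f"operand code {src0:#x}"
    return {"vdst": f"v{vdst}", "op": op, "src0": src0_name}


# Expected encoding of `v_ceil_f16 v5, v199` from the promote test.
fields = decode_vop3([0x05, 0x00, 0xDC, 0xD5, 0xC7, 0x01, 0x00, 0x00])
print(fields)
```

Under these assumptions the 9-bit src0 field decodes to `v199`, a register that the test expects to force the `_e64` form.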

@broxigarchen
Contributor Author

Adding something to the commit message as a reminder that these tests are about promoting the instruction encoding to 64 bits if the used registers are not encodable in the 32-bit form would be helpful. LGTM.

done

@broxigarchen broxigarchen merged commit 343c784 into llvm:main Apr 28, 2025
16 checks passed
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
This is an NFC patch.

Added error and promote tests for the fake16 flow. This includes two parts:
1. "*vop1_t16_err-fake16.s" is renamed to "*vop1_fake16_err.s"
2. added missing "fake16-promote.s" and other "*fake16_err.s" files

These tests are about promoting the instruction encoding to 64 bits if
the used registers are not encodable in the 32-bit form.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request May 9, 2025