AMDGPU: Fix foldImmediate breaking register class constraints #127481

Merged
11 changes: 8 additions & 3 deletions llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -3473,14 +3473,19 @@ bool SIInstrInfo::foldImmediate(MachineInstr &UseMI, MachineInstr &DefMI,
       assert(UseMI.getOperand(1).getReg().isVirtual());
     }

+    MachineFunction *MF = UseMI.getMF();
     const MCInstrDesc &NewMCID = get(NewOpc);
-    if (DstReg.isPhysical() &&
-        !RI.getRegClass(NewMCID.operands()[0].RegClass)->contains(DstReg))
+    const TargetRegisterClass *NewDefRC = getRegClass(NewMCID, 0, &RI, *MF);
+
+    if (DstReg.isPhysical()) {
+      if (!NewDefRC->contains(DstReg))
+        return false;
+    } else if (!MRI->constrainRegClass(DstReg, NewDefRC))
       return false;

     UseMI.setDesc(NewMCID);
     UseMI.getOperand(1).ChangeToImmediate(Imm.getSExtValue());
-    UseMI.addImplicitDefUseOperands(*UseMI.getParent()->getParent());
+    UseMI.addImplicitDefUseOperands(*MF);
     return true;
   }
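
The destination check in this hunk can be illustrated outside of LLVM with a toy model. This is only a sketch under stated assumptions: `ToyRegClass`, `ToyReg`, `toyContains`, `toyConstrain`, and `toyCanRewriteDef` are hypothetical names invented for illustration, and a register class is modeled as a plain bitmask of eight made-up physical registers rather than LLVM's `TargetRegisterClass`. The shape mirrors the new code: a physical destination must already be a member of the class required by the new opcode's def operand, while a virtual destination is narrowed to a common class, and the fold is rejected if no such class exists.

```cpp
#include <cassert>

// Toy register class: bit i set means toy physical register i is a member.
// This is an illustrative stand-in, not LLVM's TargetRegisterClass.
using ToyRegClass = unsigned;

constexpr ToyRegClass ToyVReg64 = 0xFFu;       // any starting register
constexpr ToyRegClass ToyVReg64Align2 = 0x55u; // even starting registers only

struct ToyReg {
  bool IsPhysical;
  unsigned PhysIndex;    // meaningful only when IsPhysical is true
  ToyRegClass VirtClass; // meaningful only when IsPhysical is false
};

// Membership test for a physical register.
bool toyContains(ToyRegClass RC, unsigned PhysIndex) {
  return PhysIndex < 8 && ((RC >> PhysIndex) & 1u) != 0;
}

// Narrow a virtual register to the intersection of its current class and the
// required class. Fails, leaving the register untouched, if they are disjoint.
bool toyConstrain(ToyReg &R, ToyRegClass Required) {
  ToyRegClass Common = R.VirtClass & Required;
  if (Common == 0)
    return false;
  R.VirtClass = Common;
  return true;
}

// Same branching shape as the patched check: physical registers are only
// tested for membership, virtual registers are constrained.
bool toyCanRewriteDef(ToyReg &Dst, ToyRegClass NewDefRC) {
  if (Dst.IsPhysical)
    return toyContains(NewDefRC, Dst.PhysIndex);
  return toyConstrain(Dst, NewDefRC);
}
```

In this model, a virtual register that starts in `ToyVReg64` ends up in `ToyVReg64Align2` when the new def operand requires alignment, which is the same narrowing the updated test expectations show for `%1`.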

43 changes: 24 additions & 19 deletions llvm/test/CodeGen/AMDGPU/peephole-fold-imm.mir
@@ -419,25 +419,30 @@ body: |

 ...

-# FIXME:
-# ---
-# name: fold_v_mov_b64_64_to_unaligned
-# body: |
-#   bb.0:
-#     %0:vreg_64_align2 = V_MOV_B64_e32 1311768467750121200, implicit $exec
-#     %1:vreg_64 = COPY killed %0
-#     SI_RETURN_TO_EPILOG implicit %1
-# ...
-
-# FIXME:
-# ---
-# name: fold_v_mov_b64_pseudo_64_to_unaligned
-# body: |
-#   bb.0:
-#     %0:vreg_64_align2 = V_MOV_B64_PSEUDO 1311768467750121200, implicit $exec
-#     %1:vreg_64 = COPY killed %0
-#     SI_RETURN_TO_EPILOG implicit %1
-# ...
+---
+name: fold_v_mov_b64_64_to_unaligned
+body: |
+  bb.0:
+    ; GCN-LABEL: name: fold_v_mov_b64_64_to_unaligned
+    ; GCN: [[V_MOV_B64_e32_:%[0-9]+]]:vreg_64_align2 = V_MOV_B64_e32 1311768467750121200, implicit $exec
+    ; GCN-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64_align2 = V_MOV_B64_PSEUDO 1311768467750121200, implicit $exec
Collaborator

%1 has been forced to *_align2 after the transformation. I assume this is because the source register in the original copy was *_align2.

Contributor Author

yes

+    ; GCN-NEXT: SI_RETURN_TO_EPILOG implicit [[V_MOV_B]]
+    %0:vreg_64_align2 = V_MOV_B64_e32 1311768467750121200, implicit $exec
+    %1:vreg_64 = COPY killed %0
+    SI_RETURN_TO_EPILOG implicit %1
+...
+
+---
+name: fold_v_mov_b64_pseudo_64_to_unaligned
+body: |
+  bb.0:
+    ; GCN-LABEL: name: fold_v_mov_b64_pseudo_64_to_unaligned
+    ; GCN: [[V_MOV_B:%[0-9]+]]:vreg_64_align2 = V_MOV_B64_PSEUDO 1311768467750121200, implicit $exec
+    ; GCN-NEXT: SI_RETURN_TO_EPILOG implicit [[V_MOV_B]]
+    %0:vreg_64_align2 = V_MOV_B64_PSEUDO 1311768467750121200, implicit $exec
+    %1:vreg_64 = COPY killed %0
+    SI_RETURN_TO_EPILOG implicit %1
+...

---
name: fold_s_brev_b32_simm_virtual_0