Skip to content

[MachineInstr] add insert method for variadic instructions #67699

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Oct 30, 2023

Conversation

nickdesaulniers
Copy link
Member

@nickdesaulniers nickdesaulniers commented Sep 28, 2023

As alluded to in #20571, it would be nice if we could mutate operand
lists of MachineInstr's more safely. Add an insert method that together
with removeOperand allows for easier splicing of operands.

Splitting this patch off early to get feedback; I need to either:

  • mutate an INLINEASM{_BR} MachinInstr's MachineOperands from being
    registers (physical or virtual) to memory
    (MachineOperandType::MO_FrameIndex). These are not 1:1 operand
    replacements, but N:M operand replacements. i.e. we need to
    update 2 MachineOperands into the middle of the operand list to 5 (at
    least for x86_64).
  • copy, modify, write a new MachineInstr which has its relevant operands
    replaced.

Either approaches are hazarded by existing references to either the
operands being moved, or the instruction being removed+replaced. For my
purposes in regalloc, either seem to work for me, so hopefully reviewers
can help me determine which approach is preferable. The second would
involve no new methods on MachineInstr.

One question I had while looking at this was: "why does MachineInstr
have BOTH a NumOperands member AND a MCInstrDesc member that itself has
a NumOperands member? How many operands can a MachineInstr have? Do I
need to update BOTH (keeping them in sync)?" FWICT, only "variadic"
MachineInstrs have MCInstrDesc with NumOperands (of the MCInstrDesc) set
to zero. If the MCInstrDesc's NumOperands is non-zero, then the NumOperands
on the MachineInstr itself cannot exceed this value (IIUC) else an assert will
be triggered.

For most non-psuedo instructions (or at least non-varidic instructions),
insert is less likely to be useful.

To run the newly added unittest:
$ pushd llvm/build; ninja CodeGenTests; popd
$ ./llvm/build/unittests/CodeGen/CodeGenTests
--gtest_filter=MachineInstrTest.SpliceOperands

This is meant to mirror MCInst::insert.

As alluded to in llvm#20571, it would be nice if we could mutate operand
lists of MachineInstr's more safely. Add an insert method that together
with removeOperand allows for easier splicing of operands.

Splitting this patch off early to get feedback; I need to either:
- mutate an INLINEASM{_BR} MachinInstr's MachineOperands from being
  registers (physical or virtual) to memory
  (MachineOperandType::MO_FrameIndex).
- copy, modify, write a new MachineInstr which has its relevant operands
  replaced.

Either approaches are hazarded by existing references to either the
operands being moved, or the instruction being removed+replaced. For my
purposes in regalloc, either seem to work for me, so hopefully reviewers
can help me determine which approach is preferable. The second would
involve no new methods on MachineInstr.

One question I had while looking at this was: "why does MachineInstr
have BOTH a NumOperands member AND a MCInstrDesc member that itself has
a NumOperands member? How many operands can a MachineInstr have? Do I
need to update BOTH (keeping them in sync)?" FWICT, only "variadic"
MachineInstrs have MCInstrDesc with NumOperands (of the MCInstrDesc) set
to zero.  If the MCInstrDesc is non-zero, then the NumOperands on the
MachineInstr itself cannot exceed this value (IIUC) else an assert will
be triggered.

For most non-psuedo instructions (or at least non-varidic instructions),
insert is less likely to be useful.

To run the newly added unittest:
    $ pushd llvm/build; ninja CodeGenTests; popd
    $ ./llvm/build/unittests/CodeGen/CodeGenTests \
        --gtest_filter=MachineInstrTest.SpliceOperands
@qcolombet
Copy link
Collaborator

One question I had while looking at this was: "why does MachineInstr
have BOTH a NumOperands member AND a MCInstrDesc member that itself has
a NumOperands member?

The short answer is:
MCInstrDesc holds what is known statically.
NumOperands holds how many actual operands there are attached to this particular MachineInstr at compile time.

E.g., consider a PHI instruction.
Its MCInstrDesc is going to describe the number of operands we must have and that's 1 (the definition of the PHI).
Now the NumOperands of the related instantiation of a PHI will report the definition and all the arguments.

Generally speaking, there is a guarantee that MI's NumOperands >= MCInstrDesc's NumOperands.
(See

if (MI->getNumOperands() < MCID.getNumOperands()) {
)

How many operands can a MachineInstr have?

Unlimited :).

Do I
need to update BOTH (keeping them in sync)?"

MCInstrDesc's NumOperands cannot be changed, so if you need to keep something in sync that means you need to set a different MCInstrDesc on the related MI. (E.g., go from ADDI32rr to ADDI32rm.)

MI's NumOperands is updated automatically through the calls to removeOperand and addOperand.

FWICT, only "variadic"
MachineInstrs have MCInstrDesc with NumOperands (of the MCInstrDesc) set
to zero. If the MCInstrDesc's NumOperands is non-zero, then the NumOperands
on the MachineInstr itself cannot exceed this value (IIUC) else an assert will
be triggered.

Hmm that doesn't sound right.
Where do you see this assert?

IIRC the only thing we check is that the number of definitions and the number of arguments are at least as big as what MCInstrDesc requires.
Then, of course, there's a bunch of other checks around register classes and whatnot, but with respect to the number of operands, that's about it I think.

Long story short, for inline asm, I don't think you have to worry about NumOperands.

Copy link
Contributor

@MatzeB MatzeB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding an API to insert operands in the middle of the operand list seems fine. I'd just leave semantics like implicit, variadic etc. out of it (and make that a problem for callers and verifiers).

@nickdesaulniers
Copy link
Member Author

Marking WIP as I've found one hazard with this interface.

Tied operands don't like being removed. We need to untie them, then re-tie them around the operands list updates.

@nickdesaulniers nickdesaulniers marked this pull request as ready for review October 27, 2023 22:05
@nickdesaulniers
Copy link
Member Author

Sorry for the delays. Was "wrastling" with register allocators this past week. This is ready for review.


for (unsigned I = 0; I < OpsToMove; ++I) {
MovingOps.emplace_back(getOperand(OpIdx));
removeOperand(OpIdx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe do this in reverse order so we waste less cycles on the moveOperands() call within MachineInstr::removeOperand(). On that note I wonder if moveOperands() would work here as well instead of the removeOperand / addOperand for MovingOps here?

Admittedly it looks like moveOperands() doesn't really adjust the TiedTo markers so that's why it wouldn't work here?

Copy link
Member Author

@nickdesaulniers nickdesaulniers Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On that note I wonder if moveOperands() would work here as well instead of the removeOperand / addOperand for MovingOps here?

The issue there is we would miss the calls to MachineFunction::allocateOperandArray from MachineInstr::addOperand. Since splice grows the operand list, we need to realloc the storage for the operands. MachineInstr::moveOperands just memmoves operands around which is not safe to do when the backing store hasn't been grown.

Admittedly it looks like moveOperands() doesn't really adjust the TiedTo markers so that's why it wouldn't work here?

That's also an issue, though I'm manually handling tied operands.

Maybe do this in reverse order so we waste less cycles on the moveOperands() call within MachineInstr::removeOperand().

I don't think there are any cycles to be saved? Perhaps if splice is being used to append operands on the end? Usually you splice operands into the middle, so you need to memmove the elements occurring after the insertion point multiple times.

Notice in this loop that the induction variable I is not being used as an index; we're removing OpIdx repeatedly as this is slicing off the elements past the insertion point.

Later, we need to re-add the operands back in the correct order (though it's not hard to reverse the loop below).

Did I perhaps miss something though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense. I haven't looked at the code as deeply as you have; just wanted to ask the questions to make sure :)

@nickdesaulniers
Copy link
Member Author

nickdesaulniers commented Oct 30, 2023

Spooky, looks like my newly added tests are failing in presubmit.

When I run them locally via:

$ ./llvm/build/unittests/CodeGen/CodeGenTests --gtest_filter=MachineInstrTest.SpliceOperands

I observe no issues. When I run them via:

$ ./llvm/build/unittests/CodeGen/CodeGenTests

sometimes I observe a segfault, sometimes I observe

CodeGenTests: /android0/llvm-project/llvm/include/llvm/CodeGen/MachineOperand.h:370: Register llvm::MachineOperand::getReg() const: Assertion `isReg() && "This is not a register operand!"' failed.

or:

CodeGenTests: /android0/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:281: void llvm::MachineRegisterInfo::addRegOperandToUseList(MachineOperand *): Assertion `MO->getReg() == Head->getReg() && "Different regs on the same list!"' failed.

Is there some kind of race on MachineRegisterInfo for these tests?

@nickdesaulniers nickdesaulniers marked this pull request as draft October 30, 2023 16:19
Copy link
Contributor

@MatzeB MatzeB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


for (unsigned I = 0; I < OpsToMove; ++I) {
MovingOps.emplace_back(getOperand(OpIdx));
removeOperand(OpIdx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense. I haven't looked at the code as deeply as you have; just wanted to ask the questions to make sure :)

@nickdesaulniers
Copy link
Member Author

Spooky, looks like my newly added tests are failing in presubmit.

Still a draft because of this. I wonder if my non-MachineInstrBuilder raw usage of the MachineInstr is problematic.

FWICT, these two lines will sometimes crash (sometimes depends which):

MI->getOperand(0).ChangeToRegister(1, /*isDef=*/true);                                                                                     
MI->getOperand(1).ChangeToRegister(2, /*isDef=*/false);

I wonder if -DLLVM_ENABLE_ASSERTIONS=ON isn't being applied to the unittests?

@MatzeB
Copy link
Contributor

MatzeB commented Oct 30, 2023

I'd be extremely surprised if we mess up the assertion setting for unit tests.

Have you tried adding a call to verifyMachineFunction() yet? Sometimes that uncovers more problems...

@nickdesaulniers nickdesaulniers marked this pull request as ready for review October 30, 2023 18:47
@nickdesaulniers
Copy link
Member Author

Is there some kind of race on MachineRegisterInfo for these tests?

fixed in 8449626, will wait for a green presubmit run before submitting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants