-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[TargetInstrInfo] enable foldMemoryOperand for InlineAsm #70743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it generally correct to untie an operand?
I don't know how the
tie
constraints are set for inline asm so sorry if the question is dumb!Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only registers may be tied. Note how
MachineInstr::untieRegOperand()
does nothing if!MachineOperand::isReg()
and callsMachineOperand::isTied()
which asserts if!isReg()
.Because we are folding register
MachineOperand
s into Frame Indices, we must untie them, otherwise we will assert inMachineOperand::ChangeToFrameIndex
(line 582). Many of theMachineOperand::ChangeTo*
methods assert if the input is a tied register and you try to change them (without manually untieing them first).This is also the reason why I had to "remember" which operands were tied when splicing new ones in a41b149 ; if you splice new operands in the middle, you need to remember the existing ties, untie them, move them, insert new operands, then manually retie them.
Additional context on what even is a tied operand in the context of inline asm.
For inline asm, you can have inputs, outputs, in/outs, or 2 variables where one is an input, one is an output, but for the sake of register allocation the location chosen for the inline asm is one and the same.
x
is an output,y
is an input. But they will be allocated the same storage location for the duration of the inline asm;%0
will be equivalent to%1
.I don't know when that's ever preferable to just using one variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, what I'm saying is tied operands and untied operands don't have the same semantic and I was wondering if this code could break that.
What I was thinking is, with tied operand, we'll have something like:
When we untie and fold, we'll end up with:
At this point,
a
and@a
don't hold the same value, so we broke the tied operand semantic.Now, I think this is not a problem for regular code because if
@a
is used somewhere else, regalloc should have produced the correct store operation, i.e.,My concern is around inline asm that regalloc cannot unbox.
What happens if the inline asm statement is a sequence of instructions:
That we turn into:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: A safe temporary solution could be to disallow the folding (i.e., use the
m
variant only) for tied operands.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that we call this recursively on the tied operand, so we would produce:
So in theory this is fine.
In practice, I don't believe this works all the time because IIRC, for instance, x86 cannot have a memory operand both as definition and as argument of the same instruction.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. Support for tied operands is table stakes for resolving #20571 because clang will lower "+" modifiers ("in+out" variables) to tied operands. We need to support:
so 3 is table stakes.
If you have tied operands for inline asm, you may have 2 distinct variables (one as input, one as output), but for the duration of the inline asm blob, they have 1 (the same) storage location (register or stack slot). So in the inline asm, you'd only refer to the 1 storage location (be it register or memory) and not both values. So it doesn't matter if x86 cannot have a memory def and use as operands on the same instr. My example above wasn't clear in that regard, perhaps another example:
in the case,
%1
is not referenced in the inline asm, only%0
is. But if it were, it would have the same value as%0
because that's what tying is (at least my understanding of it, which may be wrong).That is correct but I think only so for register operands. If you have 2 memory operands which refer to the same stack slot, they are in effect tied. IIUC, tied operands only matter for the purpose of register allocation. If an input value is tied to an output value, tieing them just means "put them in the same place (be that a register or memory location)."
So in MIR if we have:
(i.e.
%0
is tied to%1
) then that means the developer is asking the register allocator "please assign%0
and%1
to the same storage location (be that register or stack slot) for the duration of the INLINEASM blob.Given that example as input, after my change (this change), we first notice that
%0
is tied, so we untie it, and transform the above (the recursive call) to:which replaces
%1
(a tied register def) with a frame index. Then upon return, we continue transforming%0
replacing that with the same frame index (in this case%stack.o
):(this is better reflected in the test added in the child patch in #70832)
Yes! Essentially if the output is tied to the input, then
@a = @a + ...
is produced. If it's not tied, then you'd geta = @a + ...
which is maybe what you want (for an input or output, but not an in/out).It's still possible that I'm completely misunderstanding something about tied operands; I hope the above cleared up some questions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TL;DR Sounds fine to me, we're not creating new problems.
Agree on all what you're saying but my point is when we replace operands with memory locations, the resulting inline assembly may not be correct or put differently, I don't see where we check that the resulting asm is correct.
I actually realize now that this is not specific to tied operands.
E.g., consider the following inline asm constraints on x86 (with
someop
being something that makes sense)The resulting code will be fine as long as regalloc spills at most one operand. If it spills two or more, we'll have more than one memory location on
someop
. Assuming someop is a single x86 operation, I believe this will produce something invalid.With tied operands, I believe we just make this more likely to happen, but then considering that previously we were just using the
m
constraint in these cases, we already had this issue anyway.Therefore, nothing to worry about, I guess this is the responsibility of the developers to set something that make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, but for that case, I would say that the inline asm is invalid and should only use
m
on one or two of the three operands:That's just another instance of "the constraints were wrongly specified, but you didn't notice until there was register pressure (and the compiler was forced to choose m rather than r)." And that's not specific to clang or GCC.
A lot of the cases you're thinking of seem to be related to any given single x86 instruction within inline asm likely not supporting more than 1 (or is it 2) memory operands. But for tied operands, you likely have more than one instruction in your inline asm, so that's less likely to be problematic.
so if we can't allocate
output
to a register, andinput
is tied, the single stack slot chosen for the both of them is not an issue, since they are used in different instructions (which happen to support register or memory for those operands).My point is that the use of tied operands is likely to be with inline asm w/ more than one instruction, so the limitations of most modern CISC ISAs (is x86 modern...well, relative to other CISC) around how many memory operands they can have I think is not a concern. (If it is, because only one instruction is being used, then it's probably a mistake for the programmer to specify "rm" rather than "r").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the confirmation.
Agree on everything you said :).
(And already approved the PR)