-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[RISCV][CostModel] Make VMV_S_* and VMV_*_S cost independent of LMUL #78739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -84,6 +84,10 @@ RISCVTTIImpl::getRISCVInstructionCost(ArrayRef<unsigned> OpCodes, MVT VT, | |
Cost += VL; | ||
break; | ||
} | ||
case RISCV::VMV_X_S: | ||
case RISCV::VMV_S_X: | ||
Cost += 1; | ||
break; | ||
default: | ||
Cost += LMULCost; | ||
} | ||
|
@@ -446,7 +450,7 @@ InstructionCost RISCVTTIImpl::getShuffleCost(TTI::ShuffleKind Kind, | |
// should be a very small constant for the constant pool load. As such, | ||
// we may bias towards large selects slightly more than truely warranted. | ||
return LT.first * | ||
(2 + getRISCVInstructionCost({RISCV::VMERGE_VVM}, | ||
(1 + getRISCVInstructionCost({RISCV::VMV_S_X, RISCV::VMERGE_VVM}, | ||
LT.second, CostKind)); | ||
} | ||
case TTI::SK_Broadcast: { | ||
|
@@ -475,10 +479,9 @@ InstructionCost RISCVTTIImpl::getShuffleCost(TTI::ShuffleKind Kind, | |
|
||
return LT.first * | ||
(TLI->getLMULCost(LT.second) + // FIXME: this should be 1 for andi | ||
TLI->getLMULCost( | ||
LT.second) + // FIXME: vmv.x.s is the same as extractelement | ||
getRISCVInstructionCost({RISCV::VMV_V_I, RISCV::VMERGE_VIM, | ||
RISCV::VMV_V_X, RISCV::VMSNE_VI}, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm, one thing I'm noticing. I think the example code sequence above is only for short fixed vectors. For that to work on scalable vectors, I think we need a slide in there somewhere. Or, said alternatively, that remaining LMUL scaling is probably modeling the vslidedown more than the andi for scalable vectors and we need to adjust our example and comments slightly. (Not directly relevant to your change, more the existing code structure.) |
||
RISCV::VMV_X_S, RISCV::VMV_V_X, | ||
RISCV::VMSNE_VI}, | ||
LT.second, CostKind)); | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Off topic for the actual change, but I think we can improve the code sequence above.
One idea would be to use a scalar multiply to do the broadcast since we know the value will either be 0 or (int16)-1.
Another would be to use a vrgather.vi in place of the extract/splat sequence.
Another would be to extract the full16 bit value, do the extract and splat in scalar, and insert a full 16 bit value back.
Not sure if you care about this case or not, I don't much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood! Let's consider revisiting it when it becomes more relevant!