Skip to content

[SimplifyCFG] Use hash map to continue hoisting the common instructions #78615

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
260 changes: 194 additions & 66 deletions llvm/lib/Transforms/Utils/SimplifyCFG.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,11 @@ static cl::opt<unsigned>
cl::desc("Allow reordering across at most this many "
"instructions when hoisting"));

static cl::opt<unsigned> BBSizeHoistLimit(
"simplifycfg-bb-size-hoist-limit", cl::Hidden, cl::init(40),
cl::desc("Allow hoisting instructions only if the smallest "
"basicblock has fewer instructions than this limit"));

static cl::opt<bool>
SinkCommon("simplifycfg-sink-common", cl::Hidden, cl::init(true),
cl::desc("Sink common instructions down to the end block"));
Expand Down Expand Up @@ -1582,19 +1587,28 @@ hoistLockstepIdenticalDPValues(Instruction *TI, Instruction *I1,
}
}

// Hash instructions based on following factors:
// 1- Instruction Opcode
// 2- Instruction type
// 3- Instruction operands
llvm::hash_code getHash(Instruction *Instr) {
return llvm::hash_combine(
Instr->getOpcode(), Instr->getType(),
hash_combine_range(Instr->value_op_begin(), Instr->value_op_end()));
}

/// Hoist any common code in the successor blocks up into the block. This
/// function guarantees that BB dominates all successors. If EqTermsOnly is
/// given, only perform hoisting in case both blocks only contain a terminator.
/// In that case, only the original BI will be replaced and selects for PHIs are
/// added.
bool SimplifyCFGOpt::hoistCommonCodeFromSuccessors(BasicBlock *BB,
bool EqTermsOnly) {
// This does very trivial matching, with limited scanning, to find identical
// instructions in the two blocks. In particular, we don't want to get into
// O(N1*N2*...) situations here where Ni are the sizes of these successors. As
// such, we currently just scan for obviously identical instructions in an
// identical order, possibly separated by the same number of non-identical
// instructions.
// We first sort successors based on the number of instructions each block
// holds. Then for each successor we make a hashmap from its instructions,
// except for the first successor. After that, we iterate over the
// instructions of the first successor. If we find identical instructions from
// every other successor, we hoist all of them into the predeccessor.
unsigned int SuccSize = succ_size(BB);
if (SuccSize < 2)
return false;
Expand All @@ -1608,16 +1622,42 @@ bool SimplifyCFGOpt::hoistCommonCodeFromSuccessors(BasicBlock *BB,

auto *TI = BB->getTerminator();

SmallVector<BasicBlock *, 8> SuccessorBBs;
for (auto *Succ : successors(BB)) {
// If we find an unreachable instruction at the beginning of a basic block,
// we can still hoist instructions from the rest of the basic blocks.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put the hasAddressTaken check in this loop?

if (isa<UnreachableInst>(Succ->getFirstNonPHIOrDbg()))
continue;
SuccessorBBs.push_back(Succ);
}

// Find the smallest BB because we always want to iterate over instructions
// of the smallest Successor.
auto *SmallestBBPos =
std::min_element(SuccessorBBs.begin(), SuccessorBBs.end(),
[](BasicBlock *BB1, BasicBlock *BB2) {
return BB1->size() < BB2->size();
});

std::iter_swap(SuccessorBBs.begin(), SmallestBBPos);
BasicBlock *SmallestBB = *SmallestBBPos;

if (SmallestBB->size() > BBSizeHoistLimit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These size() calls should be sizeWithoutDebug(), otherwise the debug invariance property will be violated.

return false;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we set a limit on the size of SmallestBB? It should be a bit larger than HoistCommonSkipLimit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this going to regress existing behavior though? Currently basic hoisting will work independently of block size and it seems undesirable to lose that entirely. Or do you think that hoisting for large blocks will not matter?

I do agree that we need to avoid needlessly hashing large blocks, but I don't think we can do this using a simple block size limit. One possibility would be to compute hashes up to N instructions ahead rather than doing it all upfront.

// The second of pair is a SkipFlags bitmask.
using SuccIterPair = std::pair<BasicBlock::iterator, unsigned>;
SmallVector<SuccIterPair, 8> SuccIterPairs;
for (auto *Succ : successors(BB)) {
for (auto *Succ : SuccessorBBs) {
BasicBlock::iterator SuccItr = Succ->begin();
if (isa<PHINode>(*SuccItr))
return false;
SuccIterPairs.push_back(SuccIterPair(SuccItr, 0));
}

if (SuccIterPairs.size() < 2)
return false;

// Check if only hoisting terminators is allowed. This does not add new
// instructions to the hoist location.
if (EqTermsOnly) {
Expand All @@ -1635,87 +1675,149 @@ bool SimplifyCFGOpt::hoistCommonCodeFromSuccessors(BasicBlock *BB,
// many instructions we skip, serving as a compilation time control as well as
// preventing excessive increase of life ranges.
unsigned NumSkipped = 0;
// If we find an unreachable instruction at the beginning of a basic block, we
// can still hoist instructions from the rest of the basic blocks.
if (SuccIterPairs.size() > 2) {
erase_if(SuccIterPairs,
[](const auto &Pair) { return isa<UnreachableInst>(Pair.first); });
if (SuccIterPairs.size() < 2)
return false;
}

bool Changed = false;
auto *SuccIterPairBegin = SuccIterPairs.begin();
++SuccIterPairBegin;
auto BBItrPair = *SuccIterPairBegin++;
auto OtherSuccIterPairRange =
iterator_range(SuccIterPairBegin, SuccIterPairs.end());

DenseMap<llvm::hash_code, SmallVector<Instruction *, 2>> OtherSuccessorsHash;
DenseMap<Instruction *, unsigned> InstToSkipFlag;

unsigned skipFlag = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unsigned skipFlag = 0;
unsigned SkipFlag = 0;

Variables should start with an uppercase letter.

Instruction *I = nullptr;
do {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just iterating over the BB with a for(Instruction *I:BBItrPair.first) { ... if(I->isTerminator()) break; } is a lot more clear.

I = &*BBItrPair.first;
auto HashValue = getHash(I);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto HashValue = getHash(I);
hash_code HashValue = getHash(I);

skipFlag |= skippedInstrFlags(I);
// For the first successor we created hashmap for, put all instructions
// in the hashmap execept for the ones that have the same hash as some
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// in the hashmap execept for the ones that have the same hash as some
// in the hashmap except for the ones that have the same hash as some

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that this implementation only supports hoisting the first instruction with a given hash? If there are multiple instructions with the same hash, then the later ones will not be hoisted?

If so, this is going to cause non-determinism. The hash depends on address values, so you will see different hash collisions between different runs and depending on that different instructions will get hoisted.

The more proper way to handle this would be to use a custom DenseMapInfo where the hash it determined as you do, but isEqual() still checks the full instruction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, the current implementation could have non-deterministic behavior in some very corner cases. I can fix it by using DenseMapInfo as you explained.

// previous instruction in that BB.
if (OtherSuccessorsHash.find(HashValue) == OtherSuccessorsHash.end()) {
OtherSuccessorsHash[HashValue] = {I};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use try_emplace to avoid duplicate hash lookup.

InstToSkipFlag[I] = skipFlag;
}
BBItrPair.first++;
} while (!I->isTerminator());

unsigned Index = 1;
for (auto BBItrPair : OtherSuccIterPairRange) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remain this to avoid shadow variable?

// Fill the hashmap for every other successor
unsigned skipFlag = 0;
Copy link
Contributor

@goldsteinn goldsteinn Mar 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise shadow variable here. Maybe just add a scope to the above loop?

Instruction *I = nullptr;
do {
I = &*BBItrPair.first;
auto HashValue = getHash(I);
skipFlag |= skippedInstrFlags(I);
auto &InstVec = OtherSuccessorsHash[HashValue];
// For other successors put the instrcution in the map only if there are
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// For other successors put the instrcution in the map only if there are
// For other successors put the instruction in the map only if there are

// instructions with the same hash from other successors and this is the
// first instruction with this hash value from current successor.
if (OtherSuccessorsHash.find(HashValue) != OtherSuccessorsHash.end() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check doesn't make sense to me, your InstVec access will force this to always exist.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly you meant to access InstVec only inside the condition?

InstVec.size() == Index) {
InstVec.push_back(I);
InstToSkipFlag[I] = skipFlag;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that InstToSkipFlag will effectively hold the skip flags for the last block, as you are always overwriting them here. That doesn't sound correct?

}
BBItrPair.first++;
} while (!I->isTerminator());
Index++;
}

// Keep track of instructions skipped in the first successor
unsigned SkipFlagsBB1 = 0;
bool SameLevelHoist = true;
for (;;) {
auto *SuccIterPairBegin = SuccIterPairs.begin();
auto &BB1ItrPair = *SuccIterPairBegin++;
auto OtherSuccIterPairRange =
iterator_range(SuccIterPairBegin, SuccIterPairs.end());
auto OtherSuccIterRange = make_first_range(OtherSuccIterPairRange);

Instruction *I1 = &*BB1ItrPair.first;

// Skip debug info if it is not identical.
bool AllDbgInstsAreIdentical = all_of(OtherSuccIterRange, [I1](auto &Iter) {
bool IdenticalDebugs = all_of(OtherSuccIterRange, [I1](auto &Iter) {
Instruction *I2 = &*Iter;
return I1->isIdenticalToWhenDefined(I2);
});
if (!AllDbgInstsAreIdentical) {
if (!IdenticalDebugs) {
while (isa<DbgInfoIntrinsic>(I1))
I1 = &*++BB1ItrPair.first;
for (auto &SuccIter : OtherSuccIterRange) {
Instruction *I2 = &*SuccIter;
while (isa<DbgInfoIntrinsic>(I2))
I2 = &*++SuccIter;
}
}

bool AllInstsAreIdentical = true;
bool HasTerminator = I1->isTerminator();
for (auto &SuccIter : OtherSuccIterRange) {
Instruction *I2 = &*SuccIter;
HasTerminator |= I2->isTerminator();
if (AllInstsAreIdentical && !I1->isIdenticalToWhenDefined(I2))
AllInstsAreIdentical = false;
auto OtherInsts = OtherSuccessorsHash[getHash(I1)];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto OtherInsts = OtherSuccessorsHash[getHash(I1)];
const auto &OtherInsts = OtherSuccessorsHash[getHash(I1)];


// Check if there are identical instructions in all other successors
// We might face with same hash values for different instructions.
// If that happens, ignore the instruction.
bool HasIdenticalInst =
OtherSuccessorsHash.find(getHash(I1)) != OtherSuccessorsHash.end() &&
OtherInsts.size() == (SuccIterPairs.size() - 1) &&
all_of(OtherInsts, [&](Instruction *I2) {
Copy link
Contributor

@goldsteinn goldsteinn Mar 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be any_of and you just grab the ones that are indentical?

return I2->isIdenticalToWhenDefined(I1);
});

if (!HasIdenticalInst) {
if (NumSkipped >= HoistCommonSkipLimit)
return Changed;
SkipFlagsBB1 |= skippedInstrFlags(I1);
if (SameLevelHoist) {
for (auto &SuccIterPair : OtherSuccIterPairRange) {
Instruction *I = &*SuccIterPair.first++;
SuccIterPair.second |= skippedInstrFlags(I);
}
}
++NumSkipped;
if (I1->isTerminator())
return Changed;
++BB1ItrPair.first;
continue;
}

SmallVector<Instruction *, 8> OtherInsts;
for (auto &SuccIter : OtherSuccIterRange)
OtherInsts.push_back(&*SuccIter);

// If we are hoisting the terminator instruction, don't move one (making a
// broken BB), instead clone it, and remove BI.
if (HasTerminator) {
if (I1->isTerminator()) {
// Even if BB, which contains only one unreachable instruction, is ignored
// at the beginning of the loop, we can hoist the terminator instruction.
// If any instructions remain in the block, we cannot hoist terminators.
if (NumSkipped || !AllInstsAreIdentical) {
if (NumSkipped) {
hoistLockstepIdenticalDPValues(TI, I1, OtherInsts);
return Changed;
}

return hoistSuccIdenticalTerminatorToSwitchOrIf(TI, I1, OtherInsts) ||
Changed;
SmallVector<Instruction *, 8> Insts;
for (auto *I2 : OtherInsts) {
// BB holding I2 should only contain the branch instruction
auto itr = I2->getParent()->instructionsWithoutDebug();
if (&*itr.begin() != I2)
return Changed;
Insts.push_back(I2);
}
return hoistSuccIdenticalTerminatorToSwitchOrIf(TI, I1, Insts) || Changed;
}

if (AllInstsAreIdentical) {
unsigned SkipFlagsBB1 = BB1ItrPair.second;
AllInstsAreIdentical =
isSafeToHoistInstr(I1, SkipFlagsBB1) &&
all_of(OtherSuccIterPairRange, [=](const auto &Pair) {
Instruction *I2 = &*Pair.first;
unsigned SkipFlagsBB2 = Pair.second;
// Even if the instructions are identical, it may not
// be safe to hoist them if we have skipped over
// instructions with side effects or their operands
// weren't hoisted.
return isSafeToHoistInstr(I2, SkipFlagsBB2) &&
shouldHoistCommonInstructions(I1, I2, TTI);
});
bool SafeToHoist = isSafeToHoistInstr(I1, SkipFlagsBB1);
for (auto [SuccIterPair, I2] : zip(OtherSuccIterPairRange, OtherInsts)) {
// If instructions of all successors are at the same level, use the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think simpler and faster would just be put if(!SafeToHoist) break. then get rid of the SafeToHoist && below.

// skipFlag of its BB, i.e., SameLevelHoist. Otherwise, use the skipFlag
// that was calculated initially for this instruction in the hashmap
if (SameLevelHoist && I2 == (&*(SuccIterPair.first))) {
SafeToHoist = SafeToHoist &&
isSafeToHoistInstr(I2, SuccIterPair.second) &&
shouldHoistCommonInstructions(I1, I2, TTI);
} else {
unsigned skipFlag = InstToSkipFlag[I2];
SafeToHoist = SafeToHoist && isSafeToHoistInstr(I2, skipFlag) &&
shouldHoistCommonInstructions(I1, I2, TTI);
SameLevelHoist = false;
}
}

if (AllInstsAreIdentical) {
if (SafeToHoist) {
BB1ItrPair.first++;
if (SameLevelHoist) {
for (auto &SuccIterPair : OtherSuccIterPairRange)
SuccIterPair.first++;
}
if (isa<DbgInfoIntrinsic>(I1)) {
// The debug location is an integral part of a debug info intrinsic
// and can't be separated from it or replaced. Instead of attempting
Expand All @@ -1725,8 +1827,7 @@ bool SimplifyCFGOpt::hoistCommonCodeFromSuccessors(BasicBlock *BB,
// leave any that were not hoisted behind (by calling moveBefore
// rather than moveBeforePreserving).
I1->moveBefore(TI);
for (auto &SuccIter : OtherSuccIterRange) {
auto *I2 = &*SuccIter++;
for (auto *I2 : OtherInsts) {
assert(isa<DbgInfoIntrinsic>(I2));
I2->moveBefore(TI);
}
Expand All @@ -1739,18 +1840,44 @@ bool SimplifyCFGOpt::hoistCommonCodeFromSuccessors(BasicBlock *BB,
// leave any that were not hoisted behind (by calling moveBefore
// rather than moveBeforePreserving).
I1->moveBefore(TI);
for (auto &SuccIter : OtherSuccIterRange) {
Instruction *I2 = &*SuccIter++;
for (auto *I2 : OtherInsts) {
assert(I2 != I1);
if (!I2->use_empty())
// Update hashcode of all instructions using I2
if (!I2->use_empty()) {
SmallVector<llvm::Instruction *, 8> PrevUsers;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use llvm:: prefix in upstream LLVM.

// Once the uses of I1 are replaced, the hash value computed for
// those users are not valid anymore so we gather users and then
// recompute the hash codes for them. We need to do this only for
// the instructions located in the same block as I2 because we
// initially only hashed those instructions.
for (auto *user : I2->users()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (auto *user : I2->users()) {
for (User *U : I2->users()) {

if (auto *I = dyn_cast<Instruction>(user)) {
if (I->getParent() != I2->getParent())
continue;
PrevUsers.push_back(I);
}
}
I2->replaceAllUsesWith(I1);
for (auto &PrevUser : PrevUsers) {
auto NewHash = getHash(PrevUser);
if (OtherSuccessorsHash.find(NewHash) !=
OtherSuccessorsHash.end()) {
OtherSuccessorsHash[NewHash].push_back(PrevUser);
} else {
SmallVector<Instruction *, 2> InstVec{PrevUser};
OtherSuccessorsHash[NewHash] = InstVec;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be simpler if we replaced all the instructions we need to replace at once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give more details please? I'm just updating hash value of users here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just think that we can hoist a batch of instruction when use hash map. Then we may don't need to update the hash value.

}
Comment on lines +1863 to +1869
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (OtherSuccessorsHash.find(NewHash) !=
OtherSuccessorsHash.end()) {
OtherSuccessorsHash[NewHash].push_back(PrevUser);
} else {
SmallVector<Instruction *, 2> InstVec{PrevUser};
OtherSuccessorsHash[NewHash] = InstVec;
}
OtherSuccessorsHash[NewHash].push_back(PrevUser);

should work, I think? It will default initialize and then push if it doesn't exist yet.

}
}
I1->andIRFlags(I2);
combineMetadataForCSE(I1, I2, true);
// I1 and I2 are being combined into a single instruction. Its debug
// location is the merged locations of the original instructions.
I1->applyMergedLocation(I1->getDebugLoc(), I2->getDebugLoc());
OtherSuccessorsHash.erase(getHash(I1));
I2->eraseFromParent();
}
OtherSuccessorsHash.erase(getHash(I1));
}
if (!Changed)
NumHoistCommonCode += SuccIterPairs.size();
Expand All @@ -1764,9 +1891,13 @@ bool SimplifyCFGOpt::hoistCommonCodeFromSuccessors(BasicBlock *BB,
// We are about to skip over a pair of non-identical instructions. Record
// if any have characteristics that would prevent reordering instructions
// across them.
for (auto &SuccIterPair : SuccIterPairs) {
Instruction *I = &*SuccIterPair.first++;
SuccIterPair.second |= skippedInstrFlags(I);
BB1ItrPair.first++;
SkipFlagsBB1 |= skippedInstrFlags(I1);
if (SameLevelHoist) {
for (auto &SuccIterPair : OtherSuccIterPairRange) { // update flags
Instruction *I = &*SuccIterPair.first++;
SuccIterPair.second |= skippedInstrFlags(I);
}
}
++NumSkipped;
}
Expand All @@ -1788,8 +1919,6 @@ bool SimplifyCFGOpt::hoistSuccIdenticalTerminatorToSwitchOrIf(
auto *BB2 = I2->getParent();
if (BI) {
assert(OtherSuccTIs.size() == 1);
assert(BI->getSuccessor(0) == I1->getParent());
assert(BI->getSuccessor(1) == I2->getParent());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: should drop braces.


// In the case of an if statement, we try to hoist an invoke.
Expand Down Expand Up @@ -3647,7 +3776,6 @@ static bool FoldTwoEntryPHINode(PHINode *PN, const TargetTransformInfo &TTI,
// Change the PHI node into a select instruction.
Value *TrueVal = PN->getIncomingValueForBlock(IfTrue);
Value *FalseVal = PN->getIncomingValueForBlock(IfFalse);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please drop the change.

Value *Sel = Builder.CreateSelect(IfCond, TrueVal, FalseVal, "", DomBI);
PN->replaceAllUsesWith(Sel);
Sel->takeName(PN);
Expand Down
Loading