Skip to content

[LAA] Always use DepCands when grouping runtime checks. #91196

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions llvm/include/llvm/ADT/EquivalenceClasses.h
Original file line number Diff line number Diff line change
Expand Up @@ -233,6 +233,18 @@ class EquivalenceClasses {
return findLeader(TheMapping.find(V));
}

/// Erase the class containing \p V.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment could be clearer what "erasing" a class means. Reading the code, it removes all its members from the set.

void eraseClass(const ElemTy &V) {
if (TheMapping.find(V) == TheMapping.end())
return;
auto LeaderI = findValue(getLeaderValue(V));
Comment on lines +238 to +240
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (TheMapping.find(V) == TheMapping.end())
return;
auto LeaderI = findValue(getLeaderValue(V));
if (TheMapping.find(V) == TheMapping.end())
return;
iterator LeaderI = findValue(getLeaderValue(V));

for (auto MI = member_begin(LeaderI), ME = member_end(); MI != ME;) {
const auto &ToErase = *MI;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const auto &ToErase = *MI;
const ElemTy &ToErase = *MI;

++MI;
TheMapping.erase(ToErase);
}
}

/// union - Merge the two equivalence sets for the specified values, inserting
/// them if they do not already exist in the equivalence set.
member_iterator unionSets(const ElemTy &V1, const ElemTy &V2) {
Expand Down
11 changes: 5 additions & 6 deletions llvm/include/llvm/Analysis/LoopAccessAnalysis.h
Original file line number Diff line number Diff line change
Expand Up @@ -504,8 +504,8 @@ class RuntimePointerChecking {

/// Generate the checks and store it. This also performs the grouping
/// of pointers to reduce the number of memchecks necessary.
void generateChecks(MemoryDepChecker::DepCandidates &DepCands,
bool UseDependencies);
void generateChecks(const MemoryDepChecker &DepChecker,
MemoryDepChecker::DepCandidates &DepCands);

/// Returns the checks that generateChecks created. They can be used to ensure
/// no read/write accesses overlap across all loop iterations.
Expand Down Expand Up @@ -572,10 +572,9 @@ class RuntimePointerChecking {
private:
/// Groups pointers such that a single memcheck is required
/// between two different groups. This will clear the CheckingGroups vector
/// and re-compute it. We will only group dependecies if \p UseDependencies
/// is true, otherwise we will create a separate group for each pointer.
void groupChecks(MemoryDepChecker::DepCandidates &DepCands,
bool UseDependencies);
/// and re-compute it.
void groupChecks(const MemoryDepChecker &DepChecker,
MemoryDepChecker::DepCandidates &DepCands);

/// Generate the checks and return them.
SmallVector<RuntimePointerCheck, 4> generateChecks();
Expand Down
96 changes: 55 additions & 41 deletions llvm/lib/Analysis/LoopAccessAnalysis.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -392,9 +392,10 @@ SmallVector<RuntimePointerCheck, 4> RuntimePointerChecking::generateChecks() {
}

void RuntimePointerChecking::generateChecks(
MemoryDepChecker::DepCandidates &DepCands, bool UseDependencies) {
const MemoryDepChecker &DepChecker,
MemoryDepChecker::DepCandidates &DepCands) {
assert(Checks.empty() && "Checks is not empty");
groupChecks(DepCands, UseDependencies);
groupChecks(DepChecker, DepCands);
Checks = generateChecks();
}

Expand Down Expand Up @@ -459,7 +460,8 @@ bool RuntimeCheckingPtrGroup::addPointer(unsigned Index, const SCEV *Start,
}

void RuntimePointerChecking::groupChecks(
MemoryDepChecker::DepCandidates &DepCands, bool UseDependencies) {
const MemoryDepChecker &DepChecker,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking if this pull request is still active. Also, it looks like we pass DepChecker here, but we do not use it. Is it needed for a future change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep will try and update it in the next few days, thanks

MemoryDepChecker::DepCandidates &DepCands) {
// We build the groups from dependency candidates equivalence classes
// because:
// - We know that pointers in the same equivalence class share
Expand Down Expand Up @@ -495,21 +497,11 @@ void RuntimePointerChecking::groupChecks(
// us to perform an accurate check in this case.
//
// The above case requires that we have an UnknownDependence between
// accesses to the same underlying object. This cannot happen unless
// FoundNonConstantDistanceDependence is set, and therefore UseDependencies
// is also false. In this case we will use the fallback path and create
// separate checking groups for all pointers.

// If we don't have the dependency partitions, construct a new
// checking pointer group for each pointer. This is also required
// for correctness, because in this case we can have checking between
// pointers to the same underlying object.
if (!UseDependencies) {
for (unsigned I = 0; I < Pointers.size(); ++I)
CheckingGroups.push_back(RuntimeCheckingPtrGroup(I, *this));
return;
}

// accesses to the same underlying object. It is the caller's responsibility
// to clear any entries for accesses with unknown dependencies from the
// dependency partition (DepCands). This is required for correctness, because
// in this case we can have checking between pointers to the same underlying
// object.
unsigned TotalComparisons = 0;

DenseMap<Value *, SmallVector<unsigned>> PositionMap;
Expand All @@ -534,6 +526,13 @@ void RuntimePointerChecking::groupChecks(
MemoryDepChecker::MemAccessInfo Access(Pointers[I].PointerValue,
Pointers[I].IsWritePtr);

// If there is no entry in the dependency partition, there are no potential
// accesses to merge; simply add a new pointer checking group.
if (DepCands.findValue(Access) == DepCands.end()) {
CheckingGroups.push_back(RuntimeCheckingPtrGroup(I, *this));
continue;
}

SmallVector<RuntimeCheckingPtrGroup, 2> Groups;
auto LeaderI = DepCands.findValue(DepCands.getLeaderValue(Access));

Expand Down Expand Up @@ -702,8 +701,10 @@ class AccessAnalysis {
///
/// Returns true if we need no check or if we do and we can generate them
/// (i.e. the pointers have computable bounds).
bool canCheckPtrAtRT(RuntimePointerChecking &RtCheck, ScalarEvolution *SE,
Loop *TheLoop, const DenseMap<Value *, const SCEV *> &Strides,
bool canCheckPtrAtRT(const MemoryDepChecker &DepChecker,
RuntimePointerChecking &RtCheck, ScalarEvolution *SE,
Loop *TheLoop,
const DenseMap<Value *, const SCEV *> &Strides,
Comment on lines +704 to +707
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document the (additional) parameters?

RtCheck seems to be the [out] parameter, should it stay the first?

Value *&UncomputablePtr, bool ShouldCheckWrap = false);

/// Goes over all memory accesses, checks whether a RT check is needed
Expand All @@ -719,12 +720,6 @@ class AccessAnalysis {
/// dependency checking (i.e. FoundNonConstantDistanceDependence).
bool isDependencyCheckNeeded() { return !CheckDeps.empty(); }

/// We decided that no dependence analysis would be used. Reset the state.
void resetDepChecks(MemoryDepChecker &DepChecker) {
CheckDeps.clear();
DepChecker.clearDependences();
}

MemAccessInfoList &getDependenciesToCheck() { return CheckDeps; }

const DenseMap<Value *, SmallVector<const Value *, 16>> &
Expand Down Expand Up @@ -1109,7 +1104,8 @@ bool AccessAnalysis::createCheckForAccess(RuntimePointerChecking &RtCheck,
// The id of the dependence set.
unsigned DepId;

if (isDependencyCheckNeeded()) {
if (isDependencyCheckNeeded() &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style only: I think you're basically adding a variant of isDependencyChecksNeeded which is specific for a particular access, and having a well named cover function would clarify the code.

DepCands.findValue(Access) != DepCands.end()) {
Value *Leader = DepCands.getLeaderValue(Access).getPointer();
unsigned &LeaderId = DepSetId[Leader];
if (!LeaderId)
Expand All @@ -1128,18 +1124,38 @@ bool AccessAnalysis::createCheckForAccess(RuntimePointerChecking &RtCheck,
return true;
}

bool AccessAnalysis::canCheckPtrAtRT(RuntimePointerChecking &RtCheck,
ScalarEvolution *SE, Loop *TheLoop,
const DenseMap<Value *, const SCEV *> &StridesMap,
Value *&UncomputablePtr, bool ShouldCheckWrap) {
bool AccessAnalysis::canCheckPtrAtRT(
const MemoryDepChecker &DepChecker, RuntimePointerChecking &RtCheck,
ScalarEvolution *SE, Loop *TheLoop,
const DenseMap<Value *, const SCEV *> &StridesMap, Value *&UncomputablePtr,
bool ShouldCheckWrap) {
// Find pointers with computable bounds. We are going to use this information
// to place a runtime bound check.
bool CanDoRT = true;

bool MayNeedRTCheck = false;
if (!IsRTCheckAnalysisNeeded) return true;

bool IsDepCheckNeeded = isDependencyCheckNeeded();
if (auto *Deps = DepChecker.getDependences()) {
// If there are unknown dependences, this means runtime checks are needed to
// ensure there's no overlap between accesses to the same underlying object.
// Remove the equivalence classes containing both source and destination
// accesses from DepCands. This ensures runtime checks will be generated
// between those accesses and prevents them from being grouped together.
for (const auto &Dep : *Deps) {
if (Dep.Type != MemoryDepChecker::Dependence::Unknown) {
assert(MemoryDepChecker::Dependence::isSafeForVectorization(Dep.Type) ==
MemoryDepChecker::VectorizationSafetyStatus::Safe &&
"Should only skip safe dependences");
continue;
}
Instruction *Src = Dep.getSource(DepChecker);
Instruction *Dst = Dep.getDestination(DepChecker);
DepCands.eraseClass({getPointerOperand(Src), Src->mayWriteToMemory()});
DepCands.eraseClass({getPointerOperand(Dst), Dst->mayWriteToMemory()});
}
} else
CheckDeps.clear();
Comment on lines +1154 to +1158
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[serious] Can we do without modifying state within a query function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look!

At the moment, canCheckPtrAtRT both checks if runtime checks can be generated and then also generates them, populating various members. While modifying DepCands here should be fine, I'll see if it is possible to restructure the code to make the check independent of actually generating runtime checks (which is for what the DepCands modifications are needed).


// We assign a consecutive id to access from different alias sets.
// Accesses between different groups doesn't need to be checked.
Expand Down Expand Up @@ -1265,7 +1281,7 @@ bool AccessAnalysis::canCheckPtrAtRT(RuntimePointerChecking &RtCheck,
}

if (MayNeedRTCheck && CanDoRT)
RtCheck.generateChecks(DepCands, IsDepCheckNeeded);
RtCheck.generateChecks(DepChecker, DepCands);

LLVM_DEBUG(dbgs() << "LAA: We need to do " << RtCheck.getNumberOfChecks()
<< " pointer comparisons.\n");
Expand Down Expand Up @@ -2632,9 +2648,9 @@ bool LoopAccessInfo::analyzeLoop(AAResults *AA, LoopInfo *LI,
// Find pointers with computable bounds. We are going to use this information
// to place a runtime bound check.
Value *UncomputablePtr = nullptr;
bool CanDoRTIfNeeded =
Accesses.canCheckPtrAtRT(*PtrRtChecking, PSE->getSE(), TheLoop,
SymbolicStrides, UncomputablePtr, false);
bool CanDoRTIfNeeded = Accesses.canCheckPtrAtRT(
getDepChecker(), *PtrRtChecking, PSE->getSE(), TheLoop, SymbolicStrides,
UncomputablePtr, false);
if (!CanDoRTIfNeeded) {
auto *I = dyn_cast_or_null<Instruction>(UncomputablePtr);
recordAnalysis("CantIdentifyArrayBounds", I)
Expand All @@ -2657,16 +2673,14 @@ bool LoopAccessInfo::analyzeLoop(AAResults *AA, LoopInfo *LI,
if (!DepsAreSafe && DepChecker->shouldRetryWithRuntimeCheck()) {
LLVM_DEBUG(dbgs() << "LAA: Retrying with memory checks\n");

// Clear the dependency checks. We assume they are not needed.
Accesses.resetDepChecks(*DepChecker);

PtrRtChecking->reset();
PtrRtChecking->Need = true;

auto *SE = PSE->getSE();
UncomputablePtr = nullptr;
CanDoRTIfNeeded = Accesses.canCheckPtrAtRT(
*PtrRtChecking, SE, TheLoop, SymbolicStrides, UncomputablePtr, true);
CanDoRTIfNeeded =
Accesses.canCheckPtrAtRT(getDepChecker(), *PtrRtChecking, SE, TheLoop,
SymbolicStrides, UncomputablePtr, true);

// Check that we found the bounds for the pointer.
if (!CanDoRTIfNeeded) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,10 @@ define void @test_indirect_read_loop_also_modifies_pointer_array(ptr noundef %ar
; CHECK-NEXT: loop.2:
; CHECK-NEXT: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %l.1 = load ptr, ptr %gep.iv.1, align 8, !tbaa !0 ->
; CHECK-NEXT: store i64 %l.2, ptr %gep.iv.2, align 8, !tbaa !0
; CHECK-EMPTY:
; CHECK-NEXT: Run-time memory checks:
; CHECK-NEXT: Check 0:
; CHECK-NEXT: Comparing group ([[GRP1:0x[0-9a-f]+]]):
Expand Down Expand Up @@ -158,6 +162,10 @@ define void @test_indirect_write_loop_also_modifies_pointer_array(ptr noundef %a
; CHECK-NEXT: loop.2:
; CHECK-NEXT: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %l.1 = load ptr, ptr %gep.iv.1, align 8, !tbaa !0 ->
; CHECK-NEXT: store ptr %l.1, ptr %gep.iv.2, align 8, !tbaa !0
; CHECK-EMPTY:
; CHECK-NEXT: Run-time memory checks:
; CHECK-NEXT: Check 0:
; CHECK-NEXT: Comparing group ([[GRP3:0x[0-9a-f]+]]):
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,18 @@
; CHECK: .inner:
; CHECK-NEXT: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %4 = load i32, ptr %1, align 4 ->
; CHECK-NEXT: store i32 %8, ptr %6, align 4
; CHECK-EMPTY:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %3 = load i32, ptr %2, align 4 ->
; CHECK-NEXT: store i32 %8, ptr %6, align 4
; CHECK-EMPTY:
; CHECK-NEXT: Forward:
; CHECK-NEXT: %7 = load i32, ptr %6, align 4 ->
; CHECK-NEXT: store i32 %8, ptr %6, align 4
; CHECK-EMPTY:
; CHECK-NEXT: Run-time memory checks:
; CHECK: Check 0:
; CHECK: Check 1:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,10 @@ define void @backward_min_distance_8(ptr %A, i64 %N) {
; COMMON-NEXT: loop:
; COMMON-NEXT: Memory dependences are safe with run-time checks
; COMMON-NEXT: Dependences:
; COMMON-NEXT: Unknown:
; COMMON-NEXT: %l = load i8, ptr %gep, align 4 ->
; COMMON-NEXT: store i8 %add, ptr %gep.off.iv, align 4
; COMMON-EMPTY:
; COMMON-NEXT: Run-time memory checks:
; COMMON-NEXT: Check 0:
; COMMON-NEXT: Comparing group ([[GRP1:0x[0-9a-f]+]]):
Expand Down Expand Up @@ -76,6 +80,10 @@ define void @backward_min_distance_120(ptr %A, i64 %N) {
; COMMON-NEXT: loop:
; COMMON-NEXT: Memory dependences are safe with run-time checks
; COMMON-NEXT: Dependences:
; COMMON-NEXT: Unknown:
; COMMON-NEXT: %l = load i8, ptr %gep, align 4 ->
; COMMON-NEXT: store i8 %add, ptr %gep.off.iv, align 4
; COMMON-EMPTY:
; COMMON-NEXT: Run-time memory checks:
; COMMON-NEXT: Check 0:
; COMMON-NEXT: Comparing group ([[GRP3:0x[0-9a-f]+]]):
Expand Down Expand Up @@ -138,6 +146,10 @@ define void @backward_min_distance_128(ptr %A, i64 %N) {
; COMMON-NEXT: loop:
; COMMON-NEXT: Memory dependences are safe with run-time checks
; COMMON-NEXT: Dependences:
; COMMON-NEXT: Unknown:
; COMMON-NEXT: %l = load i8, ptr %gep, align 4 ->
; COMMON-NEXT: store i8 %add, ptr %gep.off.iv, align 4
; COMMON-EMPTY:
; COMMON-NEXT: Run-time memory checks:
; COMMON-NEXT: Check 0:
; COMMON-NEXT: Comparing group ([[GRP13:0x[0-9a-f]+]]):
Expand Down Expand Up @@ -200,6 +212,10 @@ define void @backward_min_distance_256(ptr %A, i64 %N) {
; MAXLEN-NEXT: loop:
; MAXLEN-NEXT: Memory dependences are safe with run-time checks
; MAXLEN-NEXT: Dependences:
; MAXLEN-NEXT: Unknown:
; MAXLEN-NEXT: %l = load i8, ptr %gep, align 4 ->
; MAXLEN-NEXT: store i8 %add, ptr %gep.off.iv, align 4
; MAXLEN-EMPTY:
; MAXLEN-NEXT: Run-time memory checks:
; MAXLEN-NEXT: Check 0:
; MAXLEN-NEXT: Comparing group ([[GRP17:0x[0-9a-f]+]]):
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,10 @@ define void @offset_i32_known_positive_via_assume_forward_dep_1(ptr %A, i64 %off
; CHECK-NEXT: loop:
; CHECK-NEXT: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %l = load i8, ptr %gep.off, align 4 ->
; CHECK-NEXT: store i8 %add, ptr %gep, align 4
; CHECK-EMPTY:
; CHECK-NEXT: Run-time memory checks:
; CHECK-NEXT: Check 0:
; CHECK-NEXT: Comparing group ([[GRP1:0x[0-9a-f]+]]):
Expand Down Expand Up @@ -144,6 +148,10 @@ define void @offset_may_be_negative_via_assume_unknown_dep(ptr %A, i64 %offset,
; CHECK-NEXT: loop:
; CHECK-NEXT: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %l = load i32, ptr %gep, align 4 ->
; CHECK-NEXT: store i32 %add, ptr %gep.mul.2, align 4
; CHECK-EMPTY:
; CHECK-NEXT: Run-time memory checks:
; CHECK-NEXT: Check 0:
; CHECK-NEXT: Comparing group ([[GRP3:0x[0-9a-f]+]]):
Expand Down Expand Up @@ -191,6 +199,10 @@ define void @offset_no_assumes(ptr %A, i64 %offset, i64 %N) {
; CHECK-NEXT: loop:
; CHECK-NEXT: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %l = load i32, ptr %gep.off, align 4 ->
; CHECK-NEXT: store i32 %add, ptr %gep, align 4
; CHECK-EMPTY:
; CHECK-NEXT: Run-time memory checks:
; CHECK-NEXT: Check 0:
; CHECK-NEXT: Comparing group ([[GRP5:0x[0-9a-f]+]]):
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,10 @@ define void @test_distance_positive_independent_via_trip_count(ptr %A) {
; CHECK-NEXT: loop:
; CHECK-NEXT: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %l = load i8, ptr %gep.A, align 1 ->
; CHECK-NEXT: store i32 %ext, ptr %gep.A.400, align 4
; CHECK-EMPTY:
; CHECK-NEXT: Run-time memory checks:
; CHECK-NEXT: Check 0:
; CHECK-NEXT: Comparing group ([[GRP1:0x[0-9a-f]+]]):
Expand Down Expand Up @@ -55,6 +59,10 @@ define void @test_distance_positive_backwards(ptr %A) {
; CHECK-NEXT: loop:
; CHECK-NEXT: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %l = load i8, ptr %gep.A, align 1 ->
; CHECK-NEXT: store i32 %ext, ptr %gep.A.400, align 4
; CHECK-EMPTY:
; CHECK-NEXT: Run-time memory checks:
; CHECK-NEXT: Check 0:
; CHECK-NEXT: Comparing group ([[GRP3:0x[0-9a-f]+]]):
Expand Down Expand Up @@ -98,6 +106,10 @@ define void @test_distance_positive_via_assume(ptr %A, i64 %off) {
; CHECK-NEXT: loop:
; CHECK-NEXT: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %l = load i8, ptr %gep.A, align 1 ->
; CHECK-NEXT: store i32 %ext, ptr %gep.A.400, align 4
; CHECK-EMPTY:
; CHECK-NEXT: Run-time memory checks:
; CHECK-NEXT: Check 0:
; CHECK-NEXT: Comparing group ([[GRP5:0x[0-9a-f]+]]):
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,10 @@ target triple = "x86_64-apple-macosx10.10.0"

; CHECK: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Unknown:
; CHECK-NEXT: %loadA = load i16, ptr %arrayidxA, align 2 ->
; CHECK-NEXT: store i16 %mul1, ptr %arrayidxA2, align 2
; CHECK-EMPTY:
; CHECK-NEXT: Run-time memory checks:
; CHECK-NEXT: 0:
; CHECK-NEXT: Comparing group
Expand Down
Loading
Loading