Skip to content

[Attributor] Fix an issue that could potentially cause AccessList and OffsetBins out of sync #106187

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 29, 2024

Conversation

shiltian
Copy link
Contributor

@shiltian shiltian commented Aug 27, 2024

The implementation of AAPointerInfo::RangeList::set_difference doesn't
consider the case where two ranges have the same offset but different sizes.
This could cause AccessList and OffsetBins out of sync because a range has
been already updated in AccessList but missing in ToRemove.

I do have a reproducer but the reproducer itself is 248kb. llvm-reduce can't
further reduce it. Not sure how I can make a smaller reproducer.

Fixes: SWDEV-479757.

@shiltian shiltian marked this pull request as ready for review August 27, 2024 05:36
Copy link
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @shiltian and the rest of your teammates on Graphite Graphite

@llvmbot
Copy link
Member

llvmbot commented Aug 27, 2024

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-llvm-transforms

Author: Shilei Tian (shiltian)

Changes

The implementation of AAPointerInfo::RangeList::set_difference doesn't
consider the case where two ranges have the same offset but different sizes.
This could causes AccessList and OffsetBins out of sync because a range has
been already updated in AccessList but missing in ToRemove.

I do have a reproducer but the reproducer itself is 248kb. llvm-reduce can't
further reduce it. Not sure how I can make a smaller reproducer.

Fix SWDEV-479757.


Full diff: https://github.com/llvm/llvm-project/pull/106187.diff

1 Files Affected:

  • (modified) llvm/include/llvm/Transforms/IPO/Attributor.h (+4-1)
diff --git a/llvm/include/llvm/Transforms/IPO/Attributor.h b/llvm/include/llvm/Transforms/IPO/Attributor.h
index 718cf704cbdf1a..e4e94ad40acb8d 100644
--- a/llvm/include/llvm/Transforms/IPO/Attributor.h
+++ b/llvm/include/llvm/Transforms/IPO/Attributor.h
@@ -5818,7 +5818,10 @@ struct AAPointerInfo : public AbstractAttribute {
     static void set_difference(const RangeList &L, const RangeList &R,
                                RangeList &D) {
       std::set_difference(L.begin(), L.end(), R.begin(), R.end(),
-                          std::back_inserter(D), RangeTy::OffsetLessThan);
+                          std::back_inserter(D),
+                          [](const RangeTy &L, const RangeTy &R) {
+                            return (L.Offset < R.Offset) || (L != R);
+                          });
     }
 
     unsigned size() const { return Ranges.size(); }

@shiltian shiltian force-pushed the users/shiltian/out-of-sync branch from 3c51442 to cb98c62 Compare August 27, 2024 13:51
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe the claim that llvm-reduce can't further reduce it. Can you post it

@shiltian
Copy link
Contributor Author

I don't believe the claim that llvm-reduce can't further reduce it. Can you post it

I added it in the description.

@arsenm
Copy link
Contributor

arsenm commented Aug 28, 2024

Your reducer script runs all of clang, you should start reducing from the IR extracted before the failing pass and just run the one pass. When I do that, I get this:

; RUN: opt -passes=amdgpu-attributor %s

target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
target triple = "amdgcn-amd-amdhsa"

%struct.ShaderClosure.1472.2484.3440.3732.5208.6220.7176.7468 = type { <3 x float>, i32, float, <3 x float>, [10 x float], [8 x i8] }
%struct.ShaderData.1475.2487.3443.3735.5211.6223.7179.7471 = type { <3 x float>, <3 x float>, <3 x float>, <3 x float>, i32, i32, i32, i32, i32, float, float, i32, i32, float, float, %struct.differential3.1473.2485.3441.3733.5209.6221.7177.7469, %struct.differential3.1473.2485.3441.3733.5209.6221.7177.7469, %struct.differential.1474.2486.3442.3734.5210.6222.7178.7470, %struct.differential.1474.2486.3442.3734.5210.6222.7178.7470, <3 x float>, <3 x float>, <3 x float>, %struct.differential3.1473.2485.3441.3733.5209.6221.7177.7469, i32, i32, i32, float, <3 x float>, <3 x float>, <3 x float>, [64 x %struct.ShaderClosure.1472.2484.3440.3732.5208.6220.7176.7468] }
%struct.differential.1474.2486.3442.3734.5210.6222.7178.7470 = type { float, float }
%struct.differential3.1473.2485.3441.3733.5209.6221.7177.7469 = type { <3 x float>, <3 x float> }

define internal fastcc void @svm_eval_nodes(ptr %kg) {
entry:
  %closure.i25.i = getelementptr i8, ptr %kg, i64 336
  %num_closure.i26.i = getelementptr i8, ptr %kg, i64 276
  br label %while.cond

while.cond:                                       ; preds = %sw.bb92, %cond.true.i.i.i34, %while.cond, %entry
  %0 = load i32, ptr %kg, align 4
  %idxprom.i = zext i32 %0 to i64
  br label %while.cond

cond.true.i.i.i34:                                ; No predecessors!
  %arrayidx.i27.i = getelementptr float, ptr %kg, i64 %idxprom.i
  store float 0.000000e+00, ptr %arrayidx.i27.i, align 4
  br label %while.cond

sw.bb92:                                          ; No predecessors!
  %1 = load i32, ptr %kg, align 8
  %2 = insertelement <3 x i32> zeroinitializer, i32 %1, i64 0
  %splat.splatinsert.i = bitcast <3 x i32> %2 to <3 x float>
  %3 = shufflevector <3 x float> %splat.splatinsert.i, <3 x float> zeroinitializer, <4 x i32> zeroinitializer
  %4 = load i32, ptr %num_closure.i26.i, align 4
  %idxprom.i27.i = sext i32 %4 to i64
  %arrayidx.i28.i = getelementptr [64 x %struct.ShaderClosure.1472.2484.3440.3732.5208.6220.7176.7468], ptr %closure.i25.i, i64 0, i64 %idxprom.i27.i
  store <4 x float> %3, ptr %arrayidx.i28.i, align 16
  %inc.i30.i = or i32 %4, 1
  store i32 %inc.i30.i, ptr %num_closure.i26.i, align 4
  br label %while.cond
}

; Function Attrs: norecurse
define amdgpu_kernel void @kernel_ocl_displace() #0 {
entry:
  %sd.i11111111111 = alloca [0 x [0 x [0 x [0 x [0 x [0 x [0 x [0 x [0 x [0 x %struct.ShaderData.1475.2487.3443.3735.5211.6223.7179.7471]]]]]]]]]], i32 0, align 16, addrspace(5)
  %kglobals.ascast1 = addrspacecast ptr addrspace(5) %sd.i11111111111 to ptr
  %num_closure.i.i = getelementptr i8, ptr addrspace(5) %sd.i11111111111, i32 276
  store <2 x i32> zeroinitializer, ptr addrspace(5) %num_closure.i.i, align 4
  call fastcc void @svm_eval_nodes(ptr %kglobals.ascast1)
  ret void
}

attributes #0 = { norecurse }

@shiltian
Copy link
Contributor Author

@arsenm Aha! That is good to know. TIL. :-)

@shiltian shiltian force-pushed the users/shiltian/out-of-sync branch from cb98c62 to 1a23192 Compare August 28, 2024 17:57
@shiltian shiltian force-pushed the users/shiltian/out-of-sync branch from 1a23192 to ce4a9dd Compare August 28, 2024 18:08
@shiltian shiltian force-pushed the users/shiltian/out-of-sync branch from ce4a9dd to 5090341 Compare August 28, 2024 18:30
@shiltian shiltian force-pushed the users/shiltian/out-of-sync branch from 5090341 to 3d01afb Compare August 29, 2024 05:00
…nd `OffsetBins` out of sync

The implementation of `AAPointerInfo::RangeList::set_difference` doesn't
consider the case where two ranges have the same offset but different sizes.
This could causes `AccessList` and `OffsetBins` out of sync because a range has
been already updated in `AccessList` but missing in `ToRemove`.

I do have a reproducer but the reproducer itself is 248kb. `llvm-reduce` can't
further reduce it. Not sure how I can make a smaller reproducer.

Fix SWDEV-479757.
@shiltian shiltian force-pushed the users/shiltian/out-of-sync branch from 3d01afb to 6793303 Compare August 29, 2024 05:02
@shiltian shiltian merged commit 572d2fd into main Aug 29, 2024
4 of 6 checks passed
@shiltian shiltian deleted the users/shiltian/out-of-sync branch August 29, 2024 05:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants