Skip to content

[Polly] Data flow reduction detection to cover more cases #84901

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jul 30, 2024

Conversation

kartcq
Copy link
Contributor

@kartcq kartcq commented Mar 12, 2024

The base concept is same as existing reduction algorithm where we get the list of candidate pairs <store,load>. But the existing algorithm works only if there is single binary operation between the load and store.
Example sum += a[i];

This algorithm extends to work with more than single binary operation as well. It is implemented using data flow reduction detection on basic block level. We propagate the loads, the number of times the load is used(flows into instruction) and binary operation performed until we reach a store.

Example sum += a[i] + b[i];

sum(Ld)     a[i](Ld)
      \  +  /
        tmp    b[i](Ld)
           \ + /
            sum(St)

In the above case the candidate pairs are formed by associating sum with all of its load inputs which are sum, a[i] and b[i]. Then check functions are used to filter a valid reduction pair ie {sum,sum}.

The base concept is same as existing reduction algorithm where we get
the list of candidate pairs <store,load>. But the existing algorithm
works only if there is single binary operation between the load and
store.
Example sum += a[i];

This algorithm extends to work with more than single binary operation
as well. It is implemented using data flow reduction detection on
basic block level. We propagate the loads, the number of times the
load is used(flows into instruction) and binary operation performed
until we reach a store.

Example sum += a[i] + b[i];
sum(Ld) a[i](Ld)
   \ + /
    tmp  b[i](Ld)
       \+/
       sum(St)

In the above case the candidate pairs are formed by associating sum
with all of its load inputs which are sum, a[i] and b[i]. Then check
functions are used to filter a valid reduction pair ie {sum,sum}.
@kartcq
Copy link
Contributor Author

kartcq commented Apr 19, 2024

ping @efriedma-quic @Meinersbur

Copy link
Member

@Meinersbur Meinersbur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch. I really appreciate the good code comments.

Could you add a test that checks two independent reductions in the same loop, e.g.:

for (int i = 0; i < n; ++i) {
  *sum += A[i];
  *prod += B[i];
}

ScopBuilder::buildEqivClassBlockStmts should create two ScopStmts for the same BasicBlock here.

kartcq added 3 commits May 8, 2024 03:07
The base concept is same as existing reduction algorithm where we get
the list of candidate pairs <store,load>. But the existing algorithm
works only if there is single binary operation between the load and
store.
Example sum += a[i];

This algorithm extends to work with more than single binary operation
as well. It is implemented using data flow reduction detection on
basic block level. We propagate the loads, the number of times the
load is used(flows into instruction) and binary operation performed
until we reach a store.

Example sum += a[i] + b[i];
sum(Ld) a[i](Ld)
   \ + /
    tmp  b[i](Ld)
       \+/
       sum(St)

In the above case the candidate pairs are formed by associating sum
with all of its load inputs which are sum, a[i] and b[i]. Then check
functions are used to filter a valid reduction pair ie {sum,sum}.
@kartcq kartcq requested a review from Meinersbur May 21, 2024 06:09
@kartcq
Copy link
Contributor Author

kartcq commented May 28, 2024

Ping @Meinersbur @efriedma-quic
Note : Couple of changes are added after requesting re-review.
For latest changes build is clean.

@kartcq
Copy link
Contributor Author

kartcq commented Jun 5, 2024

Ping @Meinersbur

1 similar comment
@kartcq
Copy link
Contributor Author

kartcq commented Jun 18, 2024

Ping @Meinersbur

@kartcq
Copy link
Contributor Author

kartcq commented Jul 2, 2024

Ping @efriedma-quic @Meinersbur

@@ -0,0 +1,39 @@
; RUN: opt %loadPolly -basic-aa -polly-print-dependences -polly-allow-nonaffine -disable-output < %s | FileCheck %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-passes=print<polly-dependences> (see #92918). I think legacy-pm -polly-print-scops wasn't ported yet, so it should be okay to continue using it for now.

Otherwise, patch looks fine.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't forget to update this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out Eli. Updated.

Copy link
Member

@Meinersbur Meinersbur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Please excuse the delay.

@@ -2557,7 +2568,6 @@ bool checkCandidatePairAccesses(MemoryAccess *LoadMA, MemoryAccess *StoreMA,
.intersect_domain(isl::manage(Domain.copy()));
isl::set RS = R.range();
isl::set WS = W.range();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try to avoid unrelated changes.

Copy link

github-actions bot commented Jul 26, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@kartcq
Copy link
Contributor Author

kartcq commented Jul 29, 2024

Thanks @Meinersbur and @efriedma-quic for your time to review and approvals.
I have addressed all the final comments as well.
With the builders passing this patch is good to go for merging.

@efriedma-quic efriedma-quic merged commit 1e5334b into llvm:main Jul 30, 2024
7 checks passed
@kartcq kartcq deleted the dataflow_red branch April 16, 2025 07:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants