-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[DAGCombiner] Add some very basic folds for ADDRSPACECAST #127733
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 | ||
; RUN: llc < %s -mcpu=sm_20 -O0 | FileCheck %s | ||
; RUN: %if ptxas %{ llc < %s -mcpu=sm_20 -O0 | %ptxas-verify %} | ||
|
||
target triple = "nvptx64-unknown-unknown" | ||
|
||
define ptr @test1(ptr %p) { | ||
; CHECK-LABEL: test1( | ||
; CHECK: { | ||
; CHECK-NEXT: .reg .b64 %rd<2>; | ||
; CHECK-EMPTY: | ||
; CHECK-NEXT: // %bb.0: | ||
; CHECK-NEXT: ld.param.u64 %rd1, [test1_param_0]; | ||
; CHECK-NEXT: st.param.b64 [func_retval0], %rd1; | ||
; CHECK-NEXT: ret; | ||
%a = addrspacecast ptr %p to ptr addrspace(5) | ||
%b = addrspacecast ptr addrspace(5) %a to ptr | ||
ret ptr %b | ||
} | ||
|
||
define ptr addrspace(1) @test2(ptr addrspace(5) %p) { | ||
; CHECK-LABEL: test2( | ||
; CHECK: { | ||
; CHECK-NEXT: .reg .b64 %rd<2>; | ||
; CHECK-EMPTY: | ||
; CHECK-NEXT: // %bb.0: | ||
; CHECK-NEXT: ld.param.u64 %rd1, [test2_param_0]; | ||
; CHECK-NEXT: ret; | ||
%a = addrspacecast ptr addrspace(5) %p to ptr | ||
%b = addrspacecast ptr %a to ptr addrspace(1) | ||
ret ptr addrspace(1) %b | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -332,9 +332,7 @@ define signext i32 @setlength() { | |
; CHECK: lgr [[MALLOC:[0-9]+]],3 | ||
; CHECK: basr 7,6 | ||
; CHECK: lgr [[LENGTH:[0-9]+]],3 | ||
; CHECK: la [[ADDR:[0-9]+]],4([[MALLOC]]) | ||
; CHECK: llgtr [[ADDR]],[[ADDR]] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Removing the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Prior to my change the DAG looked like this:
Which was lowered to this:
And than simplified to this:
With my change, the addrspacecasts are eliminated and as a result the 'and' is no longer present, this seems like an improvement and it should also be correct, as the LangRef states:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Ok, that is a problem. Any suggestion how to get around this issue? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that the X86 backend has a similar issue. They also use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Doesn't this imply that the pointers must have the same size? |
||
; CHECK: stg [[LENGTH]],0([[ADDR]]) | ||
; CHECK: stg [[LENGTH]],4([[MALLOC]]) | ||
entry: | ||
%call = tail call ptr @__malloc31(i64 noundef 8) | ||
%call1 = tail call signext i32 @foo() | ||
|
@@ -357,7 +355,7 @@ define signext i32 @setlength2() { | |
; CHECK: basr 7,6 | ||
; CHECK: lgr [[LENGTH:[0-9]+]],3 | ||
; CHECK: ahi [[MALLOC]],4 | ||
; CHECK: llgtr [[ADDR]],[[MALLOC]] | ||
; CHECK: llgtr [[ADDR:[0-9]+]],[[MALLOC]] | ||
; CHECK: stg [[LENGTH]],0([[ADDR]]) | ||
entry: | ||
%call = tail call ptr addrspace(1) @domalloc(i64 noundef 8) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, is AS cast always transitive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's treated as such by instcombine (https://godbolt.org/z/dhfTd18xs) so doing the same thing here seems safe.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nikic ^^ Would you happen to know if that's a safe assumption.
I'm concerned that it would depend on the specifics of the AS arrangement known only to the target. It may be too optimistic to assume that ASC between all address spaces on all targets are transitive.
A thought experiment that could conceivably be applicable to existing hardware (even X86).
It's not uncommon for RAM to have multiple mappings in the address space, with specific properties (e.g. cacheability).
Let's say we have cached/write-through (C), uncached (U) and write-back(W) cacheable windows.
C/U/W addresses have 1:1 mapping to each other.
We also have a virtual AS (V), which contains all the windows for C/U/W.
ASC(C->U->V)
andASC(C->W->V)
will be distinctly different fromASC(C->V)
, as each will be pointing at a different window in V.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For integral address spaces, if the pointer is dereferenceable, the cast is assumed reversible (though we should have an instruction flag to indicate when we know the object is valid)
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my example above, each individual cast is reversible, but that does not guarantee transitivity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nikic Could you please help clarify the semantics here? In practice, the fact that we're treating ASC as transitive in IC makes me think no target has defined address spaces as Artem is describing, meaning it should be safe here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My example is hypothetical. I don't think we have actual uses that match that pattern.
The point is that we can't imply AS transitivity implicitly. If LLVM relies on it, it should be documented explicitly, IMO.
Right now I can't tell whether the assumption of AS transitivity by existing code in LLVM is intentional or not. If it's by design, then your patch is fine. If not, then we may need to figure out how to deal with that. Probably we'll need to make transitivity target-dependent. I believe we did discuss describing additional target's AS properties in the past.
Right now LLVM does not say much about AS assumptions. The closest I can find is this; https://llvm.org/docs/LangRef.html#pointer-type
This suggests that whether ASC is transitive is target specific.
This phrase can be interpreted that we're allowed to assume transitivity, in principle, as data in my example is accessible, even if the ASC results in a different pointer value.
However, even in that case it requires the ASCs to be "proven accessible", and that in general requires target-specific knowledge. The way I read it, collapsing
local->shared->generic
into justlocal->generic
is not allowed, as the source pointer in local AS pointer can't be accessed via shared AS. We could collapse such an invalid transition into a poison value, though.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at a different aspect: Shouldn't there a check if the address spaces are integral? Because this certainly makes the first fold invalid. And reading https://llvm.org/docs/LangRef.html#nointptrtype, it seems to me that this also means that the transitive fold is invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the numeric pointer value, you may end up with a different bitpattern. At the ultimate access, through either path, there needs to be a single dereferenceable object at a single address in V. It's possible we need the usage context implies UB on access to assume transitivity