-
Notifications
You must be signed in to change notification settings - Fork 787
[OpenCL] Disable vector to scalar types coercion for OpenCL #8160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[OpenCL] Disable vector to scalar types coercion for OpenCL #8160
Conversation
For x86 target, vector types (both result and arguments) can be coerced to scalars of the same size, e.g: define zeroext i1 @_Z18convert_ulong4_rteDv4_t(<4 x i16> %x) ; becomes define zeroext i1 @_Z18convert_ulong4_rteDv4_t(i64 %x.coerced) Such behavior is completely valid for x86, but the backend vectorizer cannot work with scalars instead of vectors. With this patch, argument and result types will be leaved unchanged in the CodeGen. New option fopencl-force-vector-abi is also added to force-disables vector to scalar coercion when provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't seem to be dependent on SYCL, why don't we commit it directly to LLORG instead?
We will land open source OpenCL CPU RT code to SYCLOS. OpenCL CPU RT has dependency on this PR. As the original code is used internally for OpenCL CPU RT, so we need this PR to be landed in SYCLOS even if https://reviews.llvm.org/D142948 is rejected by LLORG. It's urgent for us. |
@@ -4427,6 +4469,10 @@ ABIArgInfo WinX86_64ABIInfo::classify(QualType Ty, unsigned &FreeSSERegs, | |||
} | |||
|
|||
void WinX86_64ABIInfo::computeInfo(CGFunctionInfo &FI) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you extend the test below for this target as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
triple x86_64-unknown-unknown in the test is already covering WinX86_64ABIInfo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that because the test is being run on Windows? Otherwise, I am not sure how it defaults to Windows and not say, for example, Linux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that because the test is being run on Windows?
Yes. On linux, the test runs in X86_64ABIInfo::computeInfo path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, please add a Windows-specific triple to the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
triple x86_64-unknown-unknown in the test is already covering WinX86_64ABIInfo
sorry this answer is incorrect. You're right that windows x86_64 target isn't covered in the first commit. I've added WinX86_64ABIInfo test in the new commit.
I also changed i686-unknown-unknown triple to i686-pc-win32-gnu, in order to match with the use scenario of this PR in OpenCL builtin build. DEVICE_TRIPLE is i686-pc-win32-gnu-elf and x86_64-pc-win32-gnu-elf for windows build in file backend/libraries/CMakeLists.txt
|
||
// NOCOER: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(<4 x i16> noundef %{{.*}}) | ||
// COER32CL: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(i64 noundef %{{.*}}) | ||
// COER64CL: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(double noundef %{{.*}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 'double' the right type here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not related to this PR.
If we add ElementType->isSpecificBuiltinType(BuiltinType::UShort) to
llvm/clang/lib/CodeGen/TargetInfo.cpp
Line 2974 in aa69e4d
ElementType->isSpecificBuiltinType(BuiltinType::ULong))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should, otherwise I am concerned the representation will be incorrect. Is something like isIntegerType or isIntegralOrEnumerationType() more appropriate there instead of checking each integral type kind there?
@erichkeane, could you please give your input here? I am really not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @erichkeane . thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This 'double' type issue isn't related to this PR. @premanandrao could you please approve this PR if there is no other issues? This PR blocks the pre-commit build of #8216
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just got to the office after being away last week for the WG21 meeting. We definitely need to not make the element type here double, I can't imagine the optimizer is going to do good things with that/those conversions. I don't follow this well enough to know whether this cna be approved yet, so I'll leave that to Prem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @erichkeane!
@hewj03, I think TargetInfo.cpp should be fixed to not have this converted to a double. Please create an issue to fix that in a separate PR if it won't be done as part of this. I don't think we should lose track of that issue. And add a FIXME to this test where we see the possibly incorrect 'double' types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please create an issue to fix that in a separate PR if it won't be done as part of this.
Issue #8347 is created for the 'double' type.
@@ -4427,6 +4469,10 @@ ABIArgInfo WinX86_64ABIInfo::classify(QualType Ty, unsigned &FreeSSERegs, | |||
} | |||
|
|||
void WinX86_64ABIInfo::computeInfo(CGFunctionInfo &FI) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, please add a Windows-specific triple to the test.
|
||
// NOCOER: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(<4 x i16> noundef %{{.*}}) | ||
// COER32CL: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(i64 noundef %{{.*}}) | ||
// COER64CL: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(double noundef %{{.*}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @erichkeane!
@hewj03, I think TargetInfo.cpp should be fixed to not have this converted to a double. Please create an issue to fix that in a separate PR if it won't be done as part of this. I don't think we should lose track of that issue. And add a FIXME to this test where we see the possibly incorrect 'double' types.
ping @intel/dpcpp-clang-driver-reviewers. Please review this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@intel/llvm-gatekeepers Please help to merge this PR. |
The LLVM test suites shown error info like "Error: Fail to load /localdisk2/github/_work/llvm/llvm/./llvm/devops/actions/llvm_test_suite/action.yml". Are there some issues in test infrastructure? |
@cdai2, please, update your branch. There should be no issues with the tip of the branch. |
Thanks a lot to rebase this PR. I have restarted the last failed Precommit test of RHEL build. It failed because of timeout. |
@bader All checks have passed. thanks |
For x86 target, vector types (both result and arguments) can be coerced to scalars of the same size, e.g: define zeroext i1 @_Z18convert_ulong4_rteDv4_t(<4 x i16> %x) ; becomes define zeroext i1 @_Z18convert_ulong4_rteDv4_t(i64 %x.coerced) Such behavior is completely valid for x86, but the backend vectorizer cannot work with scalars instead of vectors. With this patch, argument and result types will be leaved unchanged in the CodeGen. New option fopencl-force-vector-abi is also added to force-disables vector to scalar coercion when provided. --------- Co-authored-by: Wenju He <[email protected]> Co-authored-by: Alexey Bader <[email protected]>
For x86 target, vector types (both result and arguments) can be coerced to scalars of the same size, e.g:
Such behavior is completely valid for x86, but the backend vectorizer cannot work with scalars instead of vectors.
With this patch, argument and result types will be leaved unchanged in the CodeGen.
New option fopencl-force-vector-abi is also added to force-disables vector to scalar coercion when provided.