Skip to content

[OpenCL] Disable vector to scalar types coercion for OpenCL #8160

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 20, 2023

Conversation

cdai2
Copy link
Contributor

@cdai2 cdai2 commented Jan 31, 2023

For x86 target, vector types (both result and arguments) can be coerced to scalars of the same size, e.g:

  define zeroext i1 @_Z18convert_ulong4_rteDv4_t(<4 x i16> %x)
  ; becomes
  define zeroext i1 @_Z18convert_ulong4_rteDv4_t(i64 %x.coerced)

Such behavior is completely valid for x86, but the backend vectorizer cannot work with scalars instead of vectors.

With this patch, argument and result types will be leaved unchanged in the CodeGen.

New option fopencl-force-vector-abi is also added to force-disables vector to scalar coercion when provided.

For x86 target, vector types (both result and arguments) can be coerced
to scalars of the same size, e.g:

      define zeroext i1 @_Z18convert_ulong4_rteDv4_t(<4 x i16> %x)
      ; becomes
      define zeroext i1 @_Z18convert_ulong4_rteDv4_t(i64 %x.coerced)

Such behavior is completely valid for x86, but the backend vectorizer
cannot work with scalars instead of vectors.

With this patch, argument and result types will be leaved unchanged in
the CodeGen.

New option fopencl-force-vector-abi is also added to force-disables
vector to scalar coercion when provided.
@cdai2 cdai2 requested review from a team as code owners January 31, 2023 08:35
@cdai2 cdai2 requested review from romanovvlad and bader January 31, 2023 08:35
@cdai2 cdai2 temporarily deployed to aws January 31, 2023 09:22 — with GitHub Actions Inactive
Copy link
Contributor

@Fznamznon Fznamznon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't seem to be dependent on SYCL, why don't we commit it directly to LLORG instead?

@cdai2
Copy link
Contributor Author

cdai2 commented Jan 31, 2023

That doesn't seem to be dependent on SYCL, why don't we commit it directly to LLORG instead?

We will land open source OpenCL CPU RT code to SYCLOS. OpenCL CPU RT has dependency on this PR.
I have submitted PR https://reviews.llvm.org/D142948 to LLORG. It's still under code review.

As the original code is used internally for OpenCL CPU RT, so we need this PR to be landed in SYCLOS even if https://reviews.llvm.org/D142948 is rejected by LLORG. It's urgent for us.

@cdai2 cdai2 temporarily deployed to aws January 31, 2023 15:49 — with GitHub Actions Inactive
@@ -4427,6 +4469,10 @@ ABIArgInfo WinX86_64ABIInfo::classify(QualType Ty, unsigned &FreeSSERegs,
}

void WinX86_64ABIInfo::computeInfo(CGFunctionInfo &FI) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you extend the test below for this target as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

triple x86_64-unknown-unknown in the test is already covering WinX86_64ABIInfo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that because the test is being run on Windows? Otherwise, I am not sure how it defaults to Windows and not say, for example, Linux.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that because the test is being run on Windows?

Yes. On linux, the test runs in X86_64ABIInfo::computeInfo path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, please add a Windows-specific triple to the test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

triple x86_64-unknown-unknown in the test is already covering WinX86_64ABIInfo

sorry this answer is incorrect. You're right that windows x86_64 target isn't covered in the first commit. I've added WinX86_64ABIInfo test in the new commit.

I also changed i686-unknown-unknown triple to i686-pc-win32-gnu, in order to match with the use scenario of this PR in OpenCL builtin build. DEVICE_TRIPLE is i686-pc-win32-gnu-elf and x86_64-pc-win32-gnu-elf for windows build in file backend/libraries/CMakeLists.txt


// NOCOER: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(<4 x i16> noundef %{{.*}})
// COER32CL: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(i64 noundef %{{.*}})
// COER64CL: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(double noundef %{{.*}})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is 'double' the right type here?

Copy link
Contributor

@wenju-he wenju-he Feb 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not related to this PR.

If we add ElementType->isSpecificBuiltinType(BuiltinType::UShort) to

ElementType->isSpecificBuiltinType(BuiltinType::ULong)))
, the 'double' will be changed to 'i64'. Should we add signed/unsigned i8/i16/i32 types to that place?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should, otherwise I am concerned the representation will be incorrect. Is something like isIntegerType or isIntegralOrEnumerationType() more appropriate there instead of checking each integral type kind there?

@erichkeane, could you please give your input here? I am really not sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @erichkeane . thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This 'double' type issue isn't related to this PR. @premanandrao could you please approve this PR if there is no other issues? This PR blocks the pre-commit build of #8216

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just got to the office after being away last week for the WG21 meeting. We definitely need to not make the element type here double, I can't imagine the optimizer is going to do good things with that/those conversions. I don't follow this well enough to know whether this cna be approved yet, so I'll leave that to Prem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @erichkeane!
@hewj03, I think TargetInfo.cpp should be fixed to not have this converted to a double. Please create an issue to fix that in a separate PR if it won't be done as part of this. I don't think we should lose track of that issue. And add a FIXME to this test where we see the possibly incorrect 'double' types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please create an issue to fix that in a separate PR if it won't be done as part of this.

Issue #8347 is created for the 'double' type.

@cdai2 cdai2 temporarily deployed to aws February 6, 2023 18:44 — with GitHub Actions Inactive
@cdai2 cdai2 temporarily deployed to aws February 6, 2023 20:21 — with GitHub Actions Inactive
@wenju-he wenju-he requested a review from premanandrao February 9, 2023 03:58
@@ -4427,6 +4469,10 @@ ABIArgInfo WinX86_64ABIInfo::classify(QualType Ty, unsigned &FreeSSERegs,
}

void WinX86_64ABIInfo::computeInfo(CGFunctionInfo &FI) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, please add a Windows-specific triple to the test.


// NOCOER: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(<4 x i16> noundef %{{.*}})
// COER32CL: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(i64 noundef %{{.*}})
// COER64CL: define {{.*}}<4 x i64> @_Z18convert_ulong4_rteDv4_t(double noundef %{{.*}})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @erichkeane!
@hewj03, I think TargetInfo.cpp should be fixed to not have this converted to a double. Please create an issue to fix that in a separate PR if it won't be done as part of this. I don't think we should lose track of that issue. And add a FIXME to this test where we see the possibly incorrect 'double' types.

@wenju-he wenju-he temporarily deployed to aws February 16, 2023 00:08 — with GitHub Actions Inactive
@cdai2
Copy link
Contributor Author

cdai2 commented Feb 17, 2023

ping @intel/dpcpp-clang-driver-reviewers. Please review this PR.

Copy link
Contributor

@hchilama hchilama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cdai2 cdai2 requested a review from a team February 17, 2023 06:52
@cdai2
Copy link
Contributor Author

cdai2 commented Feb 17, 2023

@intel/llvm-gatekeepers Please help to merge this PR.

@wenju-he wenju-he temporarily deployed to aws February 17, 2023 15:46 — with GitHub Actions Inactive
@wenju-he wenju-he temporarily deployed to aws February 17, 2023 15:47 — with GitHub Actions Inactive
@cdai2
Copy link
Contributor Author

cdai2 commented Feb 17, 2023

The LLVM test suites shown error info like "Error: Fail to load /localdisk2/github/_work/llvm/llvm/./llvm/devops/actions/llvm_test_suite/action.yml". Are there some issues in test infrastructure?

@bader
Copy link
Contributor

bader commented Feb 17, 2023

The LLVM test suites shown error info like "Error: Fail to load /localdisk2/github/_work/llvm/llvm/./llvm/devops/actions/llvm_test_suite/action.yml". Are there some issues in test infrastructure?

@cdai2, please, update your branch. There should be no issues with the tip of the branch.

@bader bader temporarily deployed to aws February 18, 2023 20:14 — with GitHub Actions Inactive
@bader bader temporarily deployed to aws February 18, 2023 20:54 — with GitHub Actions Inactive
@cdai2
Copy link
Contributor Author

cdai2 commented Feb 20, 2023

The LLVM test suites shown error info like "Error: Fail to load /localdisk2/github/_work/llvm/llvm/./llvm/devops/actions/llvm_test_suite/action.yml". Are there some issues in test infrastructure?

@cdai2, please, update your branch. There should be no issues with the tip of the branch.

Thanks a lot to rebase this PR. I have restarted the last failed Precommit test of RHEL build. It failed because of timeout.

@cdai2
Copy link
Contributor Author

cdai2 commented Feb 20, 2023

@bader All checks have passed. thanks

@bader bader merged commit 8b55761 into intel:sycl Feb 20, 2023
callumfare pushed a commit to callumfare/llvm that referenced this pull request Feb 28, 2023
For x86 target, vector types (both result and arguments) can be coerced
to scalars of the same size, e.g:

      define zeroext i1 @_Z18convert_ulong4_rteDv4_t(<4 x i16> %x)
      ; becomes
      define zeroext i1 @_Z18convert_ulong4_rteDv4_t(i64 %x.coerced)

Such behavior is completely valid for x86, but the backend vectorizer
cannot work with scalars instead of vectors.

With this patch, argument and result types will be leaved unchanged in
the CodeGen.

New option fopencl-force-vector-abi is also added to force-disables
vector to scalar coercion when provided.

---------

Co-authored-by: Wenju He <[email protected]>
Co-authored-by: Alexey Bader <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants