Skip to content

[SYCL] Remove free function queries call detection from FE #5178

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 23, 2021

Conversation

alexbatashev
Copy link
Contributor

Removing free function call detection from clang front-end dramatically improves compile times of device-side code (up to 3 times in some cases).

This has the following implications:

  1. Range rounding is no longer disabled based on free-function queries. Users are encouraged to disable range rounding manually.
  2. Free function queries are no longer supported on host device.

@alexbatashev alexbatashev marked this pull request as ready for review December 19, 2021 13:21
@alexbatashev
Copy link
Contributor Author

/summary:run

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain why removing FE detection changes this behavior you mentioned in description?

Free function queries are no longer supported on host device.

@alexbatashev
Copy link
Contributor Author

Can you please explain why removing FE detection changes this behavior you mentioned in description?

Free function queries are no longer supported on host device

@elizabethandrews @smanna12 these queries are not native to host device and runtime relies on integration header to make sure we store necessary info. This makes this feature usable only if integration header is present. A runtime-only implementation is possible, but it would be inefficient and hurt runtime performance. And since host device is rarely used, and the feature can not be implemented efficiently, it is much easier to simply drop this feature support for host device. This aligns with C++ policy "do not pay for what you don't use".

@smanna12
Copy link
Contributor

Can you please explain why removing FE detection changes this behavior you mentioned in description?

Free function queries are no longer supported on host device

@elizabethandrews @smanna12 these queries are not native to host device and runtime relies on integration header to make sure we store necessary info. This makes this feature usable only if integration header is present. A runtime-only implementation is possible, but it would be inefficient and hurt runtime performance. And since host device is rarely used, and the feature can not be implemented efficiently, it is much easier to simply drop this feature support for host device. This aligns with C++ policy "do not pay for what you don't use".

Thanks for the explanation.

@alexbatashev
Copy link
Contributor Author

/verify with intel/llvm-test-suite#654

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FE changes LGTM

Copy link
Contributor

@smanna12 smanna12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FE changes look good to me.

* upstream/sycl: (1382 commits)
  [SYCL][XPTI] Report memory allocation info from SYCL runtime (intel#5172)
  [CI] Switch labels for OCL x64 job (intel#5185)
  [SYCL] Add basic support for the generic_space address space (intel#5148)
  [CI] Update CODEOWNERS for SYCL printf support passes (intel#5199)
  [SYCL][Matrix] Enable wi_slice for joint_matrix (intel#4979)
  [SYCL][Group algorithms] Move group sort extension to experimental (intel#5169)
  [SYCL] Fix kernel bundles don't really carry kernel IDs (intel#5121)
  [SYCL] Initial printf support for non-constant AS format strings (intel#5069)
  [SYCL][NFC] Fix static code analysis concerns (intel#5189)
  [SYCL][Doc] Fix typos to fix doc build (intel#5190)
  [Driver][SYCL] Turn on -fsycl-dead-args-optimization by default (intel#3004)
  [SYCL][L0][Plugin] Add support for batching copy commands (intel#5155)
  [CI] Add cache checkout script to docker containers (intel#5184)
  [SYCL][Doc] Add HIP backend to the filter selector (intel#5176)
  [Doc] Add documentation for Docker images (intel#4778)
  [LIBCLC] Add functionality for in-kernel asserts for CUDA backend (intel#5174)
  Force opt to use new pass manager in exponential-deferred-inlining test after a8c2ba1
  [SYCL] Add vec and marray support to known_identity type trait (intel#5163)
  Correctly resolve merge conflicts
  Update SPV_INTEL_hw_thread_queries to rev 2
  ...
Copy link
Contributor

@sergey-semenov sergey-semenov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Runtime part LGTM

@bader
Copy link
Contributor

bader commented Dec 23, 2021

I think intel/llvm-test-suite#654 should go first.

@bader bader merged commit e4791d1 into intel:sycl Dec 23, 2021
@alexbatashev
Copy link
Contributor Author

@psalz this commit should also help with performance of CTS compilation

psalz added a commit to KhronosGroup/SYCL-CTS that referenced this pull request Dec 27, 2021
This includes two changes that should improve CTS compilation times
(intel/llvm#5127, intel/llvm#5178).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants