[SYCL] Avoid re-computing group_range in nd_item class #4621

Merged
v-klochkov merged 2 commits into intel:sycl from get_group_range_fix on Sep 23, 2021

Conversation

v-klochkov
Contributor

This fix is mostly NFC (no functional change).
Instead of re-computing the group_range inside the nd_item class with costly
division operations, nd_item now calls group::get_group_range(), which
performs no divisions because the group class keeps the group_range
as a pre-computed field.

Signed-off-by: Vyacheslav N Klochkov [email protected]
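For readers unfamiliar with the pattern, here is a minimal sketch of the idea in plain C++. It is not the code from this PR: `Range1`, `Group`, `NdItem`, and `make_nd_item` are hypothetical one-dimensional stand-ins for the SYCL `range`, `group`, and `nd_item` classes, and the "before" formula (global range divided by local range) is an assumption based on the description above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 1-D stand-in for sycl::range (illustration only).
struct Range1 {
  std::size_t size;
};

// Hypothetical stand-in for sycl::group: the group range is stored
// as a field when the group is created, so reading it costs no division.
struct Group {
  Range1 global_range;
  Range1 local_range;
  Range1 group_range;  // pre-computed once

  Range1 get_group_range() const { return group_range; }
};

// Hypothetical stand-in for sycl::nd_item.
struct NdItem {
  Range1 global_range;
  Range1 local_range;
  Group group;

  // Assumed "before" shape: a division on every call.
  Range1 get_group_range_recomputed() const {
    return Range1{global_range.size / local_range.size};
  }

  // "After" shape: forward to the group's pre-computed field.
  Range1 get_group_range() const { return group.get_group_range(); }
};

// Hypothetical helper that builds a consistent item/group pair.
// The single division happens here, once, instead of on every query.
inline NdItem make_nd_item(std::size_t global, std::size_t local) {
  assert(local != 0 && global % local == 0);
  Group g{{global}, {local}, {global / local}};
  return NdItem{{global}, {local}, g};
}
```

With a global range of 1024 and a local range of 64, both accessors report 16 work-groups; only the forwarding version avoids the per-call division.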

@v-klochkov v-klochkov requested a review from a team as a code owner September 23, 2021 00:28
dm-vodopyanov
dm-vodopyanov previously approved these changes Sep 23, 2021
@dm-vodopyanov
Contributor

@v-klochkov, please fix clang-format

Signed-off-by: Vyacheslav N Klochkov <[email protected]>
@v-klochkov v-klochkov merged commit 0cd7b7e into intel:sycl Sep 23, 2021
@v-klochkov v-klochkov deleted the get_group_range_fix branch September 23, 2021 20:44
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request Sep 24, 2021
* upstream/sycl: (2344 commits)
  [ESIMD] Rename slm_load4/slm_store4 to slm_load_rgba/slm_store_rgba (intel#4158)
  [SYCL] Avoid re-computing group_range in nd_item::get_group_range() (intel#4621)
  [clang-offload-extract] Ignore zero padding in .tgting section (intel#4622)
  [Driver][SYCL] Fix -fsycl-help output when redirected (intel#4619)
  [Driver][SYCL][FPGA] Do not unbundle aoco as an archive for hardware (intel#4477)
  [Driver][SYCL] Fix offload-bundler and offload-deps triples (intel#4616)
  [SYCL] Fix bit_cast for half type (intel#4603)
  [SYCL] Fix a typo in accessor::get_range method (intel#4556)
  [SYCL] Detach allocas from their dependencies regardless of linked alloca presence (intel#4573)
  [SYCL][L0] Make sure that we only query/sync host-visible events from the host. (intel#4613)
  Fix tests with wrong alias metadata
  [Driver][SYCL] Fixup arguments to llc call for PIC and code-model (intel#4614)
  [SYCL][L0] Add ownership control for LeveL-Zero kernel_bundle interop. (intel#4576)
  [SYCL][Driver] Expose llvm-foreach --jobs functionality through a driver option (intel#4543)
  [SYCL] Prevent stream buffer leak on constructor exception (intel#4594)
  [ESIMD] Replace mask_type_t with simd_mask to represent Gen predicates. (intel#4230)
  Fix for a bunch of fixed point integer SPIR-V instructions (intel#1213)
  Amend SingleElementVectorINTEL decoration use cases according to spec update (intel#1192)
  Enforce UserSemantic decoration if no FPGA decorations found
  [SYCL][CUDA] Fix context scope in kernel launch (intel#4606)
  ...