Skip to content

[SYCL][Fusion] Enable fusion of kernels with different ND-ranges #8209

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

victor-eds
Copy link
Contributor

All kernels with the same (or unspecified) local size and offset can be fused. In order to make this work, some builtins getting index space information must be remapped and the resulting ND-range of the fused kernel, calculated.

The ND-range of the fused kernel will have:

  1. The same number of dimensions as the input ND-range with the higher number of dimensions;
  2. The same local size as the shared local size (or unspecified)
  3. The same offset as the shared offset
  4. The global size will be the greatest input global size as per the following ordering:
    i. Number of work items (enforces correctness);
    ii. Number of occurrences (less remappings needed);
    iii. Lexical order of the dimensions (introduces determinism).

Builtins obtaining the local/global size/id, work-group id, number of work-groups or offset are remapped introducing as per an alwaysinline function that can be reused along the fusion pass. More information can be found in the Builtins.cpp file, where the remapping logic is implemented.

Signed-off-by: Victor Perez [email protected]

@victor-eds victor-eds requested a review from Naghasan as a code owner February 6, 2023 10:29
@victor-eds victor-eds self-assigned this Feb 6, 2023
@victor-eds victor-eds requested review from sommerlukas and a team as code owners February 6, 2023 10:29
All kernels with the same (or unspecified) local size and offset can
be fused. In order to make this work, some builtins getting index
space information must  be remapped and the resulting ND-range of the
fused kernel, calculated.

The ND-range of the fused kernel will have:
1. The same number of dimensions as the input ND-range with the higher
number of dimensions;
2. The same local size as the shared local size (or unspecified)
3. The same offset as the shared offset
4. The global size will be the **greatest** input global size as per
the following ordering:
  i. Number of work items (enforces correctness);
  ii. Number of occurrences (less remappings needed);
  iii. Lexical order of the dimensions (introduces determinism).

Builtins obtaining the local/global size/id, work-group id, number of
work-groups or offset are remapped introducing as per an alwaysinline
function that can be reused along the fusion pass. More information
can be found in the Builtins.cpp file, where the remapping logic is
implemented.

Signed-off-by: Victor Perez <[email protected]>
@victor-eds victor-eds force-pushed the kernel-fusion/different-nd-ranges branch from 5e2927d to eff2d2f Compare February 6, 2023 10:35
@victor-eds
Copy link
Contributor Author

Reviewers guide:
Files under the syc-fusion directory are to be reviewed only by @sommerlukas and @Naghasan. That leaves a single file to be reviewed by the rest of reviewers.

victor-eds added a commit to victor-eds/llvm-test-suite that referenced this pull request Feb 6, 2023
Check that kernel fusion works when fusing kernels with different
ND-ranges (different global sizes and dimensions).

Implementation: intel/llvm#8209

Signed-off-by: Victor Perez <[email protected]>
@victor-eds victor-eds temporarily deployed to aws February 6, 2023 11:00 — with GitHub Actions Inactive
@victor-eds victor-eds temporarily deployed to aws February 6, 2023 11:32 — with GitHub Actions Inactive
@victor-eds
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1575

@victor-eds victor-eds temporarily deployed to aws February 6, 2023 18:01 — with GitHub Actions Inactive
@victor-eds victor-eds temporarily deployed to aws February 6, 2023 19:13 — with GitHub Actions Inactive
@victor-eds victor-eds temporarily deployed to aws February 7, 2023 15:20 — with GitHub Actions Inactive
Copy link
Contributor

@sommerlukas sommerlukas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Detailed comments inline.

Did you test with a shared library build? I suspect that the linker script also must be updated, now that the interface to JIT compiler library was extended & changed.

@victor-eds victor-eds temporarily deployed to aws February 8, 2023 07:48 — with GitHub Actions Inactive
@victor-eds victor-eds temporarily deployed to aws February 8, 2023 11:45 — with GitHub Actions Inactive
Copy link
Contributor

@sommerlukas sommerlukas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks good now, just a few comments about CMake setup.

@victor-eds victor-eds temporarily deployed to aws February 8, 2023 16:05 — with GitHub Actions Inactive
@victor-eds victor-eds temporarily deployed to aws February 8, 2023 17:42 — with GitHub Actions Inactive
@victor-eds victor-eds temporarily deployed to aws February 9, 2023 08:34 — with GitHub Actions Inactive
@victor-eds victor-eds added sycl-mlir Pull requests or issues for sycl-mlir branch and removed sycl-mlir Pull requests or issues for sycl-mlir branch labels Feb 9, 2023
@victor-eds
Copy link
Contributor Author

I have no rights to merge this commit. Can I get it merged, please? @intel/llvm-gatekeepers

@steffenlarsen
Copy link
Contributor

Pre-commit is failing to build it. Looks like you need an #include <array> in Kernel.h.

@victor-eds victor-eds temporarily deployed to aws February 13, 2023 13:24 — with GitHub Actions Inactive
@victor-eds victor-eds temporarily deployed to aws February 13, 2023 14:41 — with GitHub Actions Inactive
@victor-eds
Copy link
Contributor Author

victor-eds commented Feb 13, 2023

Pre-commit is failing to build it. Looks like you need an #include <array> in Kernel.h.

@steffenlarsen Thanks for pointing out. I've added the header. Failures are now due to missing tests updates (PR in same state).

@steffenlarsen
Copy link
Contributor

/verify with intel/llvm-test-suite#1575

@victor-eds
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1575

It appears there are two unrelated tests failing. Fusion should not affect those whatsoever.

@steffenlarsen
Copy link
Contributor

Failure in ESIMD/imulh_umulh.cpp addressed in intel/llvm-test-suite#1593.

steffenlarsen pushed a commit to intel/llvm-test-suite that referenced this pull request Feb 14, 2023
Check that kernel fusion works when fusing kernels with different
ND-ranges (different global sizes and dimensions).

Implementation: intel/llvm#8209

---------

Signed-off-by: Victor Perez <[email protected]>
Co-authored-by: Lukas Sommer <[email protected]>
@steffenlarsen steffenlarsen merged commit 325bc4e into intel:sycl Feb 14, 2023
@victor-eds victor-eds deleted the kernel-fusion/different-nd-ranges branch February 14, 2023 10:04
aelovikov-intel pushed a commit to aelovikov-intel/llvm that referenced this pull request Mar 27, 2023
…/llvm-test-suite#1575)

Check that kernel fusion works when fusing kernels with different
ND-ranges (different global sizes and dimensions).

Implementation: intel#8209

---------

Signed-off-by: Victor Perez <[email protected]>
Co-authored-by: Lukas Sommer <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants