Skip to content

[ESIMD] Add semi-dynamic SLM allocation - esimd::experimental::slm_allocator. #7759

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Dec 28, 2022

Conversation

kbobrovs
Copy link
Contributor

@kbobrovs kbobrovs commented Dec 13, 2022

This patch adds new class - slm_allocator - and its lowering. This is RAII-style class used to implement "semi-dynamic" SLM allocation. SLM is allocated in the constructor and released in the destructor, that's why it is "dynamic", as opposed to fully static allocation style of 'slm_init'. Actual offset of SLM chunk allocated by the call is calculated at compile time, that's why it is "semi-". To calculate SLM usage by a kernel, compiler finds a path in a callgraph with the largest amount of SLM "locked" by slm_allocator objects live along the paths. slm_init call also participates in calculating SLM budget. It can be modelled as slm_allocator object declared at the very beginning of a kernel and live till its the very end.

Since a call graph is used, function pointers and recursion is not supported.
Complementary E2E test: intel/llvm-test-suite#1449.

Signed-off-by: Konstantin S Bobrovsky [email protected]

…locator.

This patch adds new class - slm_allocator - and its lowering. This is
RAII-style class used to implement "semi-dynamic" SLM allocation.
SLM is allocated in the constructor and released in the destructor,
that's why it is "dynamic", as opposed to fully static allocation
style of 'slm_init'. Actual offset of SLM chunk allocated by the call
is calculated at compile time, that's why it is "semi-". To calculate
SLM usage by a kernel, compiler finds a path in a callgraph with the
largest amount of SLM "locked" by slm_allocator objects live along
the paths. slm_init call also participates in calculating SLM budget.
It can be modelled as slm_allocator object declared at the very
beginning of a kernel and live till its the very end.

Since a call graph is used, function pointers and recursion is not
supported.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
@kbobrovs kbobrovs requested review from v-klochkov, rolandschulz and a team as code owners December 13, 2022 08:15
@kbobrovs
Copy link
Contributor Author

Unit tests are in flight

@@ -1755,6 +1755,36 @@ __ESIMD_API simd<T, N> atomic_update(T *p, simd<unsigned, N> offset,
p, offset, src1, src0, mask);
}

/// RAII-style class used to implement "semi-dynamic" SLM allocation.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@petercaday, please review this part. E2E test - intel/llvm-test-suite#1449.

@@ -19,7 +19,9 @@
// deducible when BE is ready

#include "llvm/SYCLLowerIR/LowerInvokeSimd.h"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rolandschulz, please review this file

@kbobrovs kbobrovs requested a review from kychendev December 13, 2022 08:45
@v-klochkov
Copy link
Contributor

v-klochkov commented Dec 13, 2022

@kbobrovs - please take a look at BUILD+LIT fail: esimd/genx_func_attr.cpp

@kbobrovs
Copy link
Contributor Author

@kbobrovs - please take a look at BUILD+LIT fail: esimd/genx_func_attr.cpp

Yes, looking into this.

Copy link
Contributor

@v-klochkov v-klochkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take a look at few ESIMD fails, e.g.
slm_gather_scatter_rgba.cpp test prints:
LLVM ERROR: slm_init call met in non-kernel function _ZNK6KernelIiLj8ELN4sycl3_V13ext5intel5esimd17rgba_channel_maskE1EEclENS1_7nd_itemILi1EEE

@kbobrovs
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1449

@kbobrovs kbobrovs requested a review from v-klochkov December 15, 2022 09:27
@v-klochkov
Copy link
Contributor

@kbobrovs - I just noticed that esimd/genx_func_attr.cpp test fails. Is that the test that you mentioned as the one failing with slm_init call in SYCL EXTERNAL FUNCTION ?

@v-klochkov
Copy link
Contributor

@asudarsa - would you please review this patch to let Konst fix the issues in sycl-post-link if there are any concerns from your side.

@asudarsa
Copy link
Contributor

I have started looking at this. Sorry for delay. I had a quick question about how we handle overflows. Thanks

@kbobrovs
Copy link
Contributor Author

I have started looking at this. Sorry for delay. I had a quick question about how we handle overflows. Thanks

Overflows are detected at runtime by the JITter, as we don't know available SLM size at compile-time.

@kbobrovs
Copy link
Contributor Author

@kbobrovs - I just noticed that esimd/genx_func_attr.cpp test fails. Is that the test that you mentioned as the one failing with slm_init call in SYCL EXTERNAL FUNCTION ?

Yes, I'm looking into this

@asudarsa
Copy link
Contributor

asudarsa commented Dec 16, 2022

Hi @kbobrovs and @v-klochkov

All the sycl-post-link related changes are NFC changes and look fine. I will take a quick look at the full patch before approval. Hope that's ok. Please let me know if I missed something.

Thanks

Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from DPCPP Tools perspective. Thanks

/// as a class' template argument.
///
/// Since a call graph is used, function pointers and recursion is not
/// supported.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these cases detected and flagged as compile-time errors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, no. Recursion is not a problem, because it is not supported by SYCL. But presence of function pointers should be detected, agree. Can diagnostics be added as a follow-up commit? Tagging @v-klochkov.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But presence of function pointers should be detected, agree.

@petercaday, I added diagnostics in 8d20150

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @kbobrovs!

Copy link

@petercaday petercaday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks for implementing this, @kbobrovs!

@bader
Copy link
Contributor

bader commented Dec 22, 2022

@v-klochkov, could you fix pre-commit failures, please?

@v-klochkov
Copy link
Contributor

@v-klochkov, could you fix pre-commit failures, please?

I'll start looking at this LIT test fail after returning from vacation (Jan 2023)

@kbobrovs
Copy link
Contributor Author

@v-klochkov, could you fix pre-commit failures, please?

I'll start looking at this LIT test fail after returning from vacation (Jan 2023)

Folks, I'm looking into this failure. Hope can fix before the NY

kbobrovs added 3 commits December 27, 2022 18:23
  * maximum SLM offset calculation for each scope
  * maximum SLM usage per kernel
  This also fixed a bug in calculating maximum SLM usage per kernel.
- Update the unit test to use new API and to require correct SLM size for each
  kernel. Multiple slm_init's and slm_init's in non-inlined SYCL_EXTERNAL
  functions did not work anyway.

Signed-off-by: kbobrovs <[email protected]>
Signed-off-by: kbobrovs <[email protected]>
@kbobrovs kbobrovs temporarily deployed to aws December 27, 2022 14:13 — with GitHub Actions Inactive
@kbobrovs kbobrovs temporarily deployed to aws December 27, 2022 14:58 — with GitHub Actions Inactive
@kbobrovs kbobrovs temporarily deployed to aws December 27, 2022 15:28 — with GitHub Actions Inactive
@kbobrovs kbobrovs temporarily deployed to aws December 28, 2022 09:28 — with GitHub Actions Inactive
@kbobrovs kbobrovs temporarily deployed to aws December 28, 2022 09:58 — with GitHub Actions Inactive
@kbobrovs
Copy link
Contributor Author

@bader, would you mind merging, please?

@AlexeySachkov AlexeySachkov merged commit e767401 into intel:sycl Dec 28, 2022
@AlexeySachkov
Copy link
Contributor

Post-commit failed on MacOS due to missing include, the fix is in #7880

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants