Skip to content

Enable matrix_load, matrix_store, and matrix_mad #4076

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Aug 17, 2021

Conversation

yubingex007-a11y
Copy link
Contributor

Change-Id: I97b939c5559317ad82bec460221bccb83975b004

Change-Id: I97b939c5559317ad82bec460221bccb83975b004
@yubingex007-a11y yubingex007-a11y requested a review from a team as a code owner July 8, 2021 12:52
bader
bader previously requested changes Jul 18, 2021
Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, put files to ext/intel directory instead of ONEAPI.

@dkhaldi
Copy link
Contributor

dkhaldi commented Jul 19, 2021

Please, put files to ext/intel directory instead of ONEAPI.

@bader, Although the matrix feature is in experimental phase, it is a oneAPI feature, it is not an intel specific feature. I think we should keep it as it was already.

@bader
Copy link
Contributor

bader commented Jul 19, 2021

Sorry, I meant to say ext/oneapi. Please, rebase on the tip of the branch to resolve the conflicts.
See d703f57 for more details on the requested change. We moved all oneAPI extensions from ONEAPI to ext/oneapi directory.

@dkhaldi
Copy link
Contributor

dkhaldi commented Jul 19, 2021

Sorry, I meant to say ext/oneapi. Please, rebase on the tip of the branch to resolve the conflicts.
See d703f57 for more details on the requested change. We moved all oneAPI extensions from ONEAPI to ext/oneapi directory.

What would be the namespace we should use in this case?
Currently we are using: sycl:ext::intel::experimental::matrix;
Should it be: sycl:ext::ONEAPI::experimental::matrix;?
BTW, I see experimental directory in ONEAPI/. Should we move matrix/ there?

@bader
Copy link
Contributor

bader commented Jul 19, 2021

The namespace should be sycl::ext::oneapi::experimental::matrix.

According to my understanding ONEAPI directory is left for backward compatibility only and should not be used for new features.
All new features should follow SYCL spec guidelines and use sycl::ext::oneapi namespace (implementation guideline is to put files into ext/oneapi directory reflecting namespace structure).

Tagging @dm-vodopyanov and @romanovvlad to correct me if I miss something.

@romanovvlad
Copy link
Contributor

The namespace should be sycl::ext::oneapi::experimental::matrix.

According to my understanding ONEAPI directory is left for backward compatibility only and should not be used for new features.
All new features should follow SYCL spec guidelines and use sycl::ext::oneapi namespace (implementation guideline is to put files into ext/oneapi directory reflecting namespace structure).

Tagging @dm-vodopyanov and @romanovvlad to correct me if I miss something.

Please, use sycl::ext::oneapi::experimental for general extension and sycl::ext::intel::experimental if the extension is specific to Intel HW.

@@ -38,7 +38,7 @@
__SYCL_INLINE_NAMESPACE(cl) {
namespace sycl {
namespace ext {
namespace intel {
namespace oneapi {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, @bader @dkhaldi , should we move this file to ext/intel directory since it works on intel HW?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mistakenly change it, sorry for inconvience.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, @bader @dkhaldi , should we move this file to ext/intel directory since it works on intel HW?

Please keep everything in ext/oneapi.
The feature is still oneAPI "matrix". It is just that the initial implementation targeted only AMX.

1. change namespace to ext::intel
2. move matrix-aot-amx.hpp to ext/intel directory
matrix_layout Layout = matrix_layout::row_major,
access::address_space Space>
inline __SYCL_ALWAYS_INLINE void joint_matrix_store(
Group sg, joint_matrix<Group, T, NumRows, NumCols, Layout> &res,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Group sg, joint_matrix<Group, T, NumRows, NumCols, Layout> &res,
Group sg, joint_matrix<Group, T, NumRows, NumCols, Layout> &obj,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to change joint_matrix_load as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexeySotkin can you please elaborate why you asked for the change res --> obj?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just improving the name. There is no result for the store operation. It returns void. This parameter is a matrix object which we want to store.

access::address_space Space>
inline __SYCL_ALWAYS_INLINE void joint_matrix_load(
Group sg, joint_matrix<Group, T, NumRows, NumCols, Layout> &res,
multi_ptr<T, Space> src, size_t stride, matrix_layout L = Layout) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function argument L is not used. Probably we should use it as operand of __spirv_MatrixLoadINTEL, but it must be a constant.
It seems like we have two layouts here: one for the memory we read from, and another for the matrix object, right? If so, I think we need another template parameter to specify layout of the memory.

Copy link
Contributor

@dkhaldi dkhaldi Jul 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the changes we agreed on in our meeting today:

  • remove matrix_layout L = Layout
  • pass L (the memory layout) to the load/store functions
  • rename L --> memL
  • rename Layout --> matL
  • rename the header file spirv API from matrix to joint_matrix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not quite clear what is "pass L to the load/store functions". I think we should move the last function parameter to template parameters, right?
Like:

template <typename Group, typename T, size_t NumRows, size_t NumCols,
          matrix_layout MemL = matrix_layout::row_major,
          matrix_layout MatL = matrix_layout::row_major,
          access::address_space Space>
inline __SYCL_ALWAYS_INLINE void joint_matrix_load(
    Group sg, joint_matrix<Group, T, NumRows, NumCols, Layout> &res,
    multi_ptr<T, Space> src, size_t stride) {
...
  res.spvm = __spirv_MatrixLoadINTEL<T, NumRows, NumCols,
                                     spv_matrix_layout_traits<MemL>::value,
                                     spv_matrix_layout_traits<MatL>::value>(Ptr, stride);
...
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, the user does not need to specify any template argument when calling load/store. If we move the memory layout to the template, the user will have to specify each of the parameters (deduce all or nothing) when calling load/store.

The other reason is that I would like to not converge too much (unless there is a big reason) from Vulkan and WMMA APIs thus keeping the meory layout as argument to the functions

T *Ptr = src.get();
res.spvm = __spirv_MatrixLoadINTEL<T, NumRows, NumCols,
spv_matrix_layout_traits<Layout>::value>(
Ptr, stride, spv_matrix_layout_traits<Layout>::value);
Copy link
Contributor Author

@yubingex007-a11y yubingex007-a11y Jul 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MemLayout

};

template <typename Group, typename T, size_t NumRows, size_t NumCols,
matrix_layout Layout = matrix_layout::row_major,
Copy link
Contributor Author

@yubingex007-a11y yubingex007-a11y Jul 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matL

@@ -1,4 +1,4 @@
// RUN: %clangxx -march=sapphirerapids -fsycl -O2 %s -o %t.out
// RUN: %clangxx -DSYCL_EXT_ONEAPI_MATRIX=1 -march=sapphirerapids -fsycl -O2 %s -o %t.out
#include <CL/sycl.hpp>
#if (SYCL_EXT_ONEAPI_MATRIX == 1)
Copy link
Contributor

@dkhaldi dkhaldi Jul 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Only the leader perform AMX computation.
Why do you have "the leader" test in the code. this is wrong for two reasons:
1- SG size is one so we don't need to test that.
2- all work items in the subgroup must enter the joint matrix code. So we should NOT have diverged code in there


// Only the leader perform AMX computation.
if (spmd_item.get_local_id(1) % TILE_SZ)
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we should not have "the leader" test in the code. this is wrong because all work items in the subgroup must enter the joint matrix code. So we should not have diverge code in there.
The OpenCL backend will not vectorize the code so we should be able to generate ONE AMX code for the whole sub group


// Only the leader perform AMX computation.
if (spmd_item.get_local_id(1) % TILE_SZ)
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above


// Only the leader perform AMX computation.
if (spmd_item.get_local_id(1) % TILE_SZ)
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above


// Only the leader perform AMX computation.
if (spmd_item.get_local_id(1) % TILE_SZ)
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

case matrix_layout::row_major:
res.spvm =
__spirv_JointMatrixLoadINTEL<T, NumRows, NumCols,
spv_matrix_layout_traits<Layout>::value>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you omitted the group argument here right?

res.spvm =
__spirv_JointMatrixLoadINTEL<T, NumRows, NumCols,
spv_matrix_layout_traits<Layout>::value>(
Ptr, stride, __spv::MatrixLayout::ColumnMajor);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

res.spvm =
__spirv_JointMatrixLoadINTEL<T, NumRows, NumCols,
spv_matrix_layout_traits<Layout>::value>(
Ptr, stride, __spv::MatrixLayout::PackedA);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spirv should have scope argument so sg should be passed in all load/store/mad functions

case matrix_layout::col_major:
__spirv_JointMatrixStoreINTEL<T, NumRows, NumCols,
spv_matrix_layout_traits<MatL>::value>(
Ptr, obj.spvm, stride, __spv::MatrixLayout::ColumnMajor);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we said obj should be change to "src" right?

@dkhaldi
Copy link
Contributor

dkhaldi commented Aug 10, 2021

LGTM

@yubingex007-a11y yubingex007-a11y requested a review from bader August 10, 2021 14:44
@bader bader dismissed their stale review August 10, 2021 17:22

My comment is addressed. Thanks.

@bader
Copy link
Contributor

bader commented Aug 10, 2021

LGTM

@dkhaldi, you must approve using GitHub UI to unblock the merge. Currently it's blocked by your previous review where changes were requested.

@dkhaldi
Copy link
Contributor

dkhaldi commented Aug 10, 2021

LGTM

@dkhaldi, you must approve using GitHub UI to unblock the merge. Currently it's blocked by your previous review where changes were requested.

@bader done

@yubingex007-a11y
Copy link
Contributor Author

@intel/llvm-reviewers-runtime Hi, Could you do a code review please?

1- we move the Group template argument on joint_matrix to the last position and make it optional. By default set Group to subgroup. So the user does not need to specify this each time
2- we change the "mad" function name from joint_matrix_mad to joint_matrix_mma.
@dkhaldi
Copy link
Contributor

dkhaldi commented Aug 13, 2021

LGTM, @bader can you help merge this PR?

@bader
Copy link
Contributor

bader commented Aug 13, 2021

LGTM, @bader can you help merge this PR?

The PR is waiting on code owner review from @intel/llvm-reviewers-runtime.

Copy link
Contributor

@s-kanaev s-kanaev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matrix-amx.hpp => matrix-aot-amx.hpp
Does this renaming follow API/ABI compatibility policy?

sycl/include/CL/sycl/ONEAPI/matrix/matrix-amx.hpp said one should include matrix-amx.hpp and the latter file becomes unavailable now.

Probably, the user won't be happy.

Other than that, the changes seem to be legit.

@yubingex007-a11y
Copy link
Contributor Author

matrix-amx.hpp => matrix-aot-amx.hpp
Does this renaming follow API/ABI compatibility policy?

sycl/include/CL/sycl/ONEAPI/matrix/matrix-amx.hpp said one should include matrix-amx.hpp and the latter file becomes unavailable now.

Probably, the user won't be happy.

Other than that, the changes seem to be legit.
@s-kanaev , Thanks for your review.
This is an experimental interface so it is assumed to be changed.

Copy link
Contributor

@s-kanaev s-kanaev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@romanovvlad romanovvlad merged commit 7f21853 into intel:sycl Aug 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants