[SYCL] Implement matrix extension using new unified interface #7413

yubingex007-a11y · 2022-11-16T08:32:55Z

No description provided.

JackAKirk · 2022-11-16T11:45:27Z

sycl/include/sycl/ext/oneapi/matrix/matrix-intel.hpp

+// uint16_t, this interpretation is possible. This design choice was made before
+// the introduction of SYCL experimental bfloat16 type. Our plan is to move
+// towards using the SYCL bfloat16. But since it is still experimental, we will
+// probably keep both uint16 interpretation and SYCL bfloat16.


The PR moving bfloat16 out of the experimental namespace is about to be merged: #6524
In my PR I have now removed the uint16_t cases. I recommend removing them here too, because we want people to use bfloat16 and not be confused by the existence of the uint16_t interpretation.

JackAKirk · 2022-11-16T11:46:12Z

sycl/include/sycl/ext/oneapi/matrix/matrix-intel.hpp

+  // fp32=>bf16). This is a workaround until we are able to use
+  // __spirv_ConvertFToBF16INTEL and __spirv_ConvertBF16ToFINTEL once these are
+  // supported in the CPU backend
+  static float make_fp32(uint16_t x) {


This function is no longer required: see #6524

JackAKirk · 2022-11-16T11:46:31Z

sycl/include/sycl/ext/oneapi/matrix/matrix-intel.hpp

+    return *res;
+  }
+
+  static uint16_t make_bf16(float x) {


This function is no longer required: see #6524
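
For context, the two helpers flagged above convert between fp32 and the uint16_t storage of bfloat16 in software. A minimal sketch of that kind of conversion, assuming plain truncation of the low 16 bits (no rounding) and using the hypothetical names `bf16_bits_to_fp32` and `fp32_to_bf16_bits`; it is not the PR's actual code:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: widen bfloat16 storage bits to fp32.
// bfloat16 occupies the upper 16 bits of an IEEE-754 binary32 value.
inline float bf16_bits_to_fp32(std::uint16_t x) {
  std::uint32_t bits = static_cast<std::uint32_t>(x) << 16;
  float result;
  std::memcpy(&result, &bits, sizeof(result)); // well-defined type pun
  return result;
}

// Hypothetical helper: narrow fp32 to bfloat16 storage bits by truncation
// (assumption: no rounding is applied).
inline std::uint16_t fp32_to_bf16_bits(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  return static_cast<std::uint16_t>(bits >> 16);
}
```

With a first-class bfloat16 type, user code no longer has to carry such bit-level helpers around.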

JackAKirk · 2022-11-16T11:51:28Z

sycl/include/sycl/ext/oneapi/matrix/matrix-intel.hpp

+// with no member variables. Morally, it is equivalent to an enumeration--it
+// just uses the type system to communicate the desired accuracy of arithmetic
+// computations. Users can't construct a tf32
+namespace precision {


Do you want to move precision::tf32 etc and layout enum to spirv_types also? @dkhaldi had some ideas on how it should be organized. I don't mind how we deal with this, I'll leave it up to you.

dkhaldi · 2022-11-16T14:19:47Z

sycl/include/sycl/ext/oneapi/matrix/matrix-intel.hpp

+
+// unnecessary was introduced for backward compatibility.
+// Once the use implementation is stable, "unnecessary" value will be omitted
+enum class use { a, b, accumulator, unnecessary };


we should not need to have unnecessary

dkhaldi · 2022-11-16T14:25:08Z

sycl/include/sycl/ext/oneapi/matrix/matrix.hpp

@@ -29,3 +29,6 @@
 #if (SYCL_EXT_ONEAPI_MATRIX_VERSION == 3)
 #include <sycl/ext/oneapi/matrix/matrix-tensorcore.hpp>
 #endif // SYCL_EXT_ONEAPI_MATRIX_VERSION
+#if (SYCL_EXT_ONEAPI_MATRIX_VERSION == 4)


Move unified to use 2 (SYCL_EXT_ONEAPI_MATRIX_VERSION == 2).
Remove the use implementation.
We need to wait for https://github.com/intel/llvm/pull/7077 to get merged first.

dkhaldi · 2022-11-16T14:31:12Z

sycl/test/matrix/matrix-elemwise-ops.cpp

-           joint_matrix<int8_t, TK, TN, matrix_layout::packed_b> sub_b(sg);
-           joint_matrix<int32_t, TM, TN> sub_c(sg);
+           joint_matrix<int8_t, use::b, TK, TN, layout::packed> sub_b(sg);
+           joint_matrix<int32_t, use::accumulator, TM, TN, layout::dynamic> sub_c(sg);


remove layout::dynamic

dkhaldi · 2022-11-16T14:36:17Z

In matrix-intel.hpp, we need to add store of A and B.
In unified: load of A and B has no layout argument.
In unified: load of C must have a layout argument.
In the Intel version, users can load A and B with a layout argument.
These cases are needed for the corner-case tests we wrote for element-wise operations (an element-wise op on B followed by a store of B needs no load of B, so there is no need to specify a layout for matrix B).
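
The load rules above can be sketched with mock types. This is an illustration only: `joint_matrix`, `mock_load`, and the `loaded_as` member are hypothetical stand-ins written in plain C++, not the real SYCL headers, and the Intel-specific load of A and B with a layout argument is not modeled.

```cpp
#include <cstddef>

// Mock types for illustration only; these are NOT the real SYCL headers.
enum class use { a, b, accumulator };
enum class layout { row_major, col_major, dynamic };

template <typename T, use Use, std::size_t Rows, std::size_t Cols,
          layout Layout = layout::dynamic>
struct joint_matrix {
  layout loaded_as = Layout; // records which layout the last load used
};

// A and B: the layout is fixed by the matrix type, so load takes no layout.
template <typename T, use Use, std::size_t Rows, std::size_t Cols,
          layout Layout>
constexpr void mock_load(joint_matrix<T, Use, Rows, Cols, Layout> &m,
                         const T * /*src*/, std::size_t /*stride*/) {
  static_assert(Use != use::accumulator, "accumulator load needs a layout");
  static_assert(Layout != layout::dynamic, "A/B need a compile-time layout");
  m.loaded_as = Layout;
}

// Accumulator: the type carries layout::dynamic, so the memory layout has to
// be passed to load as a run-time argument.
template <typename T, std::size_t Rows, std::size_t Cols>
constexpr void mock_load(
    joint_matrix<T, use::accumulator, Rows, Cols, layout::dynamic> &m,
    const T * /*src*/, std::size_t /*stride*/, layout runtime_layout) {
  m.loaded_as = runtime_layout;
}
```

In this sketch, calling the three-argument overload on an accumulator fails at compile time, which mirrors the rule that a load of C must name its layout.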

dkhaldi · 2022-11-16T15:10:58Z

sycl/include/CL/__spirv/spirv_types.hpp

+  Packed = 2,
+  Dynamic = 3
+};
+#else
 enum class MatrixLayout : uint32_t {
  RowMajor = 0,
  ColumnMajor = 1,
  PackedA = 2,
  PackedB = 3,
  Unused = 4


unused is not needed

dkhaldi · 2022-11-16T15:13:38Z

sycl/include/CL/__spirv/spirv_types.hpp

+  Packed = 2,
+  Dynamic = 3
+};
+#else
 enum class MatrixLayout : uint32_t {
  RowMajor = 0,
  ColumnMajor = 1,
  PackedA = 2,
  PackedB = 3,
  Unused = 4


As for use, Dmitry is removing unnecessary in this patch: #7335.
It will be merged soon. Once it is merged, this PR should be updated as well.

@yubingex007-a11y #7335 has been merged. You can rebase your patch to remove all "unnecessary" use occurrences.

dkhaldi · 2022-11-21T20:34:56Z

sycl/include/sycl/ext/oneapi/matrix/static-query-use.hpp

  template <typename Group>
-  using joint_matrix_c = joint_matrix<Tc, defaultM, defaultN, use::accumulator,
+  using joint_matrix_c = joint_matrix<Tc, use::accumulator, defaultM, defaultN,
                                      layout::row_major, Group>;



tpu_params will have to take the layout of A and the layout of B as template arguments. Then you pass them here.
The layout of the accumulator matrix should be layout::dynamic.
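
A minimal sketch of what this request could look like, using mock types in plain C++. `tpu_params_sketch`, `mock_sub_group`, the parameter order, and the default sizes 8/16/32 are assumptions made for illustration; the real `tpu_params` has more parameters and specializations.

```cpp
#include <cstddef>
#include <type_traits>

// Mock types for illustration only; not the real SYCL matrix headers.
enum class use { a, b, accumulator };
enum class layout { row_major, col_major, dynamic };

template <typename T, use Use, std::size_t Rows, std::size_t Cols,
          layout Layout, typename Group>
struct joint_matrix {};

// Hypothetical, simplified query struct: the layouts of A and B are template
// parameters instead of being hard-coded to row_major.
template <typename Ta, typename Tb, typename Tc, layout LayoutA,
          layout LayoutB, std::size_t M = 8, std::size_t N = 16,
          std::size_t K = 32>
struct tpu_params_sketch {
  template <typename Group>
  using joint_matrix_a = joint_matrix<Ta, use::a, M, K, LayoutA, Group>;
  template <typename Group>
  using joint_matrix_b = joint_matrix<Tb, use::b, K, N, LayoutB, Group>;
  // The accumulator always uses layout::dynamic.
  template <typename Group>
  using joint_matrix_c =
      joint_matrix<Tc, use::accumulator, M, N, layout::dynamic, Group>;
};

struct mock_sub_group {}; // stand-in for the real sub_group type
```

A caller would then pick the layouts once, at the query site, and get matching matrix types for A, B, and C.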

dkhaldi · 2022-11-21T20:37:10Z

sycl/include/sycl/ext/oneapi/matrix/static-query-use.hpp

  template <typename Group>
  using joint_matrix_b =
-      joint_matrix<Tb, defaultK, defaultN, use::b, layout::packed_b, Group>;
+      joint_matrix<Tb, use::b, defaultK, defaultN, layout::packed, Group>;


Please take a look at #7307: packed should now be part of an Intel-specific namespace.
namespace sycl::ext::intel::experimental::matrix {
enum class layout {
packed
};
}

dkhaldi · 2022-11-30T14:26:28Z

sycl/test/matrix/matrix-bfloat16-test-use.cpp

@@ -60,13 +60,13 @@ void matrix_multiply(big_matrix<T1, NUM_ROWS_C, NUM_COLS_C> &C,
           const auto sg_starty = global_idy - spmd_item.get_local_id(1);

           sycl::ext::oneapi::sub_group sg = spmd_item.get_sub_group();
-           joint_matrix<bfloat16, TM, TK, use::a> sub_a(sg);
+           joint_matrix<bfloat16, use::a, TM, TK, layout::row_major> sub_a(sg);


remove "use" from the name of the test

dkhaldi · 2022-11-30T14:30:50Z

sycl/include/sycl/ext/oneapi/matrix/matrix.hpp

@@ -23,7 +23,7 @@
 #include <sycl/ext/oneapi/matrix/static-query.hpp>
 #endif // SYCL_EXT_ONEAPI_MATRIX_VERSION
 #if (SYCL_EXT_ONEAPI_MATRIX_VERSION == 2)
-#include <sycl/ext/oneapi/matrix/matrix-jit-use.hpp>
+#include <sycl/ext/oneapi/matrix/matrix-unified.hpp>


remove matrix-use.hpp

dkhaldi · 2022-12-02T16:29:48Z

sycl/include/sycl/ext/oneapi/matrix/static-query-use.hpp

  template <typename Group>
-  using joint_matrix_c = joint_matrix<Tc, defaultM, defaultN, use::accumulator,
+  using joint_matrix_c = joint_matrix<Tc, use::accumulator, defaultM, defaultN,


Please rebase this patch to pick up the changes in #6981, which has been merged.
They are mainly name changes, so no big deal.

dkhaldi · 2022-12-02T16:32:14Z

sycl/include/sycl/ext/oneapi/matrix/static-query-use.hpp

@@ -206,12 +206,12 @@ struct tpu_params<

  template <typename Group>
  using joint_matrix_a =
-      joint_matrix<Ta, defaultM, defaultK, use::a, layout::row_major, Group>;
+      joint_matrix<Ta, use::a, defaultM, defaultK, layout::row_major, Group>;


tpu_params should now take layout A and layout B as template arguments so we can pass them here.

dkhaldi · 2022-12-08T15:48:10Z

sycl/test/matrix/matrix-bfloat16-test-use.cpp

@@ -60,13 +60,19 @@ void matrix_multiply(big_matrix<T1, NUM_ROWS_C, NUM_COLS_C> &C,
           const auto sg_starty = global_idy - spmd_item.get_local_id(1);

           sycl::ext::oneapi::sub_group sg = spmd_item.get_sub_group();


sub_group should be part of the sycl namespace, so there is no need for sycl::ext::oneapi.
Remove "use" from the names of the tests.

dkhaldi · 2022-12-08T15:48:25Z

sycl/test/matrix/matrix-bfloat16-test-use.cpp

@@ -60,13 +60,19 @@ void matrix_multiply(big_matrix<T1, NUM_ROWS_C, NUM_COLS_C> &C,
           const auto sg_starty = global_idy - spmd_item.get_local_id(1);

           sycl::ext::oneapi::sub_group sg = spmd_item.get_sub_group();
-           joint_matrix<bfloat16, TM, TK, use::a> sub_a(sg);
+           joint_matrix<sycl::ext::oneapi::sub_group, bfloat16, use::a, TM, TK,


replace with sub_group

dkhaldi · 2022-12-08T16:07:15Z

sycl/include/sycl/ext/oneapi/matrix/static-query-use.hpp

@@ -152,13 +152,14 @@ struct tpu_params<tpu::amx, Ta, Tb, Tc, 0, 0, 0,

  template <typename Group>
  using joint_matrix_a =
-      joint_matrix<Ta, defaultM, defaultK, use::a, layout::row_major, Group>;
+      joint_matrix<Group, Ta, use::a, defaultM, defaultK, layout::row_major>;


This should take the layout as an argument; we should not assume only row-major layout.

dkhaldi · 2022-12-13T14:14:52Z

sycl/include/sycl/ext/oneapi/matrix/matrix.hpp

@@ -23,7 +23,7 @@
 #include <sycl/ext/oneapi/matrix/static-query.hpp>
 #endif // SYCL_EXT_ONEAPI_MATRIX_VERSION
 #if (SYCL_EXT_ONEAPI_MATRIX_VERSION == 2)


leave it as 4 so it does not break users' code written against Jack's implementation
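
The version switch under discussion can be sketched as a plain preprocessor dispatch. The `matrix_api` enum and `selected_api` constant are hypothetical names used only to make the selection observable, and the default of 4 is an assumption of this sketch; in the tests the macro is passed with `-DSYCL_EXT_ONEAPI_MATRIX_VERSION=...`.

```cpp
// Default chosen only for this sketch; real builds pass -D on the command line.
#ifndef SYCL_EXT_ONEAPI_MATRIX_VERSION
#define SYCL_EXT_ONEAPI_MATRIX_VERSION 4
#endif

// Hypothetical tag that stands in for "which header got included".
enum class matrix_api { tensorcore, unified, other };

#if (SYCL_EXT_ONEAPI_MATRIX_VERSION == 3)
constexpr matrix_api selected_api = matrix_api::tensorcore; // matrix-tensorcore.hpp
#elif (SYCL_EXT_ONEAPI_MATRIX_VERSION == 4)
constexpr matrix_api selected_api = matrix_api::unified; // unified interface
#else
constexpr matrix_api selected_api = matrix_api::other; // older implementations
#endif
```

Keeping the unified interface on version 4 means code that already compiles with that value keeps selecting the same API.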

yubingex007-a11y · 2022-12-14T11:22:04Z

/verify with intel/llvm-test-suite#1334

yubingex007-a11y · 2022-12-14T19:21:40Z

I am taking care of the testing; my changes do not break the CUDA tests in intel/llvm-test-suite#1334.
Note to self: "/verify xxx" does not work for the CUDA test cases, so we verify those on a local machine.

bader

Please, remove a.out.

yubingex007-a11y · 2022-12-15T06:07:14Z

Please, remove a.out.

😂 sorry i should have noticed it.

yubingex007-a11y · 2022-12-15T07:22:45Z

ping? @dkhaldi @steffenlarsen @hdelan

yubingex007-a11y · 2022-12-15T08:15:32Z

/verify with intel/llvm-test-suite#1391

dkhaldi · 2022-12-15T14:28:03Z

sycl/include/sycl/ext/oneapi/matrix/matrix.hpp

@@ -24,11 +24,11 @@
 #endif // SYCL_EXT_ONEAPI_MATRIX_VERSION
 #if (SYCL_EXT_ONEAPI_MATRIX_VERSION == 2)


remove matrix-jit-use. It no longer works because we removed the SPIRV and codegen support for it.

dkhaldi · 2022-12-15T14:31:38Z

sycl/include/sycl/ext/oneapi/matrix/matrix-unified-utils.hpp

+
+enum class use { a, b, accumulator };
+
+enum class layout { row_major = 0, col_major = 1, dynamic = 3 };


what was the reason behind this change:
enum class layout { row_major, col_major, dynamic };
to
enum class layout { row_major = 0, col_major = 1, dynamic = 3 };
?

because layout::packed carries the value 2
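
The numbering can be illustrated with mock namespaces. This is a sketch under stated assumptions: `oneapi_matrix`, `intel_matrix`, and `layout_value` are hypothetical names, and expressing `packed` as a cast of the value 2 is an assumption about how the Intel-specific constant is defined.

```cpp
// Mock namespaces for illustration; the real ones live under sycl::ext.
namespace oneapi_matrix {
// The value 2 is deliberately left unused by the portable enumerators.
enum class layout { row_major = 0, col_major = 1, dynamic = 3 };
} // namespace oneapi_matrix

namespace intel_matrix {
namespace layout {
// Assumption: packed is expressed as the otherwise unused value 2.
constexpr oneapi_matrix::layout packed =
    static_cast<oneapi_matrix::layout>(2);
} // namespace layout
} // namespace intel_matrix

// Helper to read the numeric value behind a layout.
constexpr int layout_value(oneapi_matrix::layout l) {
  return static_cast<int>(l);
}
```

Because the enum has a fixed underlying type, the cast is well defined, and `packed` can never collide with `dynamic` as long as `dynamic` stays at 3.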

dkhaldi · 2022-12-15T14:35:03Z

sycl/test/matrix/matrix-bfloat16-test-use.cpp

@@ -1,4 +1,4 @@
-// RUN: %clangxx -fsycl -O2 -DSYCL_EXT_ONEAPI_MATRIX_VERSION=2 %s -o %t.out
+// RUN: %clangxx -fsycl -O2 -DSYCL_EXT_ONEAPI_MATRIX_VERSION=4 %s -o %t.out


remove "-use" from the name of the tests that use the unified API

hdelan · 2022-12-15T16:20:41Z

sycl/include/sycl/ext/oneapi/matrix/matrix-intel.hpp

+__SYCL_INLINE_VER_NAMESPACE(_V1) {
+namespace ext {
+namespace intel::experimental::matrix::layout {
+constexpr sycl::ext::oneapi::experimental::matrix::layout packed =


Why are we introducing the layout in a new namespace here?

@hdelan, I see you are not in sync with the changes we made ;) to make this a unified API: write one code, run on Intel AMX, Intel XMX and Nvidia Tensor Cores.
The PR for documentation is in #7307

The basic idea is that anything that is Intel specific (like packed which is the VNNI layout) should go to a new Intel extension with a new namespace.

Correct I am not fully up to date with joint matrix atm! OK thanks for explanation, that makes sense to me.

dkhaldi

LGTM

yubingex007-a11y · 2022-12-16T15:58:45Z

Ping? @intel/llvm-reviewers-runtime

steffenlarsen

Tiny nit, but LGTM.

steffenlarsen · 2022-12-16T16:03:52Z

sycl/include/sycl/ext/oneapi/matrix/matrix-unified.hpp

+#endif // defined(__SYCL_DEVICE_ONLY__)
+#endif


Suggested change

-#endif // defined(__SYCL_DEVICE_ONLY__)
-#endif
+#endif // defined(__NVPTX__)
+#endif // defined(__SYCL_DEVICE_ONLY__)

yubingex007-a11y · 2022-12-16T16:11:30Z

would you please help merge it? @steffenlarsen

[SYCL][INTEL] Implementation of matrix ext using new unified interface

5451a62

yubingex007-a11y requested review from dkhaldi, MrSidims and JackAKirk November 16, 2022 08:32

yubingex007-a11y requested a review from a team as a code owner November 16, 2022 08:32

yubingex007-a11y requested a review from sergey-semenov November 16, 2022 08:32

JackAKirk reviewed Nov 16, 2022

JackAKirk requested changes Nov 16, 2022

dkhaldi reviewed Nov 16, 2022

dkhaldi mentioned this pull request Nov 17, 2022

Add matrix tests that use the new API (unified API) intel/llvm-test-suite#1391

Merged

yubingex007-a11y added 4 commits November 18, 2022 17:32

Merge remote-tracking branch 'intel_llvm/sycl' into matrix-unified-intel

7cca50d

address comments

8dd77e9

removing dynamic when defining sub_c in testcases

76cc0de

remove unnecesary from enum class use

bbf05ed

dkhaldi reviewed Nov 21, 2022

JackAKirk mentioned this pull request Nov 28, 2022

[SYCL][CUDA] Implementation of matrix ext using new "unified" interface #7077

Merged

dkhaldi reviewed Nov 30, 2022

dkhaldi reviewed Dec 2, 2022

move Group to the start of joint_matrix's parameter list, make get_wi…

aaf2baf

…_data a free function and move packed into intel namespace

dkhaldi reviewed Dec 8, 2022

dkhaldi reviewed Dec 13, 2022

Merge remote-tracking branch 'intel_llvm/sycl' into matrix-unified-intel

46ec9ad

yubingex007-a11y requested review from AerialMantis, JackAKirk and dkhaldi December 14, 2022 07:25

yubingex007-a11y added 2 commits December 14, 2022 16:03

fix lint's issue

2a0fd7c

fix lint's issue

99c428f

yubingex007-a11y requested a review from hdelan December 14, 2022 16:03

move wi_data unified again

afd4373

yubingex007-a11y requested a review from bader as a code owner December 14, 2022 19:14

bader reviewed Dec 14, 2022

fix some smal issues

19923b9

yubingex007-a11y requested a review from steffenlarsen December 15, 2022 07:22

dkhaldi requested changes Dec 15, 2022

dkhaldi approved these changes Dec 15, 2022

hdelan reviewed Dec 15, 2022

address dounia's comments

b8edc68

dkhaldi approved these changes Dec 16, 2022

steffenlarsen approved these changes Dec 16, 2022

steffenlarsen changed the title ~~[SYCL][INTEL] Implementation of matrix ext using new unified interface~~ [SYCL] Implement matrix extension using new unified interface Dec 16, 2022

steffenlarsen merged commit f4a9ef1 into intel:sycl Dec 16, 2022

JackAKirk mentioned this pull request Jan 5, 2023

[SYCL][NFC] Cleanup wi_data/joint_matrix code #7929

Merged

yubingex007-a11y mentioned this pull request Oct 10, 2023

[SYCL][Matrix] syntax changes as preparation before moving joint matrix from experimental namespace #11215

Merged
