Skip to content

[SYCL][Joint Matrix Spec] Add new API for out of bounds fill/load/store #11172

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Mar 19, 2024
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -198,6 +198,67 @@ joint_matrix_apply(sg, A, [=](T &val, size_t row, size_t col) {
sum_local_rows[row] += val;
});
```

=== Load/Store/Fill With Out-of-Bounds Checks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand how this API can work. We might need to schedule a meeting where we can draw some pictures.

Wouldn't it make more sense to require the user to set Width and Height according to the remaining amount of memory in the "big matrix"? That way, the implementation simply checks to see if the template parameter Rows is greater than Height and if the template parameter Cols is greater than Width. If they are both less, the operations behave like normal loads and stores. If they are less, then the load sets the out-of-range elements to zero and the store ignores the out-of-range elements.

From the description, though, it doesn't seem like the APIs work this way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check is done compared to the big matrix size comparing the address to be loaded (ptr) with the big matrix size (Height). If ptr > Height, the element should not be loaded.
if the remainder is passed, how is the implementation going to compare with it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With your recent change that adds coordX and coordY parameters, I see how this could work for the load and store APIs. However, I still don't see how the fill API can work.

This API structure seems confusing to me, though, and probably less efficient. I suspect users will get confused that the src pointer to joint_matrix_load is completely different from the src_base pointer to joint_matrix_load_checked. It seems like it would be less confusing and more efficient to have the user pass values for Width and Height that are the remaining amount of memory in the big matrix. Maybe this would be easier to explain in a meeting where we can draw pictures?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's discuss fill and remainder approach in a meeting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I unresolve this comment as there was no follow up and I probably has missed the meeting? :) I also don't understand, why we need coordinates for fill

`joint_matrix_load`, `joint_matrix_store`, and `joint_matrix_fill`
operations do not do bounds checking. When the global matrix size does
not multiply `Rows` and `Cols` of the joint matrix, this extension
adds a new form of API to load, store, and fill joint matrix while
checking the bounds. For load and fill, the out-of-bounds elements are
set to 0. For the store, they are dropped out.

These functions are similar to the existing ones but they
take extra arguments to determine the global bounds `Height` and `Width` of the
two-dimensional region to load to or store from.

```c++
namespace sycl::ext::intel::experimental::matrix {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fda0 does this change address the missing arguments to generate efficiently 2d load/store in IGC?

Copy link
Contributor Author

@dkhaldi dkhaldi Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fda0 Yes I changed it to add the two offset coordinates separately and excluded them from the base pointer calculation.

template <typename Group, typename T, size_t Rows, size_t Cols,
use Use, layout Layout, typename Tv>
void joint_matrix_fill_checked(Group g, joint_matrix<Group, T, Use, Rows,
Cols, Layout> &m, Tv v, size_t Height, size_t Width);

// Only available when std::is_same_v<T1, std::remove_const_t<T2>>
template <typename Group, typename T1, typename T2,
size_t Rows, size_t Cols,
access::address_space Space, access::decorated IsDecorated>
void joint_matrix_load_checked(Group g,
joint_matrix<Group, T1, use::accumulator, Rows, Cols, layout::dynamic> &res,
multi_ptr<T2, Space, IsDecorated> src, size_t stride,
layout Layout, size_t Height, size_t Width);

// Only available when Layout != layout::dynamic
// and when std::is_same_v<T1, std::remove_const_t<T2>>
template <typename Group, typename T1, typename T2,
size_t Rows, size_t Cols,
use Use, layout Layout,
access::address_space Space, access::decorated IsDecorated>
void joint_matrix_load_checked(Group g,
joint_matrix<Group, T1, Use, Rows, Cols, Layout> &res,
multi_ptr<T2, Space, IsDecorated> src, size_t stride,
size_t Height, size_t Width);

template <typename Group, typename T, size_t Rows, size_t Cols,
access::address_space Space, access::decorated IsDecorated>
void joint_matrix_store_checked(Group g,
const joint_matrix<Group, T, use::accumulator, Rows, Cols, layout::dynamic> &res,
multi_ptr<T, Space, IsDecorated> dest, size_t stride, layout Layout,
size_t Height, size_t Width);

// Only available when Layout != layout::dynamic
template <typename Group, typename T, size_t Rows, size_t Cols,
use Use, layout Layout, access::address_space Space,
access::decorated IsDecorated>
void joint_matrix_store_checked(Group g,
const joint_matrix<Group, T, Use, Rows, Cols, Layout> &res,
multi_ptr<T, Space, IsDecorated> src, size_t stride,
size_t Height, size_t Width);

} // namespace sycl::ext::intel::experimental::matrix
```


=== New Device Information Descriptor
Besides the query we provide in
link:sycl_ext_oneapi_matrix.asciidoc[sycl_ext_oneapi_matrix],
Expand Down