Skip to content

[SYCL][Graph] Finalize Graph for single device #11565

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 19, 2023

Conversation

EwanC
Copy link
Contributor

@EwanC EwanC commented Oct 17, 2023

Only finalize a modifiable command_graph for the SYCL device that was passed on construction. We
currently finalize for every device in the context, which is unnecessary overhead and may lead to
unintended interactions with buffers used in these contexts.

PR also includes code to start waiting on for deps when adding commands to a command graph. This prevents issues where allocation commands with dependent copies could be delayed due to device being busy and execute in an incorrect order with regards to future command graph executions.

Only finalize a modifiable `command_graph` for the
SYCL device that was passed on construction. We
currently finalize for every device in the context,
which is unnecessary overhead and may lead to
unintended interactions with buffers used in these
contexts.
@EwanC EwanC temporarily deployed to WindowsCILock October 17, 2023 15:36 — with GitHub Actions Inactive
@EwanC EwanC marked this pull request as ready for review October 17, 2023 15:40
@EwanC EwanC requested a review from a team as a code owner October 17, 2023 15:40
- Commands which enqueue to a command graph now correctly respect their dependencies by waiting on them.
- Prevents issues where allocation commands with dependent copies could be delayed due to device being busy and execute in an incorrect order with regards to future command graph executions.
@EwanC EwanC requested a review from a team as a code owner October 17, 2023 15:47
@EwanC EwanC requested a review from steffenlarsen October 17, 2023 15:47
@EwanC EwanC temporarily deployed to WindowsCILock October 17, 2023 15:49 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock October 17, 2023 16:09 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock October 18, 2023 08:02 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock October 18, 2023 08:21 — with GitHub Actions Inactive
Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@EwanC EwanC temporarily deployed to WindowsCILock October 18, 2023 15:04 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock October 18, 2023 15:31 — with GitHub Actions Inactive
Copy link
Contributor

@reble reble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@EwanC EwanC temporarily deployed to WindowsCILock October 19, 2023 06:31 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock October 19, 2023 06:50 — with GitHub Actions Inactive
@EwanC
Copy link
Contributor Author

EwanC commented Oct 19, 2023

@intel/llvm-gatekeepers Could we merge this please

@steffenlarsen steffenlarsen merged commit 01310ee into intel:sycl Oct 19, 2023
@EwanC
Copy link
Contributor Author

EwanC commented Oct 19, 2023

Based on post-commit CI https://github.com/intel/llvm/actions/runs/6572360130/job/17855217912 it looks like this fixes some of the command-buffer tests on Arc disabled in #11434

Unexpectedly Passed Tests (13):
  SYCL :: Graph/Explicit/basic_buffer.cpp
  SYCL :: Graph/Explicit/buffer_copy_host2target.cpp
  SYCL :: Graph/Explicit/buffer_copy_host2target_2d.cpp
  SYCL :: Graph/Explicit/buffer_copy_host2target_offset.cpp
  SYCL :: Graph/Explicit/temp_buffer_reinterpret.cpp
  SYCL :: Graph/Explicit/usm_copy.cpp
  SYCL :: Graph/RecordReplay/basic_buffer.cpp
  SYCL :: Graph/RecordReplay/buffer_copy_host2target.cpp
  SYCL :: Graph/RecordReplay/buffer_copy_host2target_2d.cpp
  SYCL :: Graph/RecordReplay/buffer_copy_host2target_offset.cpp
  SYCL :: Graph/RecordReplay/temp_buffer_reinterpret.cpp
  SYCL :: Graph/RecordReplay/usm_copy.cpp
  SYCL :: Graph/RecordReplay/usm_copy_in_order.cpp

However, this isn't all the tests that are marked xfail for this platform. Will see if the next post-commit, produces an identical fail list or not. Then will either create a PR to remove XFAIL for this platform if test pass/fail is consistent, or skip the tests rather than xfail if inconsistent.

EDIT: xpass tests look consistent, so re-enabling these tests in #11602

@EwanC EwanC deleted the ewan/graph_finalize_for_single_device branch October 19, 2023 12:43
EwanC added a commit to reble/llvm that referenced this pull request Oct 19, 2023
The following 13 graphs tests now pass on the Arc
post commit testing after the merge of intel#11565

```
Unexpectedly Passed Tests (13):
  SYCL :: Graph/Explicit/basic_buffer.cpp
  SYCL :: Graph/Explicit/buffer_copy_host2target.cpp
  SYCL :: Graph/Explicit/buffer_copy_host2target_2d.cpp
  SYCL :: Graph/Explicit/buffer_copy_host2target_offset.cpp
  SYCL :: Graph/Explicit/temp_buffer_reinterpret.cpp
  SYCL :: Graph/Explicit/usm_copy.cpp
  SYCL :: Graph/RecordReplay/basic_buffer.cpp
  SYCL :: Graph/RecordReplay/buffer_copy_host2target.cpp
  SYCL :: Graph/RecordReplay/buffer_copy_host2target_2d.cpp
  SYCL :: Graph/RecordReplay/buffer_copy_host2target_offset.cpp
  SYCL :: Graph/RecordReplay/temp_buffer_reinterpret.cpp
  SYCL :: Graph/RecordReplay/usm_copy.cpp
  SYCL :: Graph/RecordReplay/usm_copy_in_order.cpp
```

There are some Graphs E2E tests which are still marked XFAIL on
Arc, but the same subset failed(via XPASS) in two post-commit runs:

* https://github.com/intel/llvm/actions/runs/6576278057/job/17865822384
* https://github.com/intel/llvm/actions/runs/6572360130/job/17855217912
againull pushed a commit that referenced this pull request Oct 19, 2023
The following 13 graphs tests originally disabled by
#11434 now pass on the Arc post
commit testing after the merge of
#11565

```
Unexpectedly Passed Tests (13):
  SYCL :: Graph/Explicit/basic_buffer.cpp
  SYCL :: Graph/Explicit/buffer_copy_host2target.cpp
  SYCL :: Graph/Explicit/buffer_copy_host2target_2d.cpp
  SYCL :: Graph/Explicit/buffer_copy_host2target_offset.cpp
  SYCL :: Graph/Explicit/temp_buffer_reinterpret.cpp
  SYCL :: Graph/Explicit/usm_copy.cpp
  SYCL :: Graph/RecordReplay/basic_buffer.cpp
  SYCL :: Graph/RecordReplay/buffer_copy_host2target.cpp
  SYCL :: Graph/RecordReplay/buffer_copy_host2target_2d.cpp
  SYCL :: Graph/RecordReplay/buffer_copy_host2target_offset.cpp
  SYCL :: Graph/RecordReplay/temp_buffer_reinterpret.cpp
  SYCL :: Graph/RecordReplay/usm_copy.cpp
  SYCL :: Graph/RecordReplay/usm_copy_in_order.cpp
```

There are some Graphs E2E tests which are still marked XFAIL on Arc, but
the same subset failed(via XPASS) in two post-commit runs:

* https://github.com/intel/llvm/actions/runs/6576278057/job/17865822384
* https://github.com/intel/llvm/actions/runs/6572360130/job/17855217912
againull pushed a commit that referenced this pull request Oct 20, 2023
PR #11602 renabled some Graph test-e2e tests which started passing on
Arc in post-commit CI after the merge of
#11565. However, these passes are not
guaranteed to be deterministic. See post-commit
https://github.com/intel/llvm/actions/runs/6580530144/job/17879085527
run where `Graph/RecordReplay/buffer_copy_host2target_offset.cpp` failed
after being renabled.

Mark the tests reenabled by #11602 as unsupported to keep post-commit CI
clean by skipping the tests on Arc, until support for the Arc target in
the Graph tests can be investigated properly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants