Skip to content

[SYCL] Static linking support in Level Zero plugin #5266

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 7, 2022

Conversation

gmlueck
Copy link
Contributor

@gmlueck gmlueck commented Jan 6, 2022

The Level Zero driver now has "static linking" support, the ability to
create a single Level Zero module from a list of SPIR-V modules. Use
this feature to implement piProgramLink, replacing the previous
implementation that relied on "dynamic linking" of Level Zero modules.
There are two main improvements:

  • Static linking provides more optimizations because the compiler sees
    the SPIR-V for all modules, which enables cross-module optimizations.

  • The previous implementation was mostly using mock driver APIs, and it
    didn't work in the general case when piProgramLink tried to link
    more than one input module together.

This is mostly an infrastructure improvement, which does not provide
any new features or bug fixes. Since there is no Level Zero interop
API that can create a kernel_bundle in object or input state,
there is no need currently to link multiple Level Zero modules together
with piProgramLink.

However, this commit provides the infrastructure we need to enable
online linking of the "device library" into device code. That feature
will be enabled in a future commit.

Note that a temporary environment variable has also been added
(SYCL_FORCE_LINK=[0|1]), which was added to help test this change.
Defining this variable to 1 forces the runtime to call
piProgramCompile / piProgramLink instead of piProgramBuild even
if the program is created from a single SPIR-V module. This is usually
safe, but it sometimes results in false errors, so only enable this
variable if you know what you are doing. We expect support for this
environment variable to removed soon.

The Level Zero driver now has "static linking" support, the ability to
create a single Level Zero module from a list of SPIR-V modules.  Use
this feature to implement `piProgramLink`, replacing the previous
implementation that relied on "dynamic linking" of Level Zero modules.
There are two main improvements:

* Static linking provides more optimizations because the compiler sees
  the SPIR-V for all modules, which enables cross-module optimizations.

* The previous implementation was mostly using mock driver APIs, and it
  didn't work in the general case when `piProgramLink` tried to link
  more than one input module together.

This is mostly an infrastructure improvement, which does not provide
any new features or bug fixes.  Since there is no Level Zero interop
API that can create a `kernel_bundle` in `object` or `input` state,
there is no need currently to link multiple Level Zero modules together
with `piProgramLink`.

However, this commit provides the infrastructure we need to enable
online linking of the "device library" into device code.  That feature
will be enabled in a future commit.
The previous commit adds new logic to the Level Zero implementation of
`piProgramLink`, but this logic is only used currently if the SYCL
application explicitly calls `sycl::link`.  When an application merely
executes a kernel, the SYCL runtime generally calls `piProgramBuild`
instead of `piProgramCompile` / `piProgramLink`.

In order to enable wider testing of the new `piProgramLink` logic, this
commit adds a temporary environment variable which forces the SYCL
runtime to build all programs by calling `piProgramCompile` /
`piProgramLink`.  This is usually safe, though there are a few cases
where it is unsafe and results in false errors.  Therefore, use this
environment variable with caution.

Usage is like:

```
SYCL_FORCE_LINK=[0|1]
```

When running tests with `llvm-lit`, you can set this environment
variable by passing `--param extra_environment=SYCL_FORCE_LINK=1`:

```
$ llvm-lit --param extra_environment=SYCL_FORCE_LINK=1  \
  --param sycl_be=level_zero --param target_devices=gpu \
  -j12 -v -s <test>
```

Here are the known cases when specifying `SYCL_FORCE_LINK=1` results in
a false error:

* Tests using AOT (ahead-of-time compilation) generally fail because
  a `pi_program` that is created from native code cannot be compiled
  with `piProgramCompile`.

* Some tests enable tracing with `SYCL_PI_TRACE` and explicitly check
  for a call to `piProgramBuild`.  These tests fail because the runtime
  calls `piProgramCompile` / `piProgramLink` instead of calling
  `piProgramBuild`.
@gmlueck gmlueck requested review from a team as code owners January 6, 2022 18:06
@gmlueck
Copy link
Contributor Author

gmlueck commented Jan 6, 2022

@againull: @smaslov-intel thought you would be interested in reviewing the thread locking logic in this PR. See the use of _pi_program::Mutex, which protects state within the _pi_program object.

Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@againull : please give your explicit OK for program mutex usage here

Copy link
Contributor

@againull againull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of _pi_program::Mutex looks good, thank you!

@againull againull merged commit c690ac8 into intel:sycl Jan 7, 2022
@@ -963,8 +963,11 @@ ProgramManager::ProgramPtr ProgramManager::build(
DeviceLibReqMask);
}

static const char *ForceLinkEnv = std::getenv("SYCL_FORCE_LINK");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this environment variable to help the device driver team reproduce two bugs that I reported. Since I expect to remove the environment variable after those bugs are fixed, I didn't bother adding documentation. If people want, I can add documentation now, but we will just remove it again once the bugs are fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am OK to add this to experimental section

@gmlueck gmlueck deleted the gmlueck/l0-static-linking3-a branch January 18, 2022 22:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants