Skip to content

[SYCL][libdevice] Get rid of builtins for memcpy memset in libdevice #4919

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 24, 2021

Conversation

jinge90
Copy link
Contributor

@jinge90 jinge90 commented Nov 9, 2021

Previously, memcpy and memset are implemented in libdevice by calling
to __builtin_memcpy and __builtin_memset. Such code will lead to
infinite loop if underying CPU runtime implements memcpy/set builtin by
calling memcpy, memset. The function call chain is following:
In libdevice:
memcpy(dest, src, n) {
return __devicelib_memcpy(dest, src, n);
}

__devicelib_memcpy(dest, src, n) {
return __builtin_memcpy(dest, src, n);
}

In CPU runtime:
Handing of __builtin_memcpy<-------------------|
| |
| |
|---->memcpy------>__devicelib_memcpy
In order to fix this, we have to provide implementation for memcpy/set
without using _builtin*.

@jinge90 jinge90 requested a review from vzakhari as a code owner November 9, 2021 07:26
@jinge90 jinge90 changed the title [SYCL][libdevice] Get rid of builtins for memcpy memset in libdevice [SYCL][libdevice][WIP] Get rid of builtins for memcpy memset in libdevice Nov 9, 2021
@jinge90
Copy link
Contributor Author

jinge90 commented Nov 9, 2021

The patch is done but we need more extensive testing for it, we are working on enhance tests for libdevice memcpy/set using USM: intel/llvm-test-suite#558

Thanks very much.

Previously, memcpy and memset are implemented in libdevice by calling
to __builtin_memcpy and __builtin_memset. Such code will lead to
infinite loop if underying CPU runtime implements memcpy/set builtin by
calling memcpy, memset. The function call chain is following:
In libdevice:
memcpy(dest, src, n) {
  return __devicelib_memcpy(dest, src, n);
}

__devicelib_memcpy(dest, src, n) {
  return __builtin_memcpy(dest, src, n);
}

In CPU runtime:
Handing of __builtin_memcpy<--------------------|
                  |				|
		  |				|
		  |---->memcpy------>__devicelib_memcpy
In order to fix this, we have to provide implementation for memcpy/set
without using __builtin_*.
@jinge90
Copy link
Contributor Author

jinge90 commented Nov 10, 2021

/summary:run

@jinge90 jinge90 changed the title [SYCL][libdevice][WIP] Get rid of builtins for memcpy memset in libdevice [SYCL][libdevice] Get rid of builtins for memcpy memset in libdevice Nov 16, 2021
@jinge90
Copy link
Contributor Author

jinge90 commented Nov 16, 2021

Hi, @vzakhari
Could you help review this patch?
Thanks very much.

@jinge90
Copy link
Contributor Author

jinge90 commented Nov 17, 2021

/summary:run

Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few style changes required to align with the project coding guidelines.

@jinge90 jinge90 requested a review from bader November 17, 2021 06:55
@jinge90
Copy link
Contributor Author

jinge90 commented Nov 23, 2021

Hi, @vzakhari
Could you take a look?
Thanks very much.

@bader bader merged commit cb5e8ae into intel:sycl Nov 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants