Skip to content

[SYCL] Fixed bug reported by a static verifier #10452

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Aug 24, 2023

Conversation

eunakim0103
Copy link
Contributor

@eunakim0103 eunakim0103 commented Jul 18, 2023

The static verifier tool reported following three issues and fixed with adding a copy constructor & assignment operator, and removing dead code.

Logically dead code
Missing assignment operator
Copy without assign

@eunakim0103 eunakim0103 requested a review from a team as a code owner July 18, 2023 11:53
@eunakim0103 eunakim0103 temporarily deployed to aws July 18, 2023 12:24 — with GitHub Actions Inactive
@eunakim0103 eunakim0103 temporarily deployed to aws July 18, 2023 13:15 — with GitHub Actions Inactive
@dm-vodopyanov dm-vodopyanov changed the title [SYCL] Coverity bug fix - corrected [SYCL] Coverity bug fixes Jul 18, 2023
@eunakim0103 eunakim0103 temporarily deployed to aws July 18, 2023 14:09 — with GitHub Actions Inactive
@eunakim0103 eunakim0103 temporarily deployed to aws July 18, 2023 15:40 — with GitHub Actions Inactive
Comment on lines 96 to 101
ProxyLoader(bool loaded, xpti_plugin_handle_t fw_plugin_handle)
: m_loaded(loaded), m_fw_plugin_handle(fw_plugin_handle) {}

ProxyLoader(const ProxyLoader &rhs)
: ProxyLoader(rhs.m_loaded, rhs.m_fw_plugin_handle) {}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need these two new constructors? Looks like they are not used in the code

Copy link
Contributor Author

@eunakim0103 eunakim0103 Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right -- It should be as below as other missing assignment operator Coverity violation. Will fix.
ProxyLoader&operator=(const ProxyLoader&) = delete;

@@ -70,6 +70,20 @@ bool CurrentCodeLocationValid() {
struct DemangleHandle {
char *p;
DemangleHandle(char *ptr) : p(ptr) {}

DemangleHandle &operator=(const DemangleHandle &rhs) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here, we don't have any assignment in the code, shouldn't it be deleted (= delete)?

Copy link
Contributor Author

@eunakim0103 eunakim0103 Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as above:
2023850 Missing assignment operator
This class, which frees resources in its destructor, does not have a user-written copy assignment operator, but is copy-assigned. Unless the resource is managed separately from the copy operation, this will cause use-after-free errors.

Class that owns resources lacks a user-written assignment operator

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't answer the question why we can't use = delete; instead.

@@ -221,6 +221,18 @@ class LeavesCollection {
(!MGenericIsActive && MHACIt != Rhs.MHACIt));
}

IteratorT &operator=(const IteratorT<IsConst> &Rhs) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this copy assignment operator is not used too in the code

@@ -184,9 +184,6 @@ program_impl::program_impl(ContextImplPtr Context,
OptionsVector.data(), nullptr);
std::string Options(OptionsVector.begin(), OptionsVector.end());
switch (BinaryType) {
case PI_PROGRAM_BINARY_TYPE_NONE:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good diagnostic to find errors during some code refactoring, now it is removed, so suggest adding the default case and add assert(false) in it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion but basically this "Logically dead code" is because there is code to check if BinaryType is PI_PROGRAM_BINARY_TYPE_NONE and throw an exception in line 173. Coverity reported case PI_PROGRAM_BINARY_TYPE_NONE is redundancy and logically dead code.

173 if (BinaryType == PI_PROGRAM_BINARY_TYPE_NONE) {
174 throw invalid_object_error(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still it is helpful for us during the refactoring if the exception is removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a default case as advised

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe do

switch () {
case 1:
 ...
 return;
case 2:
case 3:
  ...
  return;
}
assert(false) or __builtin_unreachable();

instead?

Copy link
Contributor Author

@eunakim0103 eunakim0103 Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree logically. I need to check the BinaryType values if there is any cases it shouldn't return in the any other part of code since the original code doesn't have default case and return. It may cause behavior changes when BinaryType is neither PI_PROGRAM_BINARY_TYPE_NONE nor when other values are not covered in the case selections if we add assert or return. I will investigate and update shortly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no other values of BinaryType which are not covered in switch case. I believe we can add assert(false) and return in the default case. Thanks!

Copy link
Contributor Author

@eunakim0103 eunakim0103 Jul 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems the BinaryType value is invalid and switch-default addition caught it. The BinaryType was 3 in Intel Gen9 system which did not happen with my testing machine.

  • This may be an uninitialized variable issue. I fixed it to be initialized with 0 (PI_PROGRAM_BINARY_TYPE_NONE) --> this is wrong. With the initialization it still appears 3.
  • Here's where that came from: UR_PROGRAM_BUILD_INFO_BINARY_TYPE = 3

cl_program_binary_type BinaryType;

typedef enum {
PI_PROGRAM_BINARY_TYPE_NONE = 0x0,
PI_PROGRAM_BINARY_TYPE_COMPILED_OBJECT = 0x1,
PI_PROGRAM_BINARY_TYPE_LIBRARY = 0x2,
PI_PROGRAM_BINARY_TYPE_EXECUTABLE = 0x4
} _pi_program_binary_type;


command stderr:
online_compiler_L0.cpp.tmp.out: /__w/llvm/llvm/src/sycl/source/detail/program_impl.cpp:200: sycl::_V1::detail::program_impl::program_impl(sycl::_V1::detail::ContextImplPtr, pi_native_handle, sycl::_V1::detail::pi::PiProgram): Assertion `false && "BinaryType is invalid."' failed.

@eunakim0103 eunakim0103 temporarily deployed to aws July 18, 2023 17:00 — with GitHub Actions Inactive
@eunakim0103 eunakim0103 temporarily deployed to aws July 18, 2023 17:57 — with GitHub Actions Inactive
@dm-vodopyanov
Copy link
Contributor

@aelovikov-intel is the failure in SYCL Pre Commit on Linux / Linux / e2e-tests (lin_intel, SYCL E2E on Intel CPU/GEN9 GPU, Linux, gen known?

stderr:
Warning: ONEAPI_DEVICE_SELECTOR environment variable is set to opencl:cpu.
To see the correct device id, please unset ONEAPI_DEVICE_SELECTOR.


lit.py: /__w/llvm/llvm/llvm/sycl/test-e2e/lit.cfg.py:413: note: SG sizes for opencl:cpu: 
6 errors, exiting.
FAILED: CMakeFiles/check-sycl-e2e /__w/llvm/llvm/build-e2e/CMakeFiles/check-sycl-e2e 
cd /__w/llvm/llvm/build-e2e && /usr/bin/python3.10 /__w/llvm/llvm/llvm/llvm/utils/lit/lit.py -sv .
ninja: build stopped: subcommand failed.

@aelovikov-intel
Copy link
Contributor

aelovikov-intel commented Jul 18, 2023

For HIP we had a faulty runner amdgpu-2 that has been rebooted. GEN9 failure is new to me. Regardless, we've switched to GEN12 earlier today, so I'd suggest to merge in origin/sycl.

@eunakim0103 eunakim0103 temporarily deployed to aws July 19, 2023 19:18 — with GitHub Actions Inactive
@eunakim0103 eunakim0103 temporarily deployed to aws July 19, 2023 19:57 — with GitHub Actions Inactive
@eunakim0103 eunakim0103 marked this pull request as draft July 19, 2023 20:02
@eunakim0103 eunakim0103 temporarily deployed to aws July 19, 2023 20:54 — with GitHub Actions Inactive
@eunakim0103 eunakim0103 temporarily deployed to aws July 19, 2023 21:35 — with GitHub Actions Inactive
@eunakim0103 eunakim0103 temporarily deployed to aws July 19, 2023 23:10 — with GitHub Actions Inactive
@eunakim0103 eunakim0103 temporarily deployed to aws July 20, 2023 00:05 — with GitHub Actions Inactive
Signed-off-by: Kim, Euna <[email protected]>
@eunakim0103 eunakim0103 temporarily deployed to aws July 20, 2023 01:13 — with GitHub Actions Inactive
@eunakim0103 eunakim0103 temporarily deployed to aws July 20, 2023 01:54 — with GitHub Actions Inactive
@eunakim0103 eunakim0103 marked this pull request as ready for review August 21, 2023 14:10
@@ -197,6 +194,13 @@ program_impl::program_impl(ContextImplPtr Context,
MState = program_state::linked;
MLinkOptions = "";
MBuildOptions = Options;
break;
default:
// TODO: assert() will be reverted when UR_PROGRAM_BINARY_TYPE_EXECUTABLE
Copy link
Contributor

@dm-vodopyanov dm-vodopyanov Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eunakim0103, @aelovikov-intel proposed another place for the assert (link), can you implement his option?

Fixed the break to return and changed the position of adding assert()
eunakim0103 and others added 2 commits August 23, 2023 09:38
Co-authored-by: Dmitry Vodopyanov <[email protected]>
Co-authored-by: Dmitry Vodopyanov <[email protected]>
@aelovikov-intel
Copy link
Contributor

eunakim0103 requested a review from aelovikov-intel 2 hours ago

I'm not closely familiar with this code and you already have an approval needed to merge. Please explain/fix the test failure and ping @intel/llvm-gatekeepers afterwards.

@eunakim0103
Copy link
Contributor Author

All test passed. This PR is ready for merge @intel/llvm-gatekeepers

@eunakim0103 eunakim0103 changed the title [SYCL] Coverity bug fixes [SYCL] Fixed bug reported by a static verifier Aug 24, 2023
@dm-vodopyanov
Copy link
Contributor

@eunakim0103 please don't hide flaky issues, if tests passed after the re-run, please still report them. It's also highly encouraged to submit issues on them in intel/llvm.
In this case, fail on Intel Gen12, issue #10955:

TIMEOUT: SYCL :: Basic/sycl_2020_images/host_task_sampled_image_read_linear.cpp (512 of 1621)
******************** TEST 'SYCL :: Basic/sycl_2020_images/host_task_sampled_image_read_linear.cpp' FAILED ********************
Script:
--
: 'RUN: at line 1';    /__w/llvm/llvm/toolchain/bin//clang++   -fsycl -fsycl-targets=spir64 /__w/llvm/llvm/llvm/sycl/test-e2e/Basic/sycl_2020_images/host_task_sampled_image_read_linear.cpp -o /__w/llvm/llvm/build-e2e/Basic/sycl_2020_images/Output/host_task_sampled_image_read_linear.cpp.tmp.out
: 'RUN: at line 3';    /__w/llvm/llvm/build-e2e/Basic/sycl_2020_images/Output/host_task_sampled_image_read_linear.cpp.tmp.out
--
Exit Code: -9
Timeout: Reached timeout of 600 seconds

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants