Skip to content

[SYCL][CUDA] Updated CUDA documentation #18

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

Ruyk
Copy link
Contributor

@Ruyk Ruyk commented Feb 28, 2020

Various updates to CUDA backend documentation:

  • CMake help strings
  • Get Started Guide
    - [ ] Basic doxygen generation

@Ruyk Ruyk requested a review from rodburns February 28, 2020 11:37
@Ruyk Ruyk self-assigned this Feb 28, 2020
Copy link

@Alexander-Johnston Alexander-Johnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much better. Just some minor changes

@fwyzard
Copy link

fwyzard commented Feb 28, 2020

I would suggest to add few things:

  • that if one builds an application for the CUDA backend, one must use choose a CUDA device, either explictly or through the SYCL_BE variable - otherwise the runtime will likely pick and OpenCL device, and fail to run (this is something I find particularly confusing);
  • that one can build an application for both the OpenCL and the CUDA back ends at the same time, using -fsycl-targets=nvptx64-*-*-sycldevice,spir64-*-*-sycldevice (or -fsycl-targets=nvptx64-unknown-unknown-sycldevice,spir64-unknown-unknown-sycldevice`, etc.);
  • that one can specify the CUDA device to target using -Xsycl-target-backend=nvptx64-*-*-sycldevice '--cuda-gpu-arch=sm_35'

Though, the last may be more relevant for a slightly more in-depth documentation that the "getting started" document.

@Ruyk
Copy link
Contributor Author

Ruyk commented Feb 28, 2020

Thanks for your suggestions (and all the issue reporting!)

I would suggest to add few things:

  • that if one builds an application for the CUDA backend, one must use choose a CUDA device, either explictly or through the SYCL_BE variable - otherwise the runtime will likely pick and OpenCL device, and fail to run (this is something I find particularly confusing);

That is an interesting one. How do you build an application for the CUDA backend? You can't really do that at the moment. The only thing you can do is to generate the kernel code for nvptx64 target and link against the libsycl.so, which will have both OpenCL and CUDA support. There is no connection between device images and backends in the SYCL RT. Once a plugin is loaded, all device images are passed to all backends. In the case of the OpenCL, it fails if the OpenCL platform used is not the NVidia one, but in theory, you can generate a PTX device image and use it on the NVidia OpenCL platform and it should work!
A possible solution is to add some metadata to the device images that ties them to a backend, but to make that usable we would have to somehow ban using PTX on the OpenCL backend.

This is an interesting discussion that we are having in Khronos SYCL WG as well. The SYCL generalization and the SYCL modules are trying to address this problems, but we are still working on them.

As for the documentation, maybe I can add that when building for an nvptx64 target, SYCL_BE=PI_CUDA should be used at runtime to ensure the OpenCL backend is not used?
The CUDA Device Selector that I added on the examples is another route.

  • that one can build an application for both the OpenCL and the CUDA back ends at the same time, using -fsycl-targets=nvptx64-*-*-sycldevice,spir64-*-*-sycldevice (or -fsycl-targets=nvptx64-unknown-unknown-sycldevice,spir64-unknown-unknown-sycldevice`, etc.);

I am reluctant to add this until intel/llvm#1194 is solved, otherwise it may cause further confusion.

  • that one can specify the CUDA device to target using -Xsycl-target-backend=nvptx64-*-*-sycldevice '--cuda-gpu-arch=sm_35'

Though, the last may be more relevant for a slightly more in-depth documentation that the "getting started" document.

I think I could add it, but I rather see a solution to intel/llvm#1170 first. @Naghasan and @Alexander-Johnston are looking into a number of clang driver issues, so this may come soon.

@fwyzard
Copy link

fwyzard commented Feb 28, 2020

I see - yes, I agree that waiting until intel#1194 and intel#1170 are fixed will make those topics less confusing.

I still need to understand the first part better ...

@Ruyk Ruyk force-pushed the updated-cuda-doc branch 2 times, most recently from 6cb61a1 to 99d4665 Compare March 12, 2020 15:08
@Ruyk
Copy link
Contributor Author

Ruyk commented Mar 12, 2020

There are other issues with doxygen upstream, so i'll do that on a separate branch

@Ruyk Ruyk changed the title [WIP][SYCL][CUDA] Updated CUDA documentation [SYCL][CUDA] Updated CUDA documentation Mar 12, 2020
Co-Authored-By: Alexander Johnston <[email protected]>
Signed-off-by: Ruyman Reyes <[email protected]>
@Ruyk Ruyk force-pushed the updated-cuda-doc branch from 99d4665 to a8ad72c Compare March 12, 2020 15:20
@Ruyk Ruyk closed this Mar 12, 2020
codeplay-sycl pushed a commit that referenced this pull request Aug 22, 2020
When `Target::GetEntryPointAddress()` calls `exe_module->GetObjectFile()->GetEntryPointAddress()`, and the returned
`entry_addr` is valid, it can immediately be returned.

However, just before that, an `llvm::Error` value has been setup, but in this case it is not consumed before returning, like is done further below in the function.

In https://bugs.freebsd.org/248745 we got a bug report for this, where a very simple test case aborts and dumps core:

```
* thread #1, name = 'testcase', stop reason = breakpoint 1.1
    frame #0: 0x00000000002018d4 testcase`main(argc=1, argv=0x00007fffffffea18) at testcase.c:3:5
   1	int main(int argc, char *argv[])
   2	{
-> 3	    return 0;
   4	}
(lldb) p argc
Program aborted due to an unhandled Error:
Error value was Success. (Note: Success values must still be checked prior to being destroyed).

Thread 1 received signal SIGABRT, Aborted.
thr_kill () at thr_kill.S:3
3	thr_kill.S: No such file or directory.
(gdb) bt
#0  thr_kill () at thr_kill.S:3
#1  0x00000008049a0004 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2  0x0000000804916229 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
#3  0x000000000451b5f5 in fatalUncheckedError () at /usr/src/contrib/llvm-project/llvm/lib/Support/Error.cpp:112
#4  0x00000000019cf008 in GetEntryPointAddress () at /usr/src/contrib/llvm-project/llvm/include/llvm/Support/Error.h:267
#5  0x0000000001bccbd8 in ConstructorSetup () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:67
#6  0x0000000001bcd2c0 in ThreadPlanCallFunction () at /usr/src/contrib/llvm-project/lldb/source/Target/ThreadPlanCallFunction.cpp:114
#7  0x00000000020076d4 in InferiorCallMmap () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/Utility/InferiorCallPOSIX.cpp:97
#8  0x0000000001f4be33 in DoAllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Plugins/Process/FreeBSD/ProcessFreeBSD.cpp:604
#9  0x0000000001fe51b9 in AllocatePage () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:347
#10 0x0000000001fe5385 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Memory.cpp:383
#11 0x0000000001974da2 in AllocateMemory () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2301
#12 CanJIT () at /usr/src/contrib/llvm-project/lldb/source/Target/Process.cpp:2331
#13 0x0000000001a1bf3d in Evaluate () at /usr/src/contrib/llvm-project/lldb/source/Expression/UserExpression.cpp:190
#14 0x00000000019ce7a2 in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Target/Target.cpp:2372
#15 0x0000000001ad784c in EvaluateExpression () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:414
#16 0x0000000001ad86ae in DoExecute () at /usr/src/contrib/llvm-project/lldb/source/Commands/CommandObjectExpression.cpp:646
#17 0x0000000001a5e3ed in Execute () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandObject.cpp:1003
#18 0x0000000001a6c4a3 in HandleCommand () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:1762
#19 0x0000000001a6f98c in IOHandlerInputComplete () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2760
#20 0x0000000001a90b08 in Run () at /usr/src/contrib/llvm-project/lldb/source/Core/IOHandler.cpp:548
#21 0x00000000019a6c6a in ExecuteIOHandlers () at /usr/src/contrib/llvm-project/lldb/source/Core/Debugger.cpp:903
#22 0x0000000001a70337 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/Interpreter/CommandInterpreter.cpp:2946
#23 0x0000000001d9d812 in RunCommandInterpreter () at /usr/src/contrib/llvm-project/lldb/source/API/SBDebugger.cpp:1169
#24 0x0000000001918be8 in MainLoop () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:675
#25 0x000000000191a114 in main () at /usr/src/contrib/llvm-project/lldb/tools/driver/Driver.cpp:890```

Fix the incorrect error catch by only instantiating an `Error` object if it is necessary.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D86355
codeplay-sycl pushed a commit that referenced this pull request Dec 9, 2020
CXXDeductionGuideDecl with a local typedef has its own copy of the
TypedefDecl with the CXXDeductionGuideDecl as the DeclContext of that
TypedefDecl.
```
      template <typename T> struct A {
        typedef T U;
        A(U, T);
      };
      A a{(int)0, (int)0};
```
Related discussion on cfe-dev:
http://lists.llvm.org/pipermail/cfe-dev/2020-November/067252.html

Without this fix, when we import the CXXDeductionGuideDecl (via
VisitFunctionDecl) then before creating the Decl we must import the
FunctionType. However, the first parameter's type is the afore mentioned
local typedef. So, we then start importing the TypedefDecl whose
DeclContext is the CXXDeductionGuideDecl itself. The infinite loop is
formed.
```
 #0 clang::ASTNodeImporter::VisitCXXDeductionGuideDecl(clang::CXXDeductionGuideDecl*) clang/lib/AST/ASTImporter.cpp:3543:0
 #1 clang::declvisitor::Base<std::add_pointer, clang::ASTNodeImporter, llvm::Expected<clang::Decl*> >::Visit(clang::Decl*) /home/egbomrt/WORK/llvm5/build/debug/tools/clang/include/clang/AST/DeclNodes.inc:405:0
 #2 clang::ASTImporter::ImportImpl(clang::Decl*) clang/lib/AST/ASTImporter.cpp:8038:0
 #3 clang::ASTImporter::Import(clang::Decl*) clang/lib/AST/ASTImporter.cpp:8200:0
 #4 clang::ASTImporter::ImportContext(clang::DeclContext*) clang/lib/AST/ASTImporter.cpp:8297:0
 #5 clang::ASTNodeImporter::ImportDeclContext(clang::Decl*, clang::DeclContext*&, clang::DeclContext*&) clang/lib/AST/ASTImporter.cpp:1852:0
 #6 clang::ASTNodeImporter::ImportDeclParts(clang::NamedDecl*, clang::DeclContext*&, clang::DeclContext*&, clang::DeclarationName&, clang::NamedDecl*&, clang::SourceLocation&) clang/lib/AST/ASTImporter.cpp:1628:0
 #7 clang::ASTNodeImporter::VisitTypedefNameDecl(clang::TypedefNameDecl*, bool) clang/lib/AST/ASTImporter.cpp:2419:0
 #8 clang::ASTNodeImporter::VisitTypedefDecl(clang::TypedefDecl*) clang/lib/AST/ASTImporter.cpp:2500:0
 #9 clang::declvisitor::Base<std::add_pointer, clang::ASTNodeImporter, llvm::Expected<clang::Decl*> >::Visit(clang::Decl*) /home/egbomrt/WORK/llvm5/build/debug/tools/clang/include/clang/AST/DeclNodes.inc:315:0
 #10 clang::ASTImporter::ImportImpl(clang::Decl*) clang/lib/AST/ASTImporter.cpp:8038:0
 #11 clang::ASTImporter::Import(clang::Decl*) clang/lib/AST/ASTImporter.cpp:8200:0
 #12 llvm::Expected<clang::TypedefNameDecl*> clang::ASTNodeImporter::import<clang::TypedefNameDecl>(clang::TypedefNameDecl*) clang/lib/AST/ASTImporter.cpp:165:0
 #13 clang::ASTNodeImporter::VisitTypedefType(clang::TypedefType const*) clang/lib/AST/ASTImporter.cpp:1304:0
 #14 clang::TypeVisitor<clang::ASTNodeImporter, llvm::Expected<clang::QualType> >::Visit(clang::Type const*) /home/egbomrt/WORK/llvm5/build/debug/tools/clang/include/clang/AST/TypeNodes.inc:74:0
 #15 clang::ASTImporter::Import(clang::QualType) clang/lib/AST/ASTImporter.cpp:8071:0
 #16 llvm::Expected<clang::QualType> clang::ASTNodeImporter::import<clang::QualType>(clang::QualType const&) clang/lib/AST/ASTImporter.cpp:179:0
 #17 clang::ASTNodeImporter::VisitFunctionProtoType(clang::FunctionProtoType const*) clang/lib/AST/ASTImporter.cpp:1244:0
 #18 clang::TypeVisitor<clang::ASTNodeImporter, llvm::Expected<clang::QualType> >::Visit(clang::Type const*) /home/egbomrt/WORK/llvm5/build/debug/tools/clang/include/clang/AST/TypeNodes.inc:47:0
 #19 clang::ASTImporter::Import(clang::QualType) clang/lib/AST/ASTImporter.cpp:8071:0
 #20 llvm::Expected<clang::QualType> clang::ASTNodeImporter::import<clang::QualType>(clang::QualType const&) clang/lib/AST/ASTImporter.cpp:179:0
 #21 clang::QualType clang::ASTNodeImporter::importChecked<clang::QualType>(llvm::Error&, clang::QualType const&) clang/lib/AST/ASTImporter.cpp:198:0
 #22 clang::ASTNodeImporter::VisitFunctionDecl(clang::FunctionDecl*) clang/lib/AST/ASTImporter.cpp:3313:0
 #23 clang::ASTNodeImporter::VisitCXXDeductionGuideDecl(clang::CXXDeductionGuideDecl*) clang/lib/AST/ASTImporter.cpp:3543:0
```

The fix is to first create the TypedefDecl and only then start to import
the DeclContext.
Basically, we could do this during the import of all other Decls (not
just for typedefs). But it seems, there is only one another AST
construct that has a similar cycle: a struct defined as a function
parameter:
```
int struct_in_proto(struct data_t{int a;int b;} *d);

```
In that case, however, we had decided to return simply with an error
back then because that seemed to be a very rare construct.

Differential Revision: https://reviews.llvm.org/D92209
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants