[libc] Update GPU testing documentation #85459

Conversation
@llvm/pr-subscribers-libc

Author: Joseph Huber (jhuber6)

Changes

Summary:
This documentation was lagging reality and didn't contain much. Update it with some more information now that it's more mature.

Full diff: https://github.com/llvm/llvm-project/pull/85459.diff

3 Files Affected:
diff --git a/libc/docs/gpu/building.rst b/libc/docs/gpu/building.rst
index dab21e1324d281..6d94134a407d34 100644
--- a/libc/docs/gpu/building.rst
+++ b/libc/docs/gpu/building.rst
@@ -220,11 +220,15 @@ targets. This section will briefly describe their purpose.
be used to enable host services for anyone looking to interface with the
:ref:`RPC client<libc_gpu_rpc>`.
+.. _gpu_cmake_options:
+
CMake options
=============
This section briefly lists a few of the CMake variables that specifically
-control the GPU build of the C library.
+control the GPU build of the C library. These options can be passed individually
+to each target using ``-DRUNTIMES_<target>_<variable>=<value>`` when using a
+standard runtime build.
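+
+For example, assuming an ``amdgcn-amd-amdhsa`` runtimes target, the full build
+mode could be enabled for that target alone with something like the following
+(the other CMake arguments are elided here):
+
+.. code-block:: sh
+
+   $> cmake ... -DRUNTIMES_amdgcn-amd-amdhsa_LLVM_LIBC_FULL_BUILD=ON
+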
**LLVM_LIBC_FULL_BUILD**:BOOL
This flag controls whether or not the libc build will generate its own
diff --git a/libc/docs/gpu/testing.rst b/libc/docs/gpu/testing.rst
index 9842a675283619..4f26a4acb5bd8e 100644
--- a/libc/docs/gpu/testing.rst
+++ b/libc/docs/gpu/testing.rst
@@ -1,9 +1,9 @@
.. _libc_gpu_testing:
-============================
-Testing the GPU libc library
-============================
+=========================
+Testing the GPU C library
+=========================
.. note::
Running GPU tests with high parallelism is likely to cause spurious failures,
@@ -14,24 +14,132 @@ Testing the GPU libc library
:depth: 4
:local:
-Testing Infrastructure
+Testing infrastructure
======================
-The testing support in LLVM's libc implementation for GPUs is designed to mimic
-the standard unit tests as much as possible. We use the :ref:`libc_gpu_rpc`
-support to provide the necessary utilities like printing from the GPU. Execution
-is performed by emitting a ``_start`` kernel from the GPU
-that is then called by an external loader utility. This is an example of how
-this can be done manually:
+The LLVM C library supports different kinds of :ref:`tests <build_and_test>`
+depending on the build configuration. The GPU target is considered a full build
+and therefore provides all of its own utilities to build and run the generated
+tests. Currently the GPU supports two kinds of tests.
+
+#. **Hermetic tests** - These are unit tests built with a test suite similar to
+ Google's ``gtest`` infrastructure. These use the same infrastructure as unit
+ tests except that the entire environment is self-hosted. This allows us to
+ run them on the GPU using our custom utilities. These are used to test the
+ majority of functional implementations.
+
+#. **Integration tests** - These are lightweight tests that simply call a
+   ``main`` function and check whether it returns non-zero. These are primarily
+   used to test interfaces that are sensitive to threading.
+
+The GPU uses the same testing infrastructure as the other supported ``libc``
+targets. We do this by treating the GPU as a standard hosted environment capable
+of launching a ``main`` function. Effectively, this means building our own
+startup libraries and loader.
+
+Testing utilities
+=================
+
+We provide two utilities to execute arbitrary programs on the GPU: the
+``loader`` and the ``start`` object.
+
+Startup object
+--------------
+
+This object mimics the standard object used by existing C library
+implementations. Its job is to perform the necessary setup prior to calling the
+``main`` function. In the GPU case, this means exporting GPU kernels that will
+perform the necessary operations. Here we use ``_begin`` and ``_end`` to handle
+calling global constructors and destructors while ``_start`` begins the standard
+execution. The following code block shows the implementation for AMDGPU
+architectures.
+
+.. code-block:: c++
+
+ extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
+ _begin(int argc, char **argv, char **env) {
+ LIBC_NAMESPACE::atexit(&LIBC_NAMESPACE::call_fini_array_callbacks);
+ LIBC_NAMESPACE::call_init_array_callbacks(argc, argv, env);
+ }
+
+ extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
+ _start(int argc, char **argv, char **envp, int *ret) {
+ __atomic_fetch_or(ret, main(argc, argv, envp), __ATOMIC_RELAXED);
+ }
+
+ extern "C" [[gnu::visibility("protected"), clang::amdgpu_kernel]] void
+ _end(int retval) {
+ LIBC_NAMESPACE::exit(retval);
+ }
+
+Loader runtime
+--------------
+
+The startup object provides a GPU executable with callable kernels for the
+respective runtime. We can then define a minimal runtime that will launch these
+kernels on the given device. Currently we provide the ``amdhsa-loader`` and
+``nvptx-loader`` targeting the AMD HSA runtime and CUDA driver runtime
+respectively. By default these will launch with a single thread on the GPU.
.. code-block:: sh
- $> clang++ crt1.o test.cpp --target=amdgcn-amd-amdhsa -mcpu=gfx90a -flto
- $> ./amdhsa_loader --threads 1 --blocks 1 a.out
+ $> clang++ crt1.o test.cpp --target=amdgcn-amd-amdhsa -mcpu=native -flto
+ $> amdhsa_loader --threads 1 --blocks 1 ./a.out
Test Passed!
-Unlike the exported ``libcgpu.a``, the testing architecture can only support a
-single architecture at a time. This is either detected automatically, or set
-manually by the user using ``LIBC_GPU_TEST_ARCHITECTURE``. The latter is useful
-in cases where the user does not build LLVM's libc on machine with the GPU to
-use for testing.
+The loader utility will forward any arguments passed after the executable image
+to the program running on the GPU, along with any environment variables that
+are set. The number of threads and blocks can be controlled with ``--threads``
+and ``--blocks``. These also accept additional ``x``, ``y``, ``z`` variants for
+multidimensional grids.
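+
+For example, a hypothetical invocation that launches a small grid and forwards
+two extra arguments to the program might look like this (the thread and block
+counts here are arbitrary):
+
+.. code-block:: sh
+
+   $> amdhsa_loader --threads 32 --blocks 4 ./a.out first second
+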
+
+Running tests
+=============
+
+Tests will only be built and run if a GPU target architecture is set and the
+corresponding loader utility was built. These can be overridden with the
+``LIBC_GPU_TEST_ARCHITECTURE`` and ``LIBC_GPU_LOADER_EXECUTABLE`` :ref:`CMake
+options <gpu_cmake_options>`. Once built, they can be run like any other tests.
+The CMake target depends on how the library was built.
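+
+For example, the test architecture and loader could be overridden explicitly
+for an ``amdgcn-amd-amdhsa`` runtimes build; the architecture and path below
+are only illustrative:
+
+.. code-block:: sh
+
+   $> cmake ... -DRUNTIMES_amdgcn-amd-amdhsa_LIBC_GPU_TEST_ARCHITECTURE=gfx90a \
+        -DRUNTIMES_amdgcn-amd-amdhsa_LIBC_GPU_LOADER_EXECUTABLE=/path/to/amdhsa_loader
+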
+
+#. **Cross build** - If the C library was built using ``LLVM_ENABLE_PROJECTS``
+ or a runtimes cross build, then the standard targets will be present in the
+ base CMake build directory.
+
+ #. All tests - You can run all supported tests with the command:
+
+ .. code-block:: sh
+
+ $> ninja check-libc
+
+   #. Hermetic tests - You can run hermetic tests with the command:
+
+ .. code-block:: sh
+
+ $> ninja libc-hermetic-tests
+
+   #. Integration tests - You can run integration tests with the command:
+
+ .. code-block:: sh
+
+ $> ninja libc-integration-tests
+
+#. **Runtimes build** - If the library was built using ``LLVM_ENABLE_RUNTIMES``
+ then the actual ``libc`` build will be in a separate directory.
+
+ #. All tests - You can run all supported tests with the command:
+
+ .. code-block:: sh
+
+ $> ninja check-libc-amdgcn-amd-amdhsa
+ $> ninja check-libc-nvptx64-nvidia-cuda
+
+ #. Specific tests - You can use the same targets as above by entering the
+ runtimes build directory.
+
+ .. code-block:: sh
+
+ $> ninja -C runtimes/runtimes-amdgcn-amd-amdhsa-bins check-libc
+ $> ninja -C runtimes/runtimes-nvptx64-nvidia-cuda-bins check-libc
+
+Tests can also be built and run manually using the respective loader utility.
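+
+For example, roughly following the earlier loader invocation, an NVPTX test
+could hypothetically be compiled and launched directly, assuming the loader
+binary is named ``nvptx-loader``:
+
+.. code-block:: sh
+
+   $> clang++ crt1.o test.cpp --target=nvptx64-nvidia-cuda -march=native -flto
+   $> nvptx-loader --threads 1 --blocks 1 ./a.out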
diff --git a/libc/docs/gpu/using.rst b/libc/docs/gpu/using.rst
index 11a00cd620d866..1a9446eeb1130a 100644
--- a/libc/docs/gpu/using.rst
+++ b/libc/docs/gpu/using.rst
@@ -159,17 +159,21 @@ GPUs.
}
We can then compile this for both NVPTX and AMDGPU into LLVM-IR using the
-following commands.
+following commands. This will yield valid LLVM-IR for the given target just as
+if we were using CUDA, OpenCL, or OpenMP.
.. code-block:: sh
$> clang id.c --target=amdgcn-amd-amdhsa -mcpu=native -nogpulib -flto -c
$> clang id.c --target=nvptx64-nvidia-cuda -march=native -nogpulib -flto -c
-We use this support to treat the GPU as a hosted environment by providing a C
-library and startup object just like a standard C library running on the host
-machine. Then, in order to execute these programs, we provide a loader utility
-to launch the executable on the GPU similar to a cross-compiling emulator.
+We can also use this support to treat the GPU as a hosted environment by
+providing a C library and startup object just like a standard C library running
+on the host machine. Then, in order to execute these programs, we provide a
+loader utility to launch the executable on the GPU similar to a cross-compiling
+emulator. This is how we run :ref:`unit tests <libc_gpu_testing>` targeting the
+GPU. This is clearly not the most efficient way to use a GPU, but it provides a
+simple method to test execution on a GPU for debugging or development.
Building for AMDGPU targets
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Review comment on:

.. code-block:: sh

   $> ninja -C runtimes/runtimes-amdgcn-amd-amdhsa-bins check-libc
Shall we also mention another option, which is to change into the runtimes/runtimes-... folder and do the normal ninja libc.test.src....?
I figured it was implied by "You can use the same targets as above by entering the runtimes build directory." But I could be more explicit and show both cases.
+1 on being more explicit. It would be harder for readers to miss it.
Summary:
This documentation was lagging reality and didn't contain much. Update
it with some more information now that it's more mature.