Commit bc24ef2

NB4444, Beanavil, and Snektron authored
Develop stream 2024-09-12 (#462)
* Fixed overflow bug for large sizes in thrust::shuffle
* Added definitions of execution space macros
* Add missing overloads for thrust::pow
* Refactors thrust::unique_by_key to use cub::DeviceSelect::UniqueByKey
* Fix a typo in thrust-config.cmake
* Check that thrust::pair is trivially copyable
* Remove double ignore in discard_iterator.h docs
* Replace deprecated _VSTD macro with std
* Update mode example to use thrust::unique_count
* Ensure that thrust fancy iterators are trivially_copy_constructible when possible
* Use checked allocators in CUB catch2 tests
* Refactors thrust::copy_if to use cub::DeviceSelect
* Refactor thrust::[stable_]partition[_copy] to use cub::DevicePartition
* Fix include of <thrust/random.h> with NVC++
* Cleanup diagnostic handling
* Rework config.h
* Bump version to 2.4.0
* Fix issues with ambiguous calls to addressof in thrust::optional
* Try harder to unwrap nested thrust::tuple_of_iterator_references, CUDA backend
* Added missing element from thrust's tuple implementation
* Ensure that we can run reduce_by_key with const inputs
* Leave definitions of __host__ and __device__
  This prevents CCCL/thrust's build breakage because of v2.4.0 changes
* Patched up CI because of CCCL2.4.0 tests' build failure
* Updated tests and examples for __host__ __device__ use
* Updated CHANGELOG
* Added operator to transform_reduce benchmark
* Added mem allocator in benchmarks
* Changes for review
* ci: set up sccache
* Added helper functions for choosing between different custom reporter
* Added json and csv custom reporter for benchmarks
* Changes for review
* Added hipstdpar tests
* Relocated our ParallelSTL additions
* Fixed several naming issues
* Added missing unimplemented algorithms
* Split hipstdpar_lib.hpp
* Added relevant information to README and CHANGELOG regarding HIPSTDPAR
* Clarified upstream LLVM offload support
* Emit error when HIPSTDPAR macros are not defined
* Move forwarding calls to rocPRIM to thrust's stubs
* Fix path to hipstdpar impl headers
* Prevent building hipstdpar tests when no compatible libstdc++ is present
* Disable TBB tests build

---------

Co-authored-by: Beatriz Navidad Vilches <[email protected]>
Co-authored-by: Robin Voetter <[email protected]>
1 parent 2695a52 commit bc24ef2

710 files changed: +13550 -13115 lines changed

.clang-format

Lines changed: 79 additions & 35 deletions
@@ -1,11 +1,11 @@
  # Style file for MLSE Libraries based on the modified rocBLAS style

  # Common settings
- BasedOnStyle: WebKit
- TabWidth: 4
- IndentWidth: 4
+ BasedOnStyle: LLVM
+ TabWidth: 2
+ IndentWidth: 2
  UseTab: Never
- ColumnLimit: 100
+ ColumnLimit: 120

  # Other languages JavaScript, Proto

@@ -20,14 +20,14 @@ Language: Cpp
  # void formatted_code_again;

  DisableFormat: false
- Standard: Cpp11
-
- AccessModifierOffset: -4
+ Standard: c++14
+ AccessModifierOffset: -2
  AlignAfterOpenBracket: true
  AlignConsecutiveAssignments: true
  AlignConsecutiveDeclarations: true
  AlignEscapedNewlinesLeft: true
  AlignOperands: true
+ AllowAllArgumentsOnNextLine: true
  AlignTrailingComments: false
  AllowAllParametersOfDeclarationOnNextLine: true
  AllowShortBlocksOnASingleLine: false
@@ -39,13 +39,26 @@ AlwaysBreakAfterDefinitionReturnType: false
  AlwaysBreakAfterReturnType: None
  AlwaysBreakBeforeMultilineStrings: false
  AlwaysBreakTemplateDeclarations: true
+ AttributeMacros: [
+ 'THRUST_DEVICE',
+ 'THRUST_FORCEINLINE',
+ 'THRUST_HOST_DEVICE',
+ 'THRUST_HOST',
+ '_CCCL_DEVICE',
+ '_CCCL_FORCEINLINE',
+ '_CCCL_HOST_DEVICE',
+ '_CCCL_HOST',
+ 'THRUST_RUNTIME_FUNCTION',
+ 'THRUST_DETAIL_KERNEL_ATTRIBUTES',
+ ]
  BinPackArguments: false
  BinPackParameters: false

  # Configure each individual brace in BraceWrapping
  BreakBeforeBraces: Custom
  # Control of individual brace wrapping cases
  BraceWrapping: {
+ AfterCaseLabel: 'false'
  AfterClass: 'true'
  AfterControlStatement: 'true'
  AfterEnum : 'true'
@@ -56,52 +69,69 @@ BraceWrapping: {
  BeforeCatch : 'true'
  BeforeElse : 'true'
  IndentBraces : 'false'
- # AfterExternBlock : 'true'
+ SplitEmptyFunction: 'false'
+ SplitEmptyRecord: 'false'
  }

- #BreakAfterJavaFieldAnnotations: true
- #BreakBeforeInheritanceComma: false
- #BreakBeforeBinaryOperators: None
- #BreakBeforeTernaryOperators: true
- #BreakConstructorInitializersBeforeComma: true
- #BreakStringLiterals: true
+ BreakBeforeConceptDeclarations: true
+ BreakBeforeBinaryOperators: NonAssignment
+ BreakBeforeTernaryOperators: true
+ BreakConstructorInitializers: BeforeComma
+ BreakInheritanceList: BeforeComma
+ EmptyLineAfterAccessModifier: Never
+ EmptyLineBeforeAccessModifier: Always
+
+ InsertBraces: true
+ InsertNewlineAtEOF: true
+ InsertTrailingCommas: Wrapped
+ IndentRequires: true
+ IndentPPDirectives: AfterHash
+ PackConstructorInitializers: Never
+ PenaltyBreakAssignment: 30
+ PenaltyBreakTemplateDeclaration: 0
+ PenaltyIndentedWhitespace: 2
+ RemoveSemicolon: false
+ SpaceAfterLogicalNot: false
+ SpaceAfterTemplateKeyword: true
+ SpaceBeforeCtorInitializerColon: true
+ SpaceBeforeInheritanceColon: true
+ SpaceBeforeRangeBasedForLoopColon: true
+

  CommentPragmas: '^ IWYU pragma:'
- #CompactNamespaces: false
+ CompactNamespaces: false
  ConstructorInitializerAllOnOneLineOrOnePerLine: false
  ConstructorInitializerIndentWidth: 4
- ContinuationIndentWidth: 4
+ ContinuationIndentWidth: 2
  Cpp11BracedListStyle: true
- #SpaceBeforeCpp11BracedList: false
- DerivePointerAlignment: false
+ SpaceBeforeCpp11BracedList: false
  ExperimentalAutoDetectBinPacking: false
  ForEachMacros: [ foreach, Q_FOREACH, BOOST_FOREACH ]
- IndentCaseLabels: false
- #FixNamespaceComments: true
+ IndentCaseLabels: true
+ FixNamespaceComments: true
  IndentWrappedFunctionNames: false
- KeepEmptyLinesAtTheStartOfBlocks: true
+ KeepEmptyLinesAtTheStartOfBlocks: false
  MacroBlockBegin: ''
  MacroBlockEnd: ''
  #JavaScriptQuotes: Double
  MaxEmptyLinesToKeep: 1
- NamespaceIndentation: Inner
+ NamespaceIndentation: None
  ObjCBlockIndentWidth: 4
  #ObjCSpaceAfterProperty: true
  #ObjCSpaceBeforeProtocolList: true
- PenaltyBreakBeforeFirstCallParameter: 19
- PenaltyBreakComment: 300
- PenaltyBreakFirstLessLess: 120
- PenaltyBreakString: 1000
-
- PenaltyExcessCharacter: 1000000
- PenaltyReturnTypeOnItsOwnLine: 60
+ PenaltyBreakBeforeFirstCallParameter: 50
+ PenaltyBreakComment: 0
+ PenaltyBreakFirstLessLess: 0
+ PenaltyBreakString: 70
+ PenaltyExcessCharacter: 100
+ PenaltyReturnTypeOnItsOwnLine: 90
  PointerAlignment: Left
- SpaceAfterCStyleCast: false
+ SpaceAfterCStyleCast: true
  SpaceBeforeAssignmentOperators: true
- SpaceBeforeParens: Never
+ SpaceBeforeParens: ControlStatements
  SpaceInEmptyParentheses: false
  SpacesBeforeTrailingComments: 1
- SpacesInAngles: false
+ SpacesInAngles: Never
  SpacesInContainerLiterals: true
  SpacesInCStyleCastParentheses: false
  SpacesInParentheses: false
@@ -110,11 +140,25 @@ SpacesInSquareBrackets: false
  #SpaceBeforeInheritanceColon: true

  #SortUsingDeclarations: true
- SortIncludes: true
+ SortIncludes: CaseInsensitive

- # Comments are for developers, they should arrange them
- ReflowComments: false
+ ReflowComments: true

  #IncludeBlocks: Preserve
  #IndentPPDirectives: AfterHash
+
+ StatementMacros: [
+ 'THRUST_EXEC_CHECK_DISABLE',
+ 'THRUST_NAMESPACE_BEGIN',
+ 'THRUST_NAMESPACE_END',
+ 'THRUST_EXEC_CHECK_DISABLE',
+ 'CUB_NAMESPACE_BEGIN',
+ 'CUB_NAMESPACE_END',
+ 'THRUST_NAMESPACE_BEGIN',
+ 'THRUST_NAMESPACE_END',
+ '_LIBCUDACXX_BEGIN_NAMESPACE_STD',
+ '_LIBCUDACXX_END_NAMESPACE_STD',
+ ]
+ TabWidth: 2
+ UseTab: Never
  ---

.gitlab-ci.yml

Lines changed: 25 additions & 7 deletions
@@ -12,6 +12,7 @@ include:
  - /deps-rocm.yaml
  - /deps-windows.yaml
  - /deps-nvcc.yaml
+ - /deps-compiler-acceleration.yaml
  - /gpus-rocm.yaml
  - /gpus-nvcc.yaml
  - /rules.yaml
@@ -46,17 +47,21 @@ copyright-date:
  extends:
  - .deps:rocm
  - .deps:cmake-latest
+ - .deps:compiler-acceleration
  before_script:
  - !reference [".deps:rocm", before_script]
  - !reference [".deps:cmake-latest", before_script]
+ - !reference [".deps:compiler-acceleration", before_script]

  .cmake-minimum:
  extends:
  - .deps:rocm
  - .deps:cmake-minimum
+ - .deps:compiler-acceleration
  before_script:
  - !reference [".deps:rocm", before_script]
  - !reference [".deps:cmake-minimum", before_script]
+ - !reference [".deps:compiler-acceleration", before_script]

  .install-rocprim:
  script:
@@ -69,8 +74,11 @@ copyright-date:
  -D CMAKE_CXX_COMPILER=hipcc
  -D CMAKE_BUILD_TYPE=Release
  -D BUILD_TEST=OFF
+ -D BUILD_HIPSTDPAR_TEST=OFF
  -D BUILD_EXAMPLE=OFF
  -D ROCM_DEP_ROCMCORE=OFF
+ -D CMAKE_C_COMPILER_LAUNCHER=phc_sccache_c
+ -D CMAKE_CXX_COMPILER_LAUNCHER=phc_sccache_cxx
  -S $ROCPRIM_DIR
  -B $ROCPRIM_DIR/build
  - cd $ROCPRIM_DIR/build
@@ -91,7 +99,7 @@ copyright-date:
  - !reference [.install-rocprim, script]
  - | # Setup env vars for testing
  rng_seed_count=0; prng_seeds="0";
- if [[ $CI_COMMIT_BRANCH == "develop_stream" ]]; then
+ if [[ $CI_COMMIT_BRANCH == "develop_stream" ]]; then
  rng_seed_count=3
  prng_seeds="0, 1000"
  fi
@@ -111,6 +119,9 @@ copyright-date:
  -D AMDGPU_TEST_TARGETS=$GPU_TARGETS
  -D RNG_SEED_COUNT=$rng_seed_count
  -D PRNG_SEEDS=$prng_seeds
+ -D CMAKE_C_COMPILER_LAUNCHER=phc_sccache_c
+ -D CMAKE_CXX_COMPILER_LAUNCHER=phc_sccache_cxx
+ -D CMAKE_CUDA_COMPILER_LAUNCHER=phc_sccache_cuda
  -S $CI_PROJECT_DIR
  -B $CI_PROJECT_DIR/build
  - cmake --build $CI_PROJECT_DIR/build
@@ -198,10 +209,10 @@ build:windows:
  -D CMAKE_INSTALL_PREFIX:PATH="$ROCPRIM_DIR/build/install" *>&1
  - \& cmake --build "$ROCPRIM_DIR/build" --target install *>&1
  # Configure and build rocThrust
- - \& cmake
- -S "$CI_PROJECT_DIR"
- -B "$CI_PROJECT_DIR/build"
- -G Ninja
+ - \& cmake
+ -S "$CI_PROJECT_DIR"
+ -B "$CI_PROJECT_DIR/build"
+ -G Ninja
  -D CMAKE_BUILD_TYPE=Release
  -D GPU_TARGETS=$GPU_TARGET
  -D BUILD_TEST=ON
@@ -327,10 +338,12 @@ test:rocm-windows-install:
  - .deps:nvcc
  - .gpus:nvcc-gpus
  - .deps:cmake-latest
+ - .deps:compiler-acceleration
  - .rules:manual
  before_script:
  - !reference [".deps:nvcc", before_script]
  - !reference [".deps:cmake-latest", before_script]
+ - !reference [".deps:compiler-acceleration", before_script]

  build:cuda-and-omp:
  stage: build
@@ -340,7 +353,7 @@ build:cuda-and-omp:
  tags:
  - build
  variables:
- CCCL_GIT_BRANCH: v2.3.2
+ CCCL_GIT_BRANCH: v2.4.0
  CCCL_DIR: ${CI_PROJECT_DIR}/cccl
  needs: []
  script:
@@ -349,16 +362,21 @@ build:cuda-and-omp:
  - rm -R $CCCL_DIR/thrust/thrust
  - cp -r $CI_PROJECT_DIR/thrust $CCCL_DIR/thrust
  # Build tests and examples from CCCL Thrust
+ # CCCL 2.4.0 breaks compilation of tests. Compile examples only until we
+ # match v2.5.0.
  - cmake
  -G Ninja
  -D CMAKE_BUILD_TYPE=Release
  -D CMAKE_CUDA_ARCHITECTURES="$GPU_TARGETS"
- -D THRUST_ENABLE_TESTING=ON
+ -D THRUST_ENABLE_TESTING=OFF
  -D THRUST_ENABLE_EXAMPLES=ON
  -D THRUST_ENABLE_BENCHMARKS=OFF
  -D THRUST_ENABLE_MULTICONFIG=ON
  -D THRUST_MULTICONFIG_ENABLE_SYSTEM_OMP=ON
  -D THRUST_MULTICONFIG_ENABLE_SYSTEM_CUDA=ON
+ -D CMAKE_C_COMPILER_LAUNCHER=phc_sccache_c
+ -D CMAKE_CXX_COMPILER_LAUNCHER=phc_sccache_cxx
+ -D CMAKE_CUDA_COMPILER_LAUNCHER=phc_sccache_cuda
  -B $CI_PROJECT_DIR/build
  -S $CCCL_DIR/thrust
  - cmake --build $CI_PROJECT_DIR/build

CHANGELOG.md

Lines changed: 4 additions & 1 deletion
@@ -3,16 +3,18 @@
  Documentation for rocThrust available at
  [https://rocm.docs.amd.com/projects/rocThrust/en/latest/](https://rocm.docs.amd.com/projects/rocThrust/en/latest/).

- ## (Unreleased) rocThrust 3.2.0 for ROCm 6.4
+ ## (Unreleased) rocThrust 3.3.0 for ROCm 6.4

  ### Added
  * Added extended tests to `rtest.py`. These tests are extra tests that did not fit the criteria of smoke and regression tests. These tests will take much longer to run relative to smoke and regression tests. Use `python rtest.py [--emulation|-e|--test|-t]=extended` to run these tests.
  * Added regression tests to `rtest.py`. These tests recreate scenarios that have caused hardware problems in past emulation environments. Use `python rtest.py [--emulation|-e|--test|-t]=regression` to run these tests.
  * Added smoke test options, which runs a subset of the unit tests and ensures that less than 2gb of VRAM will be used. Use `python rtest.py [--emulation|-e|--test|-t]=smoke` to run these tests.
  * Added `--emulation` option for `rtest.py`
+ * Merged changes from upstream CCCL/thrust 2.4.0

  ### Changed
  * `--test|-t` is no longer a required flag for `rtest.py`. Instead, the user can use either `--emulation|-e` or `--test|-t`, but not both.
+ * Split the contents of HIPSTDPAR's forwarding header into several implementation headers.

  ## (Unreleased) rocThrust 3.2.0 for ROCm 6.3

@@ -38,6 +40,7 @@ Documentation for rocThrust available at

  * Merged changes from upstream CCCL/thrust 2.2.0
  * Updated the contents of `system/hip` and `test` with the upstream changes to `system/cuda` and `testing`
+ * Added HIPSTDPAR library as part of rocThrust.

  ### Changes
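
The `rtest.py` entries in the CHANGELOG hunks above state that `--emulation|-e` and `--test|-t` are alternatives that cannot be combined. A minimal Python sketch of that rule, assuming `argparse` is used and assuming that exactly one of the two flags must be given. The names `SUITES`, `build_parser`, and `select_suite` are hypothetical and are not taken from the actual `rtest.py`, which may be implemented differently and may accept other suite names:

```python
import argparse

# Suite names mentioned in the CHANGELOG entries; the real script may accept
# more (assumption for illustration only).
SUITES = ("smoke", "regression", "extended")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser in which --test and --emulation exclude each other."""
    parser = argparse.ArgumentParser(prog="rtest.py")
    # required=True encodes the assumption that one of the two flags is needed.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-t", "--test", choices=SUITES)
    group.add_argument("-e", "--emulation", choices=SUITES)
    return parser


def select_suite(argv):
    """Return (mode, suite) for the given argument list."""
    args = build_parser().parse_args(argv)
    if args.emulation is not None:
        return ("emulation", args.emulation)
    return ("test", args.test)
```

With this sketch, `select_suite(["--emulation=smoke"])` selects the smoke suite in emulation mode, while passing both `-t` and `-e` makes `argparse` print a usage error and exit.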

CMakeLists.txt

Lines changed: 3 additions & 2 deletions
@@ -80,6 +80,7 @@ endif()
  # Disable -Werror
  option(DISABLE_WERROR "Disable building with Werror" ON)
  option(BUILD_TEST "Build tests" OFF)
+ option(BUILD_HIPSTDPAR_TEST "Build hipstdpar tests" OFF)
  option(BUILD_EXAMPLES "Build examples" OFF)
  option(BUILD_BENCHMARKS "Build benchmarks" OFF)
  option(DOWNLOAD_ROCPRIM "Download rocPRIM and do not search for rocPRIM package" OFF)
@@ -143,14 +144,14 @@ if(BUILD_TEST OR BUILD_BENCHMARKS)
  endif()

  # Tests
- if(BUILD_TEST)
+ if(BUILD_TEST OR BUILD_HIPSTDPAR_TEST)
  rocm_package_setup_client_component(tests)
  if (ENABLE_UPSTREAM_TESTS)
  enable_testing()
  endif()
  # We still want the testing to be compiled to catch some errors
  #TODO: Get testing folder working with HIP on Windows
- if (NOT WIN32)
+ if (NOT WIN32 AND BUILD_TEST)
  add_subdirectory(testing)
  endif()
  enable_testing()
