Skip to content
This repository was archived by the owner on Mar 28, 2023. It is now read-only.

Commit b405c81

Browse files
[SYCL] Split AtomicRef 64-bit tests into separate files (#436)
* [SYCL] Split AtomicRef 64-bit tests into separate files Some devices do not support 64-bit atomics, in which case most AtomicRef tests would fail due to them assuming 64-bit atomics being available. The availability of 64-bit atomics can be queried in SYCL 2020 using `aspect::atomic64`. However, the generated device code will fail to compile if there exists any kernels using 64-bit atomics, even if the test intents to skip testing said kernel. To circumvent this, all AtomicRef tests that previously mixed 64-bit atomics with smaller atomic operations are split into two test files, one with non-64-bit atomics and one with only the 64-bit atomics. The latter test is skipped if the selected device does not have `aspect::atomic64`. Likewise, the atomic memory order test is further split into tests for each of the tested atomic memory orderings. This is also to avoid invalid instructions in the generated device code on devices that do not support the corresponding atomic memory order. Additionally, affected AtomicRef tests are also run on accelerators. Change summary: * AtomicRef tests for 64-bit atomics are split into separate test files guarded by a check if the used device has `aspect::atomic64`. * SYCL/AtomicRef/atomic_memory_order.cpp is further split into separate files testing `memory_order::acq_rel` and `memory_order::seq_cst`. * All AtomicRef tests changed in this commit will now run on accelerators. Signed-off-by: Steffen Larsen <[email protected]>
1 parent c9e45c8 commit b405c81

35 files changed

+1369
-775
lines changed

SYCL/AtomicRef/add.cpp

Lines changed: 13 additions & 165 deletions
Original file line numberDiff line numberDiff line change
@@ -3,188 +3,36 @@
33
// RUN: %HOST_RUN_PLACEHOLDER %t.out
44
// RUN: %GPU_RUN_PLACEHOLDER %t.out
55
// RUN: %CPU_RUN_PLACEHOLDER %t.out
6+
// RUN: %ACC_RUN_PLACEHOLDER %t.out
67

7-
#include <CL/sycl.hpp>
8-
#include <algorithm>
9-
#include <cassert>
8+
#include "add.h"
109
#include <iostream>
11-
#include <numeric>
12-
#include <vector>
1310
using namespace sycl;
14-
using namespace sycl::ONEAPI;
15-
16-
template <typename T, typename Difference = T>
17-
void add_fetch_test(queue q, size_t N) {
18-
T sum = 0;
19-
std::vector<T> output(N);
20-
std::fill(output.begin(), output.end(), T(0));
21-
{
22-
buffer<T> sum_buf(&sum, 1);
23-
buffer<T> output_buf(output.data(), output.size());
24-
25-
q.submit([&](handler &cgh) {
26-
auto sum = sum_buf.template get_access<access::mode::read_write>(cgh);
27-
auto out =
28-
output_buf.template get_access<access::mode::discard_write>(cgh);
29-
cgh.parallel_for(range<1>(N), [=](item<1> it) {
30-
int gid = it.get_id(0);
31-
auto atm = atomic_ref<T, memory_order::relaxed, memory_scope::device,
32-
access::address_space::global_space>(sum[0]);
33-
out[gid] = atm.fetch_add(Difference(1));
34-
});
35-
});
36-
}
37-
38-
// All work-items increment by 1, so final value should be equal to N
39-
assert(sum == T(N));
40-
41-
// Fetch returns original value: will be in [0, N-1]
42-
auto min_e = std::min_element(output.begin(), output.end());
43-
auto max_e = std::max_element(output.begin(), output.end());
44-
assert(*min_e == T(0) && *max_e == T(N - 1));
45-
46-
// Intermediate values should be unique
47-
std::sort(output.begin(), output.end());
48-
assert(std::unique(output.begin(), output.end()) == output.end());
49-
}
50-
51-
template <typename T, typename Difference = T>
52-
void add_plus_equal_test(queue q, size_t N) {
53-
T sum = 0;
54-
std::vector<T> output(N);
55-
std::fill(output.begin(), output.end(), T(0));
56-
{
57-
buffer<T> sum_buf(&sum, 1);
58-
buffer<T> output_buf(output.data(), output.size());
59-
60-
q.submit([&](handler &cgh) {
61-
auto sum = sum_buf.template get_access<access::mode::read_write>(cgh);
62-
auto out =
63-
output_buf.template get_access<access::mode::discard_write>(cgh);
64-
cgh.parallel_for(range<1>(N), [=](item<1> it) {
65-
int gid = it.get_id(0);
66-
auto atm = atomic_ref<T, memory_order::relaxed, memory_scope::device,
67-
access::address_space::global_space>(sum[0]);
68-
out[gid] = atm += Difference(1);
69-
});
70-
});
71-
}
72-
73-
// All work-items increment by 1, so final value should be equal to N
74-
assert(sum == T(N));
75-
76-
// += returns updated value: will be in [1, N]
77-
auto min_e = std::min_element(output.begin(), output.end());
78-
auto max_e = std::max_element(output.begin(), output.end());
79-
assert(*min_e == T(1) && *max_e == T(N));
80-
81-
// Intermediate values should be unique
82-
std::sort(output.begin(), output.end());
83-
assert(std::unique(output.begin(), output.end()) == output.end());
84-
}
85-
86-
template <typename T, typename Difference = T>
87-
void add_pre_inc_test(queue q, size_t N) {
88-
T sum = 0;
89-
std::vector<T> output(N);
90-
std::fill(output.begin(), output.end(), T(0));
91-
{
92-
buffer<T> sum_buf(&sum, 1);
93-
buffer<T> output_buf(output.data(), output.size());
94-
95-
q.submit([&](handler &cgh) {
96-
auto sum = sum_buf.template get_access<access::mode::read_write>(cgh);
97-
auto out =
98-
output_buf.template get_access<access::mode::discard_write>(cgh);
99-
cgh.parallel_for(range<1>(N), [=](item<1> it) {
100-
int gid = it.get_id(0);
101-
auto atm = atomic_ref<T, memory_order::relaxed, memory_scope::device,
102-
access::address_space::global_space>(sum[0]);
103-
out[gid] = ++atm;
104-
});
105-
});
106-
}
107-
108-
// All work-items increment by 1, so final value should be equal to N
109-
assert(sum == T(N));
110-
111-
// Pre-increment returns updated value: will be in [1, N]
112-
auto min_e = std::min_element(output.begin(), output.end());
113-
auto max_e = std::max_element(output.begin(), output.end());
114-
assert(*min_e == T(1) && *max_e == T(N));
115-
116-
// Intermediate values should be unique
117-
std::sort(output.begin(), output.end());
118-
assert(std::unique(output.begin(), output.end()) == output.end());
119-
}
120-
121-
template <typename T, typename Difference = T>
122-
void add_post_inc_test(queue q, size_t N) {
123-
T sum = 0;
124-
std::vector<T> output(N);
125-
std::fill(output.begin(), output.end(), T(0));
126-
{
127-
buffer<T> sum_buf(&sum, 1);
128-
buffer<T> output_buf(output.data(), output.size());
129-
130-
q.submit([&](handler &cgh) {
131-
auto sum = sum_buf.template get_access<access::mode::read_write>(cgh);
132-
auto out =
133-
output_buf.template get_access<access::mode::discard_write>(cgh);
134-
cgh.parallel_for(range<1>(N), [=](item<1> it) {
135-
int gid = it.get_id(0);
136-
auto atm = atomic_ref<T, memory_order::relaxed, memory_scope::device,
137-
access::address_space::global_space>(sum[0]);
138-
out[gid] = atm++;
139-
});
140-
});
141-
}
142-
143-
// All work-items increment by 1, so final value should be equal to N
144-
assert(sum == T(N));
145-
146-
// Post-increment returns original value: will be in [0, N-1]
147-
auto min_e = std::min_element(output.begin(), output.end());
148-
auto max_e = std::max_element(output.begin(), output.end());
149-
assert(*min_e == T(0) && *max_e == T(N - 1));
150-
151-
// Intermediate values should be unique
152-
std::sort(output.begin(), output.end());
153-
assert(std::unique(output.begin(), output.end()) == output.end());
154-
}
155-
156-
template <typename T, typename Difference = T>
157-
void add_test(queue q, size_t N) {
158-
add_fetch_test<T, Difference>(q, N);
159-
add_plus_equal_test<T, Difference>(q, N);
160-
add_pre_inc_test<T, Difference>(q, N);
161-
add_post_inc_test<T, Difference>(q, N);
162-
}
16311

16412
// Floating-point types do not support pre- or post-increment
16513
template <> void add_test<float>(queue q, size_t N) {
16614
add_fetch_test<float>(q, N);
16715
add_plus_equal_test<float>(q, N);
16816
}
169-
template <> void add_test<double>(queue q, size_t N) {
170-
add_fetch_test<double>(q, N);
171-
add_plus_equal_test<double>(q, N);
172-
}
17317

17418
int main() {
17519
queue q;
176-
std::string version = q.get_device().get_info<info::device::version>();
17720

17821
constexpr int N = 32;
17922
add_test<int>(q, N);
18023
add_test<unsigned int>(q, N);
181-
add_test<long>(q, N);
182-
add_test<unsigned long>(q, N);
183-
add_test<long long>(q, N);
184-
add_test<unsigned long long>(q, N);
18524
add_test<float>(q, N);
186-
add_test<double>(q, N);
187-
add_test<char *, ptrdiff_t>(q, N);
25+
26+
// Include long tests if they are 32 bits wide
27+
if constexpr (sizeof(long) == 4) {
28+
add_test<long>(q, N);
29+
add_test<unsigned long>(q, N);
30+
}
31+
32+
// Include pointer tests if they are 32 bits wide
33+
if constexpr (sizeof(char *) == 4) {
34+
add_test<char *, ptrdiff_t>(q, N);
35+
}
18836

18937
std::cout << "Test passed." << std::endl;
19038
}

SYCL/AtomicRef/add.h

Lines changed: 158 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,158 @@
1+
#pragma once
2+
3+
#include <CL/sycl.hpp>
4+
#include <algorithm>
5+
#include <cassert>
6+
#include <numeric>
7+
#include <vector>
8+
9+
using namespace sycl;
10+
using namespace sycl::ext::oneapi;
11+
12+
template <typename T, typename Difference = T>
13+
void add_fetch_test(queue q, size_t N) {
14+
T sum = 0;
15+
std::vector<T> output(N);
16+
std::fill(output.begin(), output.end(), T(0));
17+
{
18+
buffer<T> sum_buf(&sum, 1);
19+
buffer<T> output_buf(output.data(), output.size());
20+
21+
q.submit([&](handler &cgh) {
22+
auto sum = sum_buf.template get_access<access::mode::read_write>(cgh);
23+
auto out =
24+
output_buf.template get_access<access::mode::discard_write>(cgh);
25+
cgh.parallel_for(range<1>(N), [=](item<1> it) {
26+
int gid = it.get_id(0);
27+
auto atm = atomic_ref<T, memory_order::relaxed, memory_scope::device,
28+
access::address_space::global_space>(sum[0]);
29+
out[gid] = atm.fetch_add(Difference(1));
30+
});
31+
});
32+
}
33+
34+
// All work-items increment by 1, so final value should be equal to N
35+
assert(sum == T(N));
36+
37+
// Fetch returns original value: will be in [0, N-1]
38+
auto min_e = std::min_element(output.begin(), output.end());
39+
auto max_e = std::max_element(output.begin(), output.end());
40+
assert(*min_e == T(0) && *max_e == T(N - 1));
41+
42+
// Intermediate values should be unique
43+
std::sort(output.begin(), output.end());
44+
assert(std::unique(output.begin(), output.end()) == output.end());
45+
}
46+
47+
template <typename T, typename Difference = T>
48+
void add_plus_equal_test(queue q, size_t N) {
49+
T sum = 0;
50+
std::vector<T> output(N);
51+
std::fill(output.begin(), output.end(), T(0));
52+
{
53+
buffer<T> sum_buf(&sum, 1);
54+
buffer<T> output_buf(output.data(), output.size());
55+
56+
q.submit([&](handler &cgh) {
57+
auto sum = sum_buf.template get_access<access::mode::read_write>(cgh);
58+
auto out =
59+
output_buf.template get_access<access::mode::discard_write>(cgh);
60+
cgh.parallel_for(range<1>(N), [=](item<1> it) {
61+
int gid = it.get_id(0);
62+
auto atm = atomic_ref<T, memory_order::relaxed, memory_scope::device,
63+
access::address_space::global_space>(sum[0]);
64+
out[gid] = atm += Difference(1);
65+
});
66+
});
67+
}
68+
69+
// All work-items increment by 1, so final value should be equal to N
70+
assert(sum == T(N));
71+
72+
// += returns updated value: will be in [1, N]
73+
auto min_e = std::min_element(output.begin(), output.end());
74+
auto max_e = std::max_element(output.begin(), output.end());
75+
assert(*min_e == T(1) && *max_e == T(N));
76+
77+
// Intermediate values should be unique
78+
std::sort(output.begin(), output.end());
79+
assert(std::unique(output.begin(), output.end()) == output.end());
80+
}
81+
82+
template <typename T, typename Difference = T>
83+
void add_pre_inc_test(queue q, size_t N) {
84+
T sum = 0;
85+
std::vector<T> output(N);
86+
std::fill(output.begin(), output.end(), T(0));
87+
{
88+
buffer<T> sum_buf(&sum, 1);
89+
buffer<T> output_buf(output.data(), output.size());
90+
91+
q.submit([&](handler &cgh) {
92+
auto sum = sum_buf.template get_access<access::mode::read_write>(cgh);
93+
auto out =
94+
output_buf.template get_access<access::mode::discard_write>(cgh);
95+
cgh.parallel_for(range<1>(N), [=](item<1> it) {
96+
int gid = it.get_id(0);
97+
auto atm = atomic_ref<T, memory_order::relaxed, memory_scope::device,
98+
access::address_space::global_space>(sum[0]);
99+
out[gid] = ++atm;
100+
});
101+
});
102+
}
103+
104+
// All work-items increment by 1, so final value should be equal to N
105+
assert(sum == T(N));
106+
107+
// Pre-increment returns updated value: will be in [1, N]
108+
auto min_e = std::min_element(output.begin(), output.end());
109+
auto max_e = std::max_element(output.begin(), output.end());
110+
assert(*min_e == T(1) && *max_e == T(N));
111+
112+
// Intermediate values should be unique
113+
std::sort(output.begin(), output.end());
114+
assert(std::unique(output.begin(), output.end()) == output.end());
115+
}
116+
117+
template <typename T, typename Difference = T>
118+
void add_post_inc_test(queue q, size_t N) {
119+
T sum = 0;
120+
std::vector<T> output(N);
121+
std::fill(output.begin(), output.end(), T(0));
122+
{
123+
buffer<T> sum_buf(&sum, 1);
124+
buffer<T> output_buf(output.data(), output.size());
125+
126+
q.submit([&](handler &cgh) {
127+
auto sum = sum_buf.template get_access<access::mode::read_write>(cgh);
128+
auto out =
129+
output_buf.template get_access<access::mode::discard_write>(cgh);
130+
cgh.parallel_for(range<1>(N), [=](item<1> it) {
131+
int gid = it.get_id(0);
132+
auto atm = atomic_ref<T, memory_order::relaxed, memory_scope::device,
133+
access::address_space::global_space>(sum[0]);
134+
out[gid] = atm++;
135+
});
136+
});
137+
}
138+
139+
// All work-items increment by 1, so final value should be equal to N
140+
assert(sum == T(N));
141+
142+
// Post-increment returns original value: will be in [0, N-1]
143+
auto min_e = std::min_element(output.begin(), output.end());
144+
auto max_e = std::max_element(output.begin(), output.end());
145+
assert(*min_e == T(0) && *max_e == T(N - 1));
146+
147+
// Intermediate values should be unique
148+
std::sort(output.begin(), output.end());
149+
assert(std::unique(output.begin(), output.end()) == output.end());
150+
}
151+
152+
template <typename T, typename Difference = T>
153+
void add_test(queue q, size_t N) {
154+
add_fetch_test<T, Difference>(q, N);
155+
add_plus_equal_test<T, Difference>(q, N);
156+
add_pre_inc_test<T, Difference>(q, N);
157+
add_post_inc_test<T, Difference>(q, N);
158+
}

0 commit comments

Comments
 (0)