This is a test plan for sycl::atomic_fence
function described in SYCL 2020 sections 4.15.1. Barriers and fences
The synchronization check between work-items
with the atomic_fence
function should be performed with sycl::handler::parallel_for
.
The check should be performed with following memory types:
-
local memory
andsycl::local_accessor
- to test synchronization between work-items in the same work-group -
global memory
andsycl::accessor
- to test synchronization between work-items in different work-groups
This test allocates memory in a local_accessor
that is shared by all work-items
in each work-group
.
Each leader work-item
in the work-group
writes a value to this memory and every other work-item
in
the work-group
reads the memory, making sure it has the expected value.
sycl::group g = /* ... */;
sycl::atomic_ref<bool, sycl::memory_order::relaxed, sycl::memory_scope::work_group> sync_flag(/*value from local_accessor*/);
int *data = &/*value from local_accessor*/;
if (g.leader()) {
*data = EXPECTED;
sycl::atomic_fence(ORDER1, SCOPE);
sync_flag = true;
}
else {
bool write_happened = false;
for (int i = 0; i < RETRY_COUNT; i++) {
if (sync_flag == true) {
write_happened = true;
break;
}
}
sycl::atomic_fence(ORDER2, SCOPE);
if (write_happened) {
if (*data != EXPECTED)
/* test fails */
}
}
The test is repeated for each of the following values of SCOPE
which is supported by the device
(according to info::device::atomic_fence_scope_capabilities
):
-
sycl::memory_scope::work_group
-
sycl::memory_scope::device
-
sycl::memory_scope::system
For each supported SCOPE
, the test is repeated for each of the following combinations of ORDER1
and ORDER2
which are supported by the device (according to info::device::atomic_fence_order_capabilities
):
-
ORDER1
issycl::memory_order::release
andORDER2
issycl::memory_order::acquire
. -
ORDER1
andORDER2
are bothsycl::memory_order::acq_rel
-
ORDER1
andORDER2
are bothsycl::memory_order::seq_cst
This test allocates memory in a buffer accessor
that is shared by all work-items
in the kernel
.
One work-item
writes a value to this memory and all other work-items
in the kernel
read the memory,
making sure it has the expected value. Because this test requires sycl::memory_scope::device
,
it is only run if the device supports that memory scope.
sycl::group g = /* ... */;
sycl::atomic_ref<bool, sycl::memory_order::relaxed, sycl::memory_scope::device> sync_flag(/*value from accessor*/);
int *data = &/*value from accessor*/;
if (nditem.get_global_linear_id() == 0) {
*data = EXPECTED;
sycl::atomic_fence(ORDER1, SCOPE);
sync_flag = true;
}
else {
bool write_happened = false;
for (int i = 0; i < RETRY_COUNT; i++) {
if (sync_flag == true) {
write_happened = true;
break;
}
}
sycl::atomic_fence(ORDER2, SCOPE);
if (write_happened) {
if (*data != EXPECTED)
/* test fails */
}
}
The test is repeated for each of the following values of SCOPE
which is supported by the device
(according to info::device::atomic_fence_scope_capabilities
):
-
sycl::memory_scope::device
-
sycl::memory_scope::system
For each supported SCOPE
, the test is repeated for each of the following combinations of ORDER1
and ORDER2
which are supported by the device (according to info::device::atomic_fence_order_capabilities
):
-
ORDER1
issycl::memory_order::release
andORDER2
issycl::memory_order::acquire
. -
ORDER1
andORDER2
are bothsycl::memory_order::acq_rel
-
ORDER1
andORDER2
are bothsycl::memory_order::seq_cst
The above algorithms check that using a synchronizing variable of atomic_ref
type in combination with
atomic_fence
function provides no data racing and the specified order of instruction execution.
Also, in the algorithm the atomic_fence
function check will be performed only if the leader writes
a value to the sync_flag
variable before all other items complete the loop with the check of sync_flag
value.
It does not give a 100% guarantee that the check will be performed, but while there is no guarantee
that the loop with the condition while (sync_flag != true);
will always complete, a safe and less
strict check algorithm has been chosen.