Skip to content

Latest commit

 

History

History
130 lines (101 loc) · 4.87 KB

atomic_fence.asciidoc

File metadata and controls

130 lines (101 loc) · 4.87 KB

Test plan for atomic_fence

This is a test plan for sycl::atomic_fence function described in SYCL 2020 sections 4.15.1. Barriers and fences

1. Testing scope

1.1. Device coverage

All of the tests described below are performed only on the default device that is selected on the CTS command line.

2. Tests

2.1. SYCL2020 interface

2.1.1. Testing sycl::atomic_fence function

The synchronization check between work-items with the atomic_fence function should be performed with sycl::handler::parallel_for.

The check should be performed with following memory types:

  • local memory and sycl::local_accessor - to test synchronization between work-items in the same work-group

  • global memory and sycl::accessor - to test synchronization between work-items in different work-groups

Testing atomic_fence across work-items in the same work-group

This test allocates memory in a local_accessor that is shared by all work-items in each work-group. Each leader work-item in the work-group writes a value to this memory and every other work-item in the work-group reads the memory, making sure it has the expected value.

sycl::group g = /* ... */;
sycl::atomic_ref<bool, sycl::memory_order::relaxed, sycl::memory_scope::work_group> sync_flag(/*value from local_accessor*/);
int *data = &/*value from local_accessor*/;
if (g.leader()) {
  *data = EXPECTED;
  sycl::atomic_fence(ORDER1, SCOPE);
  sync_flag = true;
}
else {
  bool write_happened = false;
  for (int i = 0; i < RETRY_COUNT; i++) {
    if (sync_flag == true) {
      write_happened = true;
      break;
    }
  }
  sycl::atomic_fence(ORDER2, SCOPE);
  if (write_happened) {
    if (*data != EXPECTED)
      /* test fails */
  }
}

The test is repeated for each of the following values of SCOPE which is supported by the device (according to info::device::atomic_fence_scope_capabilities):

  • sycl::memory_scope::work_group

  • sycl::memory_scope::device

  • sycl::memory_scope::system

For each supported SCOPE, the test is repeated for each of the following combinations of ORDER1 and ORDER2 which are supported by the device (according to info::device::atomic_fence_order_capabilities):

  • ORDER1 is sycl::memory_order::release and ORDER2 is sycl::memory_order::acquire.

  • ORDER1 and ORDER2 are both sycl::memory_order::acq_rel

  • ORDER1 and ORDER2 are both sycl::memory_order::seq_cst

Testing atomic_fence across work-items in different work-groups

This test allocates memory in a buffer accessor that is shared by all work-items in the kernel. One work-item writes a value to this memory and all other work-items in the kernel read the memory, making sure it has the expected value. Because this test requires sycl::memory_scope::device, it is only run if the device supports that memory scope.

sycl::group g = /* ... */;
sycl::atomic_ref<bool, sycl::memory_order::relaxed, sycl::memory_scope::device> sync_flag(/*value from accessor*/);
int *data = &/*value from accessor*/;
if (nditem.get_global_linear_id() == 0) {
  *data = EXPECTED;
  sycl::atomic_fence(ORDER1, SCOPE);
  sync_flag = true;
}
else {
  bool write_happened = false;
  for (int i = 0; i < RETRY_COUNT; i++) {
    if (sync_flag == true) {
      write_happened = true;
      break;
    }
  }
  sycl::atomic_fence(ORDER2, SCOPE);
  if (write_happened) {
    if (*data != EXPECTED)
      /* test fails */
  }
}

The test is repeated for each of the following values of SCOPE which is supported by the device (according to info::device::atomic_fence_scope_capabilities):

  • sycl::memory_scope::device

  • sycl::memory_scope::system

For each supported SCOPE, the test is repeated for each of the following combinations of ORDER1 and ORDER2 which are supported by the device (according to info::device::atomic_fence_order_capabilities):

  • ORDER1 is sycl::memory_order::release and ORDER2 is sycl::memory_order::acquire.

  • ORDER1 and ORDER2 are both sycl::memory_order::acq_rel

  • ORDER1 and ORDER2 are both sycl::memory_order::seq_cst

Clarification and limitation of check algorithms

The above algorithms check that using a synchronizing variable of atomic_ref type in combination with atomic_fence function provides no data racing and the specified order of instruction execution.

Also, in the algorithm the atomic_fence function check will be performed only if the leader writes a value to the sync_flag variable before all other items complete the loop with the check of sync_flag value.

It does not give a 100% guarantee that the check will be performed, but while there is no guarantee that the loop with the condition while (sync_flag != true); will always complete, a safe and less strict check algorithm has been chosen.