Names that begin with an underscore followed by an uppercase letter
are reserved for the C++ implementation.
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Fix a few instances where an incorrect number of arguments was
supplied when calling (v)log_error.
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
* Fix local memory out of bounds in atomic_fence
In the error condition, the atomic_fence kernel can illegally access local memory addresses.
In this snippet, localValues is in the local address space and provided as a kernel argument. Its size is effectively get_local_size(0) * sizeof(int). The stores to localValues lead to OoB accesses.
size_t myId = get_local_id(0);
...
if(hisAtomicValue != hisValue)
{ // fail
atomic_store(&destMemory[myId], myValue-1);
hisId = (hisId+get_local_size(0)-1)%get_local_size(0);
if(myValue+1 < 1)
localValues[myId*1+myValue+1] = hisId;
if(myValue+2 < 1)
localValues[myId*1+myValue+2] = hisAtomicValue;
if(myValue+3 < 1)
localValues[myId*1+myValue+3] = hisValue;
}
* Fix formatting
* Fix formatting again
* Formatting
* Change memory order and scope for atomics that gate final results being stored.
memory_order_acq_rel with memory_scope_device is now used to guarantee that the correct memory consistency is observed before final results are stored.
Previously it was possible for kernels to be generated that all used relaxed memory ordering, which could lead to false-positive failures.
Fixes#1370
* Disable atomics tests with global, in-program atomics.
If the device does not support `memory_order_relaxed` or `memory_scope_device`, disable atomics tests that declare their atomics in-program with global memory.
There is now an implicit requirement to support `memory_order_relaxed` and `memory_scope_device` for these tests.
* Fix misplaced parentheses.
* Change memory scope for atomic fetch and load calls in kernel
Change the memory scope from memory_scope_work_group to
memory_scope_device so the ordering applies across all work items
Co-authored-by: Sreelakshmi Haridas <sharidas@quicinc.com>
* Fix memory model issue in atomic_flag.
In atomic_flag sub-tests that modify local memory, compilers may re-order memory accesses between the local and global address spaces which can lead to incorrect test failures.
This commit ensures that both local and global memory operations are fenced to prevent this re-ordering from occurring.
Fixes#134.
* Clang format changes.
* Added missing global acquire which is necessary for the corresponding global release.
Thanks to @jlewis-austin for spotting.
* Clang format changes.
* Match the condition for applying acquire/release fences.
* Add memory_scope_all_devices testing
This duplicats memory_scope_all_svm_devices testing, but it seems pretty quick
so I don't think it hurts.
Fixes#990
* Address clang-format failures
* Address a further clang-format failure
* The `atomic_flag` test assumes support for the `atomic_scope_device`
scope in the global scope test case. Since `atomic_scope_device` is
optional on an OpenCL-3.0 driver, this test should check for support and
skip otherwise.
The generator function for atomic_fence uses the current workgroup
size to determine the number of non atomic variables per thread
in the kernel. The kernel should hence be regenerated when the
workgroup size changes.
However regenerating the kernel can itself change the workgroup
size. This change introduces an iterative loop that reduces the
workgroup sizes by 2 each time re-generating the kernel until
we find one that works (or exit at groupsize == 1)
Change-Id: Ic32fe967e32de6643e01c6775f4bddbcad0a299a
* Change setup code in `KernelCode()` to use `_explicit` builtin
variants that are common to both OpenCL-2.X and OpenCL-3.0.
* Only test optional supported builtin variants (`_explicit` signature
memory_order/scope) for OpenCL-3.0.
* Disable program scope global variable and generic address space tests
for a OpenCL-3.0 driver which does not optionally support these
features.
* Test minimum memory consistency capabilities for a device reporting >= 3.0.
Skip tests where unsupported memory consistency capabilities are being requested.
* Pass nullptr as program build option.
Allows the CTS framework to select an appropriate CL C version.
* Removed redundant braces.
The spec states that the minimum amount of local memory for embedded devices is 1KB. This change clamps work group sizes to 1024 for embedded devices, and sets the number of local variables per thread to 1.
Fixes#690.
The maintenance of the conformance tests is moving to Github.
This commit contains all the changes that have been done in
Gitlab since the first public release of the conformance tests.
Signed-off-by: Kevin Petit kevin.petit@arm.com