The maximum value for the workgroup size in a specific dimension can be
lower than the overall maximum workgroup size. This patch queries for
the maximum work item size in the first dimension and limits the
group_size by that value as well.
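
As a rough illustration of the clamping described above, the sketch below takes the two device limits as plain integers. The helper name `clamp_group_size` and its parameters are hypothetical; in a real test the limits would come from `clGetDeviceInfo` with `CL_DEVICE_MAX_WORK_GROUP_SIZE` and the first entry of `CL_DEVICE_MAX_WORK_ITEM_SIZES`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not taken from the patch: limit a requested group
// size by the overall work-group limit and by the per-dimension limit for
// dimension 0, whichever is smaller.
inline std::size_t clamp_group_size(std::size_t requested,
                                    std::size_t max_work_group_size,
                                    std::size_t max_work_item_size_dim0)
{
    return std::min({ requested, max_work_group_size,
                      max_work_item_size_dim0 });
}
```

A device may report an overall limit of 512 but only 256 work items in the first dimension, in which case the per-dimension value is the one that applies.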
Signed-off-by: Ole Strohm <ole.strohm@arm.com>
Fixes a warning in the mutable dispatch test with some compilers:
```
3>C:\git\OpenCL-CTS\test_conformance\extensions\cl_khr_command_buffer\cl_khr_command_buffer_mutable_dispatch\mutable_command_basic.h(82,16): warning C4805: '==': unsafe mix of type 'int' and type 'bool' in operation
```
Also fixes a misspelled variable name while we're at it.
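
For context, C4805 is raised when an `int` and a `bool` are compared directly. The sketch below shows one way to avoid the mix; the function and parameter names are hypothetical and do not reproduce the code in `mutable_command_basic.h`.

```cpp
#include <cassert>

// Hypothetical example: `status` is an int-typed value and `expected` is a
// bool. Writing `status == expected` mixes the two types, which is what
// C4805 warns about. Converting explicitly keeps both operands as bool.
inline bool matches_expected(int status, bool expected)
{
    const bool is_set = (status != 0); // explicit int -> bool conversion
    return is_set == expected;
}
```

The explicit conversion also changes behaviour for values other than 0 and 1: a raw `2 == true` compares 2 against 1 and is false, whereas the converted form treats any non-zero value as true.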
Add a wrapper around AHB for proper resource deallocation and refactor
existing tests to use the wrapper.
Add a negative test for AHB to test for error codes when calling
clCreateImageWithProperties and clCreateBufferWithProperties.
---------
Signed-off-by: Alex Davicenko <alex.davicenko@arm.com>
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Co-authored-by: Alex Davicenko <alex.davicenko@arm.com>
**For mutable_dispatch_image_1d_arguments &
mutable_dispatch_image_2d_arguments:**
The images are created using CL_UNSIGNED_INT8, but the kernel uses
instructions that are designed for signed variables. This fix modifies
the kernel code to use unsigned instructions and auxiliary variables.
Ensure clEnqueueReleaseExternalMemObjectsKHR targets imported_image
instead of the non-external opencl_image, matching the prior acquire
call.
Signed-off-by: Xin Jin <xin.jin@arm.com>
Actions test plan from
https://github.com/KhronosGroup/OpenCL-CTS/issues/2473 to update CTS
tests to reflect changes from cl_khr_command_buffer PR
https://github.com/KhronosGroup/OpenCL-Docs/pull/1411
* Adds a new test in `command_buffer_pipelined_enqueue.cpp` for multiple
enqueues without blocking in-between, but with serialized execution.
* Removed test for `CL_COMMAND_BUFFER_STATE_PENDING_KHR` state query.
* Remove negative test for `clEnqueueCommandBuffer` pending state error.
* Simplify `cl_khr_command_buffer` tests that stress simultaneous-use by
testing multiple serialized enqueues of the same command-buffer, which
no longer requires the device simultaneous-use capability.
* Turn off simultaneous-use command-buffer creation in the base class,
and require tests to enable it themselves if they need it.
* Rewrite mutable dispatch simultaneous test to test updating both
pipelined enqueues, and updating the new definition of simultaneous-use
---------
Co-authored-by: Ewan Crawford <ewan@codeplay.com>
The test had critical buffer overflow issues:
1. Buffer size was calculated incorrectly: used update_elements (4)
instead of total work items. For 3D kernels, this meant allocating
16 bytes when 64*4=256 bytes were needed for the updated 4x4x4 grid.
2. Original 2x2x2 grid writes 8 elements (32 bytes) but buffer was only
16 bytes, causing overflow on first execution.
3. Updated 4x4x4 grid writes 64 elements (256 bytes) with massive
overflow into adjacent memory.
4. Verify function only checked one dimension instead of total elements
in the 3D grid.
Fixed by:
- Calculating total work items as product of all dimensions
- Using update_total_elements (64) for buffer allocation
- Updating Verify calls to check correct number of elements
- Adding constants for original_total_elements and update_total_elements
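
The sizing arithmetic above can be sketched as follows. The function names are hypothetical and the 4-byte element size is taken from the `64*4=256` figure in the description; this is an illustration of the calculation, not the test's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Total number of work items in a 3D grid: the product of all dimensions.
constexpr std::size_t total_work_items(std::size_t x, std::size_t y,
                                       std::size_t z)
{
    return x * y * z;
}

// Bytes needed so that both the original and the updated grid fit in the
// same buffer. `element_size` is assumed to be 4 bytes in the example.
constexpr std::size_t required_buffer_bytes(std::size_t original_total,
                                            std::size_t update_total,
                                            std::size_t element_size)
{
    return std::max(original_total, update_total) * element_size;
}
```

With the grids from the description, the original 2x2x2 grid has 8 work items (32 bytes) and the updated 4x4x4 grid has 64 (256 bytes), so a 16-byte buffer is too small for either.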
Adds tests to cover points 2 & 3 from the questions asked about
cl_khr_command_buffer_mutable_dispatch in
https://github.com/KhronosGroup/OpenCL-Docs/issues/1437
* New test for point 2 from issue, `mutable_dispatch_updates_persist`,
testing multiple enqueues of a command-buffer after update, and that the
updated argument persists for all of them.
* New test for point 3 pseudocode from issue in test
`mutable_dispatch_set_kernel_arg`
This is a very small subset of the changes in #2477 to get things
building again, since the command-buffer pending state is no longer in
the spec or headers.
The check implemented by that Skip function is already implemented in
'InfoMutableCommandBufferTest::Skip()'.
Also, it tries to get the extension_version before checking whether
the extension is supported, leading to false negatives on devices that
do not support the extension.
Update the test to use
`CL_EXTERNAL_MEMORY_HANDLE_ANDROID_HARDWARE_BUFFER_KHR` instead of
`CL_EXTERNAL_MEMORY_HANDLE_AHB_KHR` to match the headers.
Handle missing format in switch statement.
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
New test following on from OpenCL-Docs Issues discussion
https://github.com/KhronosGroup/OpenCL-Docs/issues/1390#issuecomment-3023818903
Noting that we have no test coverage for using the original value of
`work_dim` during command-buffer update. All of our current CTS testing
uses `0` for the `work_dim` to signify no update from the original
value, however this test explicitly uses the original value.
Prior to this change, both `clEnqueueReadBuffer` calls before and after
updating the command buffer were writing to the same `output_buffer`,
causing a data race condition and the first call's result to be
overwritten. This commit introduces separate destination vectors
(`output_buffer` and `updated_output_buffer`) for these operations and
verifies both results independently to ensure test integrity.
Memory objects created in `EnqueueSimultaneousPass()` are used by
kernels that don't execute until the user event is signaled. Without
retaining these objects, they would be destroyed before the deferred
kernel execution occurs.
While the cl_khr_semaphore extension spec does state that there are no
implicit dependencies between already enqueued commands and
clEnqueueSignalSemaphoresKHR, it's nothing special as this is already
true for any other event that's not a barrier or marker.
Also, the CTS can't reliably assume implementations to reorder events
even in an out of order queue as this is highly implementation defined
behavior and implementations may or may not choose to reorder events in
a specific order.
I don't see a reason why this should be tested for semaphores, but not
for any other commands, especially as it imposes a restriction on how to
implement out of order queues that wasn't enforced before.
Closes: https://github.com/KhronosGroup/OpenCL-CTS/issues/2439
Add a cl_khr_command_buffer test that it is valid to release a
command-buffer after it has been enqueued but before execution is
finished.
This stresses the semantics from
[clReleaseCommandBufferKHR](https://registry.khronos.org/OpenCL/sdk/3.0/docs/man/html/clReleaseCommandBufferKHR.html#_description)
that: "After the command_buffer reference count becomes zero **and has
finished execution**, the command-buffer is deleted"
In CMake 3.24+, there is built-in support for adding -Werror that does
not require adding -Werror explicitly, and allows it to be downgraded to
a warning if the user wants that. Use this, to account for warnings that
have false positives.
This change provides partial test coverage for
KhronosGroup/OpenCL-Docs#1280
Adding CTS tests for:
1. clEnqueueMapBuffer, clEnqueueMapImage.
2. Command buffer negative tests.
3. clSetKernelArg negative tests.
The bulk of the tests is to make sure that the CL driver does not allow
writing to a memory object that is created with `CL_MEM_IMMUTABLE_EXT`
flag when used with the above APIs.
---------
Signed-off-by: Michael Rizkalla <michael.rizkalla@arm.com>
Remove the `CREATE_OPENCL_SEMAPHORE` macro and use derived class
instantiations of the `clExternalSemaphore` class, rather than base
pointers to derived class objects.
Remove the default argument for `queryParamName` in
`check_external_semaphore_handle_type()`.
Move `check_external_semaphore_handle_type()` checks to constructors of
`clExternalImportableSemaphore` and `clExternalExportableSemaphore`,
rather than manually making the check before creating an external
semaphore.
---------
Signed-off-by: Gorazd Sumkovski <gorazd.sumkovski@arm.com>
Co-authored-by: Kévin Petit <kpet@free.fr>
Co-authored-by: Kevin Petit <kevin.petit@arm.com>
Both `REGISTER_TEST` and `REQUIRE_EXTENSION` expect cl_device_id
variable but the variable name is inconsistent which makes both macros
unusable together.
This change renames `deviceID` in `REQUIRE_EXTENSION` to `device` to be
consistent with `REGISTER_TEST`.
Signed-off-by: Michael Rizkalla <michael.rizkalla@arm.com>
`semaphores_ooo_ops_cross_queue` uses two OOO command queues to run the
test. A semaphore is signalled in one queue and waited on in the other
queue.
The CL specification requires the application to synchronize the queues
if objects are shared.
Signed-off-by: Michael Rizkalla <michael.rizkalla@arm.com>
This adds support for allocating DMA buffers on systems that support it,
i.e. Linux and Android.
On mainline Linux, starting with version 5.6 (equivalent to Android 12),
there is a new kernel module framework available called [DMA-BUF
Heaps](https://github.com/torvalds/linux/blob/master/drivers/dma-buf/dma-heap.c).
The goal of this framework is to provide a standardised way for user
applications to allocate and share memory buffers between different
devices, subsystems, etc. The main feature of interest is that the
framework provides device-agnostic allocation; it abstracts away the
underlying hardware, and provides a single IOCTL,
`DMA_HEAP_IOCTL_ALLOC`. The mainline implementation provides two heaps
that act as character devices that can allocate DMA buffers: system,
which uses the buddy allocator, and cma, which uses the
[CMA](https://developer.toradex.com/software/linux-resources/linux-features/contiguous-memory-allocator-cma-linux/)
(Contiguous Memory Allocator). Both of these are [kernel configuration
options](https://github.com/torvalds/linux/blob/master/drivers/dma-buf/heaps/Kconfig)
that need to be enabled when building the Linux kernel. Generally, any
kernel module implementing this framework is made available under
`/dev/dma_heap/<heap_name>`, e.g. `/dev/dma_heap/system`.
The implementation currently only supports one type of DMA heap,
`system`, the default device path for which is `/dev/dma_heap/system`.
The path can be overridden at runtime using an environment variable,
`OCL_CTS_DMA_HEAP_PATH_SYSTEM`, if needed. Extending this in the future
should be trivial (subject to platform support), by adding an entry to
the enum `dma_buf_heap_type`, and an appropriate default path and
overriding environment variable name.
The proposed implementation is compiled only if the conditions are met
(i.e. building for Linux or Android, using kernel headers >= 5.6.0).
Otherwise it provides a compile-time warning and returns `-1` as the
DMA handle at runtime.
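
A minimal sketch of the allocation path described above, assuming the `<linux/dma-heap.h>` UAPI header is available. The function names `resolve_system_heap_path` and `allocate_dma_buf` and the `SKETCH_HAS_DMA_HEAP` guard are hypothetical and are not the names used in the patch. Only the default path, the environment variable, and `DMA_HEAP_IOCTL_ALLOC` come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

#if defined(__linux__) && defined(__has_include)
#if __has_include(<linux/dma-heap.h>)
#define SKETCH_HAS_DMA_HEAP 1 // hypothetical guard macro for this sketch
#include <fcntl.h>
#include <linux/dma-heap.h>
#include <sys/ioctl.h>
#include <unistd.h>
#endif
#endif

// Pick the heap device path: a non-empty override wins over the default.
inline std::string resolve_system_heap_path(const char* override_path)
{
    if (override_path != nullptr && override_path[0] != '\0')
        return override_path;
    return "/dev/dma_heap/system";
}

// Returns a DMA buffer file descriptor, or -1 on failure or when the
// DMA-BUF Heaps header is not available at build time.
inline int allocate_dma_buf(std::size_t size)
{
#ifdef SKETCH_HAS_DMA_HEAP
    const std::string path = resolve_system_heap_path(
        std::getenv("OCL_CTS_DMA_HEAP_PATH_SYSTEM"));
    const int heap_fd = open(path.c_str(), O_RDWR | O_CLOEXEC);
    if (heap_fd < 0) return -1;

    dma_heap_allocation_data request{}; // zero-initialise all fields
    request.len = size;
    request.fd_flags = O_RDWR | O_CLOEXEC;
    const int result = ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &request);
    close(heap_fd); // the heap device itself is no longer needed
    return result < 0 ? -1 : static_cast<int>(request.fd);
#else
    (void)size;
    return -1;
#endif
}
```

The caller owns the returned descriptor and would be expected to `close()` it once the CL buffer created from it has been released.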
To demonstrate the functionality, a new test is added for the
`cl_khr_external_memory_dma_buf` extension. If the extension is
supported by the device, a DMA buffer will be allocated and used to
create a CL buffer, that is then used by a simple kernel.
This should provide a way forward for adding more tests that depend on
DMA buffers.
---------
Signed-off-by: Gorazd Sumkovski <gorazd.sumkovski@arm.com>
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Co-authored-by: Gorazd Sumkovski <gorazd.sumkovski@arm.com>
Fixes #2247
* For the `negative_create_command_buffer_not_supported_properties`
test, the only property we can check for is simultaneous use. All other
properties are part of other extensions and hence will generate
`CL_INVALID_VALUE`, not `CL_INVALID_PROPERTY`.
* Checks whether the `cl_khr_command_buffer_multi_device` extension is
supported when using `CL_COMMAND_BUFFER_DEVICE_SIDE_SYNC_KHR`, instead
of `device_side_enqueue_support`.
* If the `cl_khr_command_buffer_multi_device` extension is NOT supported
and the `CL_COMMAND_BUFFER_DEVICE_SIDE_SYNC_KHR` command buffer creation
flag is used, the expected error code is `CL_INVALID_VALUE`, not
`CL_INVALID_PROPERTY`.
Update cl_khr_command_buffer tests to reflect changes from
https://github.com/KhronosGroup/OpenCL-Docs/pull/1292
* Moves negative test for
`CL_DEVICE_COMMAND_BUFFER_SUPPORTED_QUEUE_PROPERTIES_KHR` from
command-buffer creation to enqueue.
* Moves negative test for
`CL_DEVICE_COMMAND_BUFFER_REQUIRED_QUEUE_PROPERTIES_KHR` from
command-buffer creation to enqueue.
* Introduces a negative test for `CL_INVALID_DEVICE` on command-buffer
enqueue for the new error condition in the spec. However, it requires a
context that contains more than one device, which I'm not sure is
possible in the current test framework.
* Introduces a new test that creates a command-buffer using a queue
without the profiling property set, then enqueues the command-buffer to
a queue with the profiling property set.
* Introduces a new test that creates a command-buffer with an in-order
queue, enqueued on an out-of-order queue.
* Introduces a new test that creates a command-buffer with an
out-of-order queue, enqueued on an in-order queue.
It was noticed during another PR review
https://github.com/KhronosGroup/OpenCL-CTS/pull/2207/files#r1903921283
that there was a case where the return value of a `Skip()` check was
ignored. This is fixed in this PR.
I've also tracked down occurrences of a derived class overriding the
`Skip()` test fixture method without calling the parent class's `Skip()`
check inside the method. I believe omitting this parent skip check
wasn't intentional. It's clearer to explicitly respect the parent
class's skip conditions, even if we've got away with not needing to due
to the way the derived class skip conditions have been defined.
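
The pattern being enforced can be sketched as below. The class names, member flags, and the `derived_test_skips` helper are hypothetical stand-ins and are not the CTS fixture classes.

```cpp
#include <cassert>

// Hypothetical base fixture: skips when the extension is unsupported.
struct BaseFixture
{
    bool extension_supported = true;
    virtual ~BaseFixture() = default;
    virtual bool Skip() { return !extension_supported; }
};

// Hypothetical derived fixture: adds its own condition, but honours the
// parent's skip check first instead of silently replacing it.
struct DerivedFixture : BaseFixture
{
    bool feature_supported = true;
    bool Skip() override
    {
        if (BaseFixture::Skip()) return true;
        return !feature_supported;
    }
};

// Small driver for the sketch: configure a fixture and ask, through the
// base interface, whether the test should be skipped.
inline bool derived_test_skips(bool extension_supported,
                               bool feature_supported)
{
    DerivedFixture test;
    test.extension_supported = extension_supported;
    test.feature_supported = feature_supported;
    BaseFixture& as_base = test;
    return as_base.Skip();
}
```

If `DerivedFixture::Skip()` left out the `BaseFixture::Skip()` call, a device without the extension but with the feature flag set would run the test instead of skipping it.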
As described in Issue
https://github.com/KhronosGroup/OpenCL-CTS/issues/2152 we currently
check the provisional extension version supported by a vendor is the
same or older than a test specified version. However, it was discussed
in the WG that this should be a check for equality to avoid hitting
issues when an implementation tests against an older version of the CTS
using a lower extension version, with API breaking changes having
occurred since.
The tests for the command-buffer family of extensions are the only
provisional KHR tests using this versioning check, so this PR updates
all those cases to equality.
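
The difference between the two checks can be sketched as follows. The function names are hypothetical and the version numbers in the usage are purely illustrative. The packing mirrors the bit layout of `CL_MAKE_VERSION` (10 bits major, 10 bits minor, 12 bits patch).

```cpp
#include <cassert>
#include <cstdint>

// Pack a version the same way CL_MAKE_VERSION does.
constexpr std::uint32_t make_version(std::uint32_t major,
                                     std::uint32_t minor,
                                     std::uint32_t patch)
{
    return ((major & 0x3FFu) << 22) | ((minor & 0x3FFu) << 12)
        | (patch & 0xFFFu);
}

// Previous behaviour as described: run when the vendor's version is the
// same as or older than the version the test was written against.
constexpr bool old_check_runs_test(std::uint32_t supported,
                                   std::uint32_t tested)
{
    return supported <= tested;
}

// New behaviour: run only when the versions match exactly.
constexpr bool new_check_runs_test(std::uint32_t supported,
                                   std::uint32_t tested)
{
    return supported == tested;
}
```

Under the old check, an implementation reporting an older provisional version would still run tests written against a newer, possibly API-breaking version. The equality check skips them.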
When creating a CL semaphore object from a Vulkan semaphore one, we
explicitly pass `-1` as the file descriptor value in the case of
`VULKAN_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD`. According to the CL
specification:
> The special value -1 for fd is treated like a valid sync file
> descriptor referring to an object that has already signaled. The
> import operation will succeed and the semaphore will have a
> temporarily imported payload as if a valid file descriptor had
> been provided.
The test currently checks that the semaphore payload is unsignalled,
unconditionally, which is incorrect.
Changed the test to check for the correct expected payload value.
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Current class wrapper of the CTS test framework allows getting the
pointer of the private member object. This puts the object at risk of
losing its original value if the pointer gets reassigned, causing a
memory leak and potentially other problems.
This happens to the "clMemWrapper kernel" used by
mutable_command_work_groups tests, where the "clMemWrapper kernel" gets
initialised by the default basic setup function and is then
reassigned by the build_program_create_kernel_helper() helper function
through a pointer.
This patch fixes this issue by updating the mutable_command_work_groups
tests: instead of calling the basic setup function and then initialising
the "clMemWrapper kernel" object again in the helper function, they now
override the basic setup function to make sure the "clMemWrapper
kernel" is assigned only once.
Signed-off-by: Xin Jin <xin.jin@arm.com>
Currently Intel® C++ Compiler Classic (ICC) is supported to build
OpenCL-CTS on Windows. This compiler has been discontinued since the
second half of 2023. Instead, Intel recommends that users transition to
use the LLVM-based Intel® oneAPI DPC++/C++ Compiler (ICX).
This change is to enable users to build OpenCL-CTS with ICX on Windows.