The test now checks that CL_KERNEL_ARG_INFO_NOT_AVAILABLE is returned
when calling clGetKernelArgInfo() with offline compilation modes.
The correct function name is printed if clGetKernelArgInfo() fails
when using online compilation (and not "clSetKernelArgInfo()").
When using online compilation, if the actual arg type is not as
expected, the actual arg type is now logged, and the return value
is now TEST_FAIL (-1) as per other failures (and not 1).
All other test pass/fail values used in the test now use TEST_PASS
and TEST_FAIL instead of 0 and -1 literals.
An unnecessary cast of pipe_kernel_code has been removed.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
This changes compilation of subgroup test kernels so that a separate
compilation is no longer performed for each divergence mask value.
The divergence mask is now passed as a kernel argument.
This also fixes all subgroup_functions_non_uniform_arithmetic testing
and the sub_group_elect and sub_group_any/all_equal subtests of the
subgroup_functions_non_uniform_vote test to use the correct order of
vector components for GPUs with a subgroup size greater than 64.
The conversion of divergence mask bitsets to uint4 vectors has been
corrected to match code comments in WorkGroupParams::load_masks()
in test_conformance/subgroups/subhelpers.h.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
* Fix test_api get_command_queue_info
Decouple host and device out-of-order test enabling
* Rename property sets more generically
* Refactor to use std::vector to accumulate test permutations
* Set safe input values for half type and mul, add operations
* Set safe values for all data types
* Typo fix
* Set constant seed for shuffle
* Change function name to more specific
* set_value takes an integer value, not a bit pattern
Note that this also corrects the start messages logged for the
sub_group_ballot_bit_count/find_msb/find_lsb tests.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
* test api - fix code formatting only
* Fix printing cl_ulong type to avoid overloading.
* Fix printing size_t data type
* Fix printing size_t data type - set unsigned
* Fix formatting for maxArgs (uint) and numberOfInts (size_t)
It seems more intuitive to set only the bits that are required, rather
than to set one more bit than is required, only to clear it again.
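As an illustration of the approach described above, the following minimal sketch builds a mask by setting only the bits that are needed. The helper name `low_bits_mask` and the 128-bit width are assumptions made for this example and are not taken from the test sources.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the test sources): returns a 128-bit mask
// with the lowest `count` bits set and every other bit clear.
std::bitset<128> low_bits_mask(std::size_t count)
{
    assert(count <= 128);
    std::bitset<128> mask; // all bits start cleared
    for (std::size_t i = 0; i < count; ++i)
    {
        mask.set(i); // set only the bits that are required
    }
    return mask;
}
```

No bit beyond `count` is ever touched, so there is nothing to clear afterwards.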
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
sub_group_ballot_bit_count() and sub_group_ballot_find_msb() mask
their input according to a subgroup size, which is assumed to be the
maximum subgroup size, and not the actual subgroup size excluding
non-existent work-items in the "remainder" subgroup.
Fix this as per the clarification made to the OpenCL C specification
in revision 3.0.9 for issue KhronosGroup/OpenCL-Docs#626 by pull request
KhronosGroup/OpenCL-Docs#689.
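The masking rule above can be sketched on the host side as follows. This is only an illustration: the function names are hypothetical, and it assumes that a work-group is split into consecutive subgroups of the maximum size, with the last one being the "remainder" subgroup.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of work-items that actually exist in subgroup
// `sg_id`, assuming consecutive subgroups of `max_sg_size` work-items.
std::size_t actual_subgroup_size(std::size_t local_size,
                                 std::size_t max_sg_size, std::size_t sg_id)
{
    assert(max_sg_size > 0 && sg_id * max_sg_size < local_size);
    const std::size_t remaining = local_size - sg_id * max_sg_size;
    return remaining < max_sg_size ? remaining : max_sg_size;
}

// Hypothetical reference: count only the ballot bits that belong to
// work-items present in the subgroup, ignoring any higher bits.
std::size_t reference_ballot_bit_count(const std::bitset<128>& ballot,
                                       std::size_t sg_size)
{
    assert(sg_size <= 128);
    std::size_t count = 0;
    for (std::size_t i = 0; i < sg_size; ++i)
    {
        if (ballot.test(i)) ++count;
    }
    return count;
}
```

With a local size of 100 and a maximum subgroup size of 64, the second subgroup holds 36 work-items, so only the lowest 36 ballot bits contribute to the count.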
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
The way that program sources were being constructed involved capturing
pointers to strings that were allocated on the stack, and then trying
to use them outside of that scope. This change uses a stringstream
defined in the outer scope to build the program instead.
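A minimal sketch of the stringstream approach is shown here. The function name, the typedef lines and the kernel body are invented for illustration and do not reproduce the actual test sources.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical source builder: accumulates program text in a stream that
// outlives every temporary string used to produce it.
std::string build_program_source(const std::vector<std::string>& type_names)
{
    assert(!type_names.empty()); // the sketch expects at least one type
    std::stringstream source;    // lives for the whole function
    for (const std::string& name : type_names)
    {
        // `line` is destroyed at the end of each iteration, so keeping
        // line.c_str() for later use would leave a dangling pointer.
        const std::string line = "typedef " + name + " T_" + name + ";\n";
        source << line; // the characters are copied into the stream
    }
    source << "__kernel void test_fn(__global int *out) { out[0] = 0; }\n";
    return source.str();
}
```

Because the stream owns a copy of every character, the returned string stays valid regardless of when the per-iteration temporaries are destroyed.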
The tests were logging scalar results as vectors padded with zeroes for
no apparent benefit. Fix this.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
* Extended subgroups - use 128bit masks
* Refactoring to avoid kernels code duplication
* unify kernel names as test_ prefix + subgroup function name
* use string literals to improve readability
* use kernel templates to limit code duplication
* WorkGroupParams allows defining a default kernel (a kernel template for multiple functions)
* WorkGroupParams allows defining a kernel for one specific subgroup function
Co-authored-by: Stuart Brady <stuart.brady@arm.com>
* Report unsupported extended subgroup tests as skipped rather than passed
Also don't check the presence of extensions for each sub-test.
Signed-off-by: Kévin Petit <kpet@free.fr>
* address review comments
* Update cl_khr_integer_dot_product tests for v2
Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Signed-off-by: Marco Cattani <marco.cattani@arm.com>
Change-Id: I97dbd820f1f32f6b377e47d0bf638f36bb91930a
* only query acceleration properties with v2+
Change-Id: I3f13a0cba7f1f686365b10adf81690e089cd3d74
* gles: Fix double frees.
Remove a few explicit frees in the redirect_buffers test which are
already handled by a wrapper.
* gles: Fix double frees
A recent update to the object wrapper classes (#1268) changed the
behavior of assigning to a wrapper, whereby the wrapped object is now
released upon assignment. A couple of tests were manually calling
clReleaseMemObject and then assigning `nullptr` to the wrapper,
resulting in the wrapper calling clReleaseMemObject on an object that
had already been destroyed.
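The failure mode described above can be reproduced with a small stand-in. Everything in this sketch is hypothetical: `FakeMem` and `fake_release` merely imitate a reference-counted object and clReleaseMemObject, and `MemWrapper` is a simplified model of a wrapper that releases on assignment, not the harness class itself.

```cpp
#include <cassert>

// Stand-in for an OpenCL memory object; it only counts release calls.
struct FakeMem
{
    int release_calls = 0;
};

// Stand-in for clReleaseMemObject (assumption: ignores null handles here).
void fake_release(FakeMem* mem)
{
    if (mem != nullptr) ++mem->release_calls;
}

// Simplified wrapper that releases the wrapped object on assignment.
class MemWrapper {
public:
    explicit MemWrapper(FakeMem* mem): object_(mem) {}
    MemWrapper(const MemWrapper&) = delete;
    MemWrapper& operator=(const MemWrapper&) = delete;
    ~MemWrapper() { fake_release(object_); }

    MemWrapper& operator=(FakeMem* mem)
    {
        fake_release(object_); // the old object is released here
        object_ = mem;
        return *this;
    }
    operator FakeMem*() const { return object_; }

private:
    FakeMem* object_ = nullptr;
};

// Old test pattern: manual release, then assignment of nullptr.
int release_count_manual_then_assign()
{
    FakeMem mem;
    {
        MemWrapper wrapper(&mem);
        fake_release(wrapper); // manual release
        wrapper = nullptr;     // the wrapper releases the same object again
    }
    return mem.release_calls;
}

// Fixed pattern: let the wrapper perform the only release.
int release_count_assign_only()
{
    FakeMem mem;
    {
        MemWrapper wrapper(&mem);
        wrapper = nullptr;
    }
    return mem.release_calls;
}
```

In this model the old pattern releases the object twice while the fixed pattern releases it once, which is why the explicit release calls were dropped from the tests.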
Co-authored-by: spauls <spauls@qti.qualcomm.com>
* Fix memory model issue in atomic_flag.
In atomic_flag sub-tests that modify local memory, compilers may re-order memory accesses between the local and global address spaces which can lead to incorrect test failures.
This commit ensures that both local and global memory operations are fenced to prevent this re-ordering from occurring.
Fixes #134.
* Clang format changes.
* Added missing global acquire which is necessary for the corresponding global release.
Thanks to @jlewis-austin for spotting.
* Clang format changes.
* Match the condition for applying acquire/release fences.
* Temporarily disable the test_kernel_attributes test case
Per OpenCL spec on CL_KERNEL_ATTRIBUTES, for kernels not created from OpenCL C
source and the clCreateProgramWithSource API call the string returned from this
query will be empty.
But the test_kernel_attributes test reads from a bc binary and expects to get
kernel attributes, which is not consistent with the OpenCL spec.
* Fix clang format issue
* Add tests for entrypoint cl_khr_suggested_local_work_size
Tests added within test_conformance/workgroups. The tests cover several
shapes (num dimensions) and sizes of global work size, kernels using
local memory (dynamic and static) and present/non-present global work
offset.
Signed-off-by: Kallia Chronaki <kallia.chronaki@arm.com>
* Fix in comparison for error checking
Signed-off-by: Kallia Chronaki <kallia.chronaki@arm.com>
* 'test_wg_suggested_local_work_size' fixes
* Refactoring of 'test_wg_suggested_local_work_size'
Modifications to reduce code duplication and minimize build time
* subgroups: Fix setting cl_halfs and progress check.
cl_float testing uses set_value such that a generated cl_ulong of 1 is
stored as 1.0F in a logical sense. However, cl_half values aren't
intrinsic to C++ and generated cl_ulongs less than 1024 in particular
are interpreted bitwise as subnormals. The test fails on compute devices
lacking subnormal support. Perform the logical conversion to cl_half.
Fix independent forward progress check.
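The difference between the bitwise and the logical interpretation can be illustrated without any OpenCL headers. The helpers in this sketch are hypothetical and handle only small non-negative integers, which are exactly representable in IEEE 754 binary16. The test itself may use a different conversion routine.

```cpp
#include <cassert>
#include <cstdint>

// True if `bits` encodes a subnormal half: zero exponent, non-zero mantissa.
bool is_half_subnormal(std::uint16_t bits)
{
    return (bits & 0x7C00) == 0 && (bits & 0x03FF) != 0;
}

// Hypothetical helper: binary16 encoding of an integer in [0, 2047].
// Every integer in that range is exactly representable as a half.
std::uint16_t half_bits_from_small_int(std::uint32_t value)
{
    assert(value < 2048);
    if (value == 0) return 0;

    // Position of the highest set bit, which is the unbiased exponent.
    int exponent = 0;
    while ((value >> (exponent + 1)) != 0) ++exponent;

    // Bits below the leading one, left-aligned in the 10-bit mantissa field.
    const std::uint32_t mantissa = (value - (1u << exponent))
        << (10 - exponent);
    const std::uint32_t biased_exponent =
        static_cast<std::uint32_t>(exponent + 15);
    return static_cast<std::uint16_t>((biased_exponent << 10) | mantissa);
}
```

The raw bit pattern 0x0001 is a subnormal half, whereas the logical value 1 encodes as 0x3C00, which is a normal number and therefore does not depend on subnormal support.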
* subgroups_half: Address review comments
* subgroups_half: Formatting fixes required by check-format
* subgroups_half: Modified to query and use rounding mode supported by device
Co-authored-by: spauls <spauls@qti.qualcomm.com>
* add basic test for cl_khr_pci_bus_info
* correctly use TEST_SKIPPED_ITSELF
Co-authored-by: Kévin Petit <kpet@free.fr>
* fix related usage of TEST_SKIPPED_ITSELF
Co-authored-by: Kévin Petit <kpet@free.fr>
* Remove unnecessary code
These custom equality operators are not necessary because of the
conversion operators which already allow using the standard equality
operators between two pointers.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Fix copy and move semantics of wrapper classes
Related to #465.
The Wrapper classes are rewritten to properly handle copy and move
semantics, while preserving the existing API and removing code
duplication.
Add error handling around clRelease* and clRetain*.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Address build issue on 32-bit Windows
Include linkage in RetainReleaseType function type.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Minor fixes for CL_UNORM_SHORT_565, CL_UNORM_SHORT_555
* Fix verification for undefined bit
* Relax the current infinite precision requirement for these formats
  and move the check into a common function.
* Add proper debug output.
Signed-off-by: John Kesapides <john.kesapides@arm.com>
* Minor formatting fix.
Signed-off-by: John Kesapides <john.kesapides@arm.com>
The CL_UNORM_SHORT_555 and CL_UNORM_INT_101010 formats contain padding
bits which need to be ignored in clCopyImage and clFillImage testing.
For clFillImage tests, padding was not ignored for the CL_UNORM_SHORT_555
format, and was ignored for CL_UNORM_INT_101010 by modifying actual and
reference data. For clCopyImage tests, padding was not ignored, both for
CL_UNORM_SHORT_555 and for CL_UNORM_INT_101010.
Fix this by adding a new compare_scanlines() function, which is used for
both of these formats, and does not modify the actual or reference data.
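The idea of a non-modifying comparison that ignores padding can be sketched as follows. This is not the compare_scanlines() function added by the change: the name `first_mismatch_555` and its signature are hypothetical, and the sketch assumes an x-5-5-5 layout in which the most significant bit of each 16-bit pixel is the padding bit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical comparison for 16-bit x-5-5-5 pixels. Returns the index of
// the first pixel whose colour bits differ, or the scanline length if the
// scanlines match. Neither input is modified.
std::size_t first_mismatch_555(const std::vector<std::uint16_t>& actual,
                               const std::vector<std::uint16_t>& reference)
{
    assert(actual.size() == reference.size());
    const std::uint16_t colour_mask = 0x7FFF; // drop the padding bit (bit 15)
    for (std::size_t x = 0; x < actual.size(); ++x)
    {
        if ((actual[x] & colour_mask) != (reference[x] & colour_mask))
        {
            return x;
        }
    }
    return actual.size();
}
```

Masking inside the comparison leaves both buffers untouched, so the original pixel values remain available for logging when a mismatch is reported.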
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
* Avoid manual memory management
Prefer std::vector over malloc and free. This will allow removing goto
statements by leveraging RAII.
Use appropriate type (bool) to store overflow predicates and allocate
std::vector<bool> of appropriate sizes: before this change the
allocation was larger than required.
No longer attempt to catch "out of host memory" issues, given that in
such a situation it is generally not possible to cleanly report an error.
Rely on std::bad_alloc exception to report such issues.
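The pattern described above might look like the following sketch. The function name and the use of unsigned wrap-around as the "overflow" predicate are assumptions chosen to keep the example small. They do not mirror the math_brute_force code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical example: flag which element-wise sums wrap around.
// The std::vector<bool> is sized exactly to the input and is freed
// automatically on every return path, so no goto-based cleanup is needed.
std::vector<bool> find_overflows(const std::vector<std::uint32_t>& a,
                                 const std::vector<std::uint32_t>& b)
{
    assert(a.size() == b.size());
    // Allocation failure surfaces as std::bad_alloc; it is not caught here.
    std::vector<bool> overflow(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        const std::uint32_t sum = static_cast<std::uint32_t>(a[i] + b[i]);
        overflow[i] = sum < a[i]; // wrap-around means the sum overflowed
    }
    return overflow;
}
```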
Introduce a new header for common code in the math_brute_force
component. It is currently complementary to utility.h and is expected to
hold cleaned up content extracted from future refactoring operations.
List all headers as source in CMake for better compatibility with IDEs.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Remove manual or unnecessary memset
In order to use non-POD types as fields of TestInfo, memset must be
replaced with a compatible zero-initialisation.
Remove an unnecessary memset in MakeKernels.
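A minimal sketch of the zero-initialisation that replaces memset is given here. The struct and its field names are hypothetical stand-ins, not the real TestInfo definition.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a struct that gained a non-POD member.
// memset on such a type would overwrite the vector's internal state.
struct TestInfoSketch
{
    std::size_t thread_count;
    double max_ulps;
    std::vector<int> per_thread_results;
};

// Empty braces value-initialise every member: the scalars become zero and
// the vector is default-constructed (empty).
TestInfoSketch make_test_info()
{
    return TestInfoSketch{};
}

// Usage check: every field starts out zeroed or empty.
void check_zero_initialised()
{
    const TestInfoSketch info = make_test_info();
    assert(info.thread_count == 0);
    assert(info.max_ulps == 0.0);
    assert(info.per_thread_results.empty());
}
```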
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
A program having a type (such as ThreadInfo) defined differently in
multiple translation units exhibits undefined behaviour.
This commit fixes such issues in the math_brute_force component by
ensuring most types are local to their translation unit with the help of
anonymous namespaces. Later refactoring will be able to extract common
definitions to a single place.
This patch also removes unnecessary static and typedef keywords.
Otherwise, code is only moved around with no change.
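The anonymous-namespace technique mentioned above can be sketched in a single translation unit as follows. The `job_count` field and the `total_jobs` function are hypothetical and only serve to give the type some use.

```cpp
#include <cassert>
#include <cstddef>

namespace {

// Local to this translation unit: another .cpp file may define its own,
// different ThreadInfo inside its own anonymous namespace without conflict.
struct ThreadInfo
{
    std::size_t job_count; // hypothetical field
};

// Also local to this translation unit.
std::size_t total_jobs(const ThreadInfo* threads, std::size_t thread_count)
{
    assert(threads != nullptr || thread_count == 0);
    std::size_t total = 0;
    for (std::size_t i = 0; i < thread_count; ++i)
    {
        total += threads[i].job_count;
    }
    return total;
}

} // anonymous namespace
```

Names declared this way have internal linkage, so two source files can each keep their own variant of the type until a later refactoring moves a shared definition into a common header.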
Signed-off-by: Marco Antognini <marco.antognini@arm.com>