The main source of warnings was the use of `%d` for printing a
templated type `T`, where `T` could be any cl_ scalar or vector type.
Introduce `print_expected_obtained`. It takes const references to
handle alignment of the cl_ types.
Define `operator<<` for all types used by the subgroup tests. Ideally
those would be template functions enabled by TypeManager data, but
that requires some more work on the TypeManager (which we'd ideally do
after more warnings have been enabled). So for now, define the
`operator<<` instances using preprocessor defines.
Also fix a few instances where the wrong format specifier was used for
`size_t` types.
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
As per the OpenCL Extension Specification § 38.6 Ballots:
If no bits representing predicate values from all work items in
the subgroup are set in the bitfield value then the return value
is undefined.
The case with no bits set is still worth testing, as it does not result
in undefined behavior, but only an undefined return value.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
Note that this also corrects the start messages logged for the
sub_group_ballot_bit_count/find_msb/find_lsb tests.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
It seems more intuitive to set only the bits that are required, rather
than to set one more bit than is required, only to clear it again.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
sub_group_ballot_bit_count() and sub_group_ballot_find_msb() mask
their input according to a subgroup size, which is assumed to be the
maximum subgroup size, and not the actual subgroup size excluding
non-existent work-items in the "remainder" subgroup.
Fix this as per the the clarification made to the OpenCL C specification
in revision 3.0.9 for issue KhronosGroup/OpenCL-Docs#626 by pull request
KhronosGroup/OpenCL-Docs#689.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
The tests were logging scalar results as vectors padded with zeroes for
no apparent benefit. Fix this.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
* Extended subgroups - use 128bit masks
* Refactoring to avoid kernels code duplication
* unification kernel names as test_ prefix +subgroups function name
* use string literals that improve readability
* use kernel templates that limit code duplication
* WorkGroupParams allows define default kernel - kernel template for multiple functions
* WorkGroupParams allows define kernel for specific one subgroup function
Co-authored-by: Stuart Brady <stuart.brady@arm.com>
* Report unsupported extended subgroup tests as skipped rather than passed
Also don't check the presence of extensions for each sub-test.
Signed-off-by: Kévin Petit <kpet@free.fr>
* address review comments