- Make wimpy mode a common harness parameter, set with either '-w',
'--wimpy' or the 'CL_WIMPY_MODE' environment variable.
- Remove all test-specific wimpy variables.
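For illustration, the shared option handling could look roughly like the
sketch below. The helper name `wimpy_mode_requested` is hypothetical and
not taken from the harness, and treating any set value of
`CL_WIMPY_MODE` as enabling the mode is an assumption.

```cpp
#include <cstring>

// Hypothetical helper (not the harness API): returns true when wimpy mode
// is requested by '-w' / '--wimpy' on the command line or by the
// environment. env_value is whatever std::getenv("CL_WIMPY_MODE")
// returned, so it may be nullptr.
// Assumption: any set value of the variable enables the mode.
bool wimpy_mode_requested(int argc, const char* const argv[],
                          const char* env_value)
{
    if (env_value != nullptr) return true;
    for (int i = 1; i < argc; ++i)
    {
        if (std::strcmp(argv[i], "-w") == 0
            || std::strcmp(argv[i], "--wimpy") == 0)
            return true;
    }
    return false;
}
```

A caller would pass `std::getenv("CL_WIMPY_MODE")` as the last argument.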
---------
Co-authored-by: Kévin Petit <kpet@free.fr>
The '-list' option is used to print all sub-tests, but some tests do not
support it at all, and those that do print the list in different ways,
making it quite complicated for external tools to extract them.
This change cleans up the usage so that tests:
- Print the sub-tests list with either '-list' (to prevent breaking
legacy usage) or '--list' (to match other options)
- Do not print anything else when the option is used
Fixes #2387
Corrects the "correctly rounded" behavior for the math bruteforce tests.
Specifically:
* Only applies the `-cl-fp32-correctly-rounded-divide-sqrt` build option
for the `divide_cr` and `sqrt_cr` tests. The other tests do not receive
this build option. This means that there is a difference in the behavior
of the `divide` and `divide_cr` tests and the `sqrt` and `sqrt_cr`
tests, and the "correctly rounded" build option is not applied to the
fp16 or fp64 tests.
* Removes the build option to toggle testing the correctly rounded
divide and square root tests since it is no longer needed. Instead, the
test names can be used to choose whether to test the correctly rounded
functions or the non-correctly rounded functions.
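As a rough sketch of the selection described in the points above, the
build option can be derived from the test name alone. The function name
`build_options_for` is hypothetical; only the option string and the test
names come from this change.

```cpp
#include <string>

// Hypothetical helper: only the *_cr variants are built with the
// correctly rounded option; every other test gets no extra option.
std::string build_options_for(const std::string& test_name)
{
    if (test_name == "divide_cr" || test_name == "sqrt_cr")
        return "-cl-fp32-correctly-rounded-divide-sqrt";
    return "";
}
```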
Additionally:
* Relaxes the fp16 sqrt accuracy requirements to 1 ULP. This is needed
to pass this test on some of our devices. This part is still under
discussion, so I will keep this PR as a draft until it is settled.
* Ulp_Error*: ilogb(reference) - 1 may overflow if reference is zero.
* binary_i_double Test: DoubleFromUInt32's result is a cl_double and the
attempt is to store it as a cl_double, but p was defined as a pointer to
cl_ulong, resulting in an unintended implicit conversion that is not
valid for out-of-range doubles.
* exp2, tanpi: ensure early exit for NaN.
* shift_right_sticky_128: avoid out-of-range shift if shift value is
exactly 64.
* scalbn: e += n may overflow if n is large, move it after the check for
large n.
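To illustrate the `shift_right_sticky_128` item: shifting a 64-bit value
by 64 is undefined in C++, so a shift of exactly 64 needs its own
branch. The sketch below is not the test's actual code; the `U128` type
and the function name are assumptions made for this example.

```cpp
#include <cstdint>

struct U128
{
    uint64_t hi;
    uint64_t lo;
};

// Shift a 128-bit value right, OR-ing any bits shifted out into the least
// significant bit of the result ("sticky" bit). Every individual 64-bit
// shift amount below stays in the range 1..63.
U128 shift_right_sticky(U128 v, unsigned shift)
{
    if (shift == 0) return v;
    if (shift >= 128) return U128{ 0, uint64_t((v.hi | v.lo) != 0) };

    uint64_t lost;
    U128 r;
    if (shift < 64)
    {
        lost = v.lo & ((uint64_t{ 1 } << shift) - 1);
        r.lo = (v.lo >> shift) | (v.hi << (64 - shift));
        r.hi = v.hi >> shift;
    }
    else if (shift == 64)
    {
        // Dedicated branch: 'v.hi << (64 - shift)' would shift by 0 and
        // 'v.lo >> shift' by 64, the latter being out of range.
        lost = v.lo;
        r.lo = v.hi;
        r.hi = 0;
    }
    else
    {
        const unsigned s = shift - 64; // 1..63
        lost = v.lo | (v.hi & ((uint64_t{ 1 } << s) - 1));
        r.lo = v.hi >> s;
        r.hi = 0;
    }
    if (lost != 0) r.lo |= 1;
    return r;
}
```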
`LogBuildError` was only ever called after `clSetKernelArg`, but setting
a kernel argument has no impact on the program build log. Printing of
the actual build log in case of a build failure is already handled via
`create_single_kernel_helper`.
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Fixes #2145
As suggested by @svenvh, reciprocal has different precision requirements
than divide. This PR introduces a special path for reciprocal in
binar_float_operator to test reciprocal with relaxed math. If this PR
is approved, PR #2162 can be invalidated.
- To be able to have deterministic results it is useful to have a
mechanism to force the same count of workers.
- This commit doesn't change the default settings but expands
functionality.
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
It was disabled because of the lack of a reference implementation.
However, the reference implementation exists, so there is no reason not
to start testing these functions.
This makes it literally impossible for drivers to constant fold the
IsTininessDetectedBeforeRounding kernel. Sure, drivers arguably should
respect volatile here, but I'm not convinced this is actually required
by the spec in a very strict sense, because here there are no
side-effects possible in the first place.
And as far as I know, constant folding is allowed to give different
results than an actual GPU calculation would.
In any case, passing the constants via kernel arguments makes this
detection more reliable and one doesn't have to wonder why the fma test
is failing.
Side note: this was the last bug (known as of today) I had to fix in
order to be able to make a CL CTS submission for Apple Silicon devices.
When the math brute force test printed the platform version it always
printed information for the first platform in the system, which could
be different than the platform for the passed-in device. Fixed by
querying the platform from the passed-in device instead.
* grab latest from upstream OpenCL
* Use clEnqueueFillBuffer rather than memset4 in all test files
* Cleanup leftover code from memset_pattern4
* Remove unnecessary map, unmap, writeBuffer from math_brute_force tests
* Remove extraneous build system change
* Appease clang-format
* Add option to perform buffer fills on the host
Co-authored-by: Taeten Prettyman <taeten.j@gmail.com>
Co-authored-by: taetenp <taet@holochip.com>
Co-authored-by: Chip Davis <chip@holochip.com>
Simplify code by relying on RAII to free resources. Reduce code
duplication.
This commit only affects tests that use `BuildKernelInfo`, which are
the multi-threaded tests. Another patch will deal with the
single-threaded tests, i.e., those using `BuildKernelInfo2`.
Original patch by Marco Antognini.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
* math_brute_force: Fix -Wformat warnings
The main sources of warnings were:
* Printing of 64-bit types, which is now done using the `PRI*64`
macros from <cinttypes> to ensure portability across 32 and 64-bit
builds.
* Printing of `size_t` types that lacked a `z` length modifier.
* Printing of values with a `z` length modifier that weren't a
`size_t` type.
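A small example of the portable forms, with a hypothetical `describe`
helper written only for illustration:

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 64-bit value with PRIx64 and a size_t
// with the 'z' length modifier, so the format string matches the
// argument types on both 32-bit and 64-bit builds.
std::string describe(uint64_t bits, size_t count)
{
    char buf[96];
    std::snprintf(buf, sizeof(buf), "bits=0x%016" PRIx64 " count=%zu", bits,
                  count);
    return buf;
}
```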
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
* [NFC] math_brute_force: clang-format after -Wformat changes
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Improve the design of the MTdataHolder wrapper:
* Make it a class instead of a struct with a private member, to make
it clearer that there is no direct access to the MTdata member.
* Make the 1-arg constructor `explicit` to avoid unintended
conversions.
* Forbid copy construction/assignment as MTdataHolder is never
initialised from an MTdataHolder object in the codebase.
* Define move construction/assignment as per the "rule of five".
Use the MTdataHolder class throughout math_brute_force, to simplify
code by avoiding manual resource management.
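The design points above can be sketched on a stand-in resource. This is
not the real MTdataHolder: `FakeGenerator`, `fake_init`, `fake_free` and
`GeneratorHolder` are hypothetical names that replace the harness's
random generator API so the example is self-contained.

```cpp
#include <utility>

// Stand-in resource with a live-object counter, for illustration only.
struct FakeGenerator
{
    unsigned seed;
};
static int g_live = 0;
FakeGenerator* fake_init(unsigned seed)
{
    ++g_live;
    return new FakeGenerator{ seed };
}
void fake_free(FakeGenerator* g)
{
    if (g != nullptr)
    {
        --g_live;
        delete g;
    }
}
int live_count() { return g_live; }

class GeneratorHolder {
public:
    GeneratorHolder() = default;
    // explicit: no accidental conversion from an integer seed.
    explicit GeneratorHolder(unsigned seed): m_data(fake_init(seed)) {}

    // Copying is forbidden, moving transfers ownership.
    GeneratorHolder(const GeneratorHolder&) = delete;
    GeneratorHolder& operator=(const GeneratorHolder&) = delete;
    GeneratorHolder(GeneratorHolder&& other) noexcept
        : m_data(std::exchange(other.m_data, nullptr))
    {}
    GeneratorHolder& operator=(GeneratorHolder&& other) noexcept
    {
        if (this != &other)
        {
            fake_free(m_data);
            m_data = std::exchange(other.m_data, nullptr);
        }
        return *this;
    }
    ~GeneratorHolder() { fake_free(m_data); }

    // Read access to the handle for APIs that take the raw pointer.
    operator FakeGenerator*() const { return m_data; }

private:
    FakeGenerator* m_data = nullptr;
};
```

With such a holder, early returns no longer need a matching manual free
call on every path.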
Original patch by Marco Antognini.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
* Fix "‘nadj’ may be used uninitialized in this function
[-Werror=maybe-uninitialized]".
* Fix "specified bound 4096 equals destination size
[-Werror=stringop-truncation]".
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Commit 9666ca3c ("[NFC] Fix sign-compare warnings in math_brute_force
(#1467)", 2022-08-23) inadvertently changed the semantics of the if
condition. The `i > gEndTestNumber` comparison was relying on
`gEndTestNumber` being promoted to unsigned. When casting `i` to
`int32_t`, this promotion no longer happens and as a result any tests
given on the command line were being skipped.
Use an unsigned type for `gStartTestNumber` and `gEndTestNumber` to
eliminate the casts and any implicit conversions between signed and
unsigned types.
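The change in semantics can be reproduced in isolation. The sketch
below assumes, for illustration, a negative end marker such as -1; the
function names are hypothetical and the real condition has more terms.

```cpp
#include <cstdint>

// Original form: the signed end value is converted to unsigned, so a
// negative marker becomes a huge value and no test index exceeds it.
bool skipped_with_promotion(uint32_t i, int32_t end)
{
    return i > static_cast<uint32_t>(end);
}

// Form after the sign-compare fix: casting i to int32_t keeps the
// comparison signed, so every non-negative index exceeds a negative end.
bool skipped_with_cast(uint32_t i, int32_t end)
{
    return static_cast<int32_t>(i) > end;
}

// Form with unsigned start/end values: no casts, no implicit conversions.
bool skipped_unsigned(uint32_t i, uint32_t end) { return i > end; }
```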
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Simplify code by avoiding manual resource management.
This allows removing clReleaseProgram from `MakeKernels` to reduce
behavioral differences between `MakeKernels` and `MakeKernel`.
Original patch by Marco Antognini.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
* Avoid manual memory management
Prefer std::vector over malloc and free. This will allow removing goto
statements by leveraging RAII.
Use the appropriate type (bool) to store overflow predicates and allocate
std::vector<bool> of appropriate sizes: before this change the
allocation was unnecessarily bigger than required.
No longer attempt to catch "out of host memory" issues, given that in
such a situation it is generally not possible to cleanly report an error.
Rely on the std::bad_alloc exception to report such issues.
Introduce a new header for common code in the math_brute_force
component. It is currently complementary to utility.h and is expected to
hold cleaned up content extracted from future refactoring operations.
List all headers as source in CMake for better compatibility with IDEs.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Remove manual or unnecessary memset
In order to use non-POD types as fields of TestInfo, memset must be
replaced with a compatible zero-initialisation.
Remove an unnecessary memset in MakeKernels.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Cleanup usage of static, extern and typedef
Remove static on functions defined in headers, as it can result in
duplication in binaries.
Remove unnecessary extern keyword on a function declaration, as it is
the default behavior and can be puzzling when reading the code.
Remove the unused declaration of my_ilogb, which is never defined.
Remove unnecessary usage of typedef, as they are only increasing the
cognitive load of the code for no purpose.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Improve usage of inline and static in harness
Functions declared in header as static can trigger unused warnings when
(indirectly) included in translation units that do not use such
functions. Use inline instead, which also avoids duplicating symbols in
binaries.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Fix heap-buffer-overflow reported by AddressSanitizer: ensure the
appropriate number of elements are allocated for the list of tests.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
gWimpyBufferSize is never modified and is actually not used to modify
the number of tests -- gWimpyReductionFactor is used for that purpose by
some tests, but not all.
This patch removes this unnecessary global variable to simplify the
codebase, and reduce differences between tests.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Make variables and functions local to translation unit
Make some global variables local to function, or remove them when
actually dead.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Address comments
Remove unused code.
Reduce scope of gDoubleCapabilities.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Remove code for runtime measurement
The GetTime() and associated functions are not fully implemented on
Linux. This functionality is assumed to be untested, or unused at best.
Reduce differences between tests by removing this unnecessary feature.
It can be (re-)implemented later, if desired, once the math_brute_force
component is in better shape.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Coalesce if-statements
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Keep else branch
Address comments.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
To reduce differences between tests, remove APPLE specific code from
unary tests as no other tests have similar logic.
Ensure gMeasureTimes is consistently initialised regardless of operating
systems to ensure a consistent command line interface.
The remaining APPLE specific pieces of code relate either to include
paths, or to the implementation of PreventSleep(), ResumeSleep() and
GetTime(). Those are not removed in this commit.
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
Ensure the following forms of command lines are supported, as per usage
message (-h):
- math_brute_force [<name1> [<name2> ... [<nameN>]]]
- math_brute_force I [J]
Remove dead/unnecessary code.
Fix regression introduced in f337e0b6 (Fix command-line function range
for bruteforce (#1127), 2021-01-29).
Signed-off-by: Marco Antognini <marco.antognini@arm.com>
* Fix enqueue_flags test to use correct barrier type.
Currently, enqueue_flags test uses CLK_LOCAL_MEM_FENCE.
Use CLK_GLOBAL_MEM_FENCE instead as all threads across work-groups
need to wait here.
* Add check for support for Read-Write images
Read-Write images previously required OpenCL 2.x.
Read-Write image tests are already being skipped
for 1.x devices.
With OpenCL 3.0, read-write images being optional,
the tests should be run or skipped
depending on the implementation support.
Add a check to decide if Read-Write images are
supported or required to be supported depending
on OpenCL version and decide if the tests should
be run or skipped.
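The decision can be sketched as a pure function. That support on 3.0
devices is detected through a non-zero
CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS value is an assumption of this
sketch, the function name is hypothetical, and general image support is
taken as given.

```cpp
// major: OpenCL major version reported by the device.
// max_rw_image_args: value reported for CL_DEVICE_MAX_READ_WRITE_IMAGE_ARGS
// (assumed here to be the indicator of read-write image support).
bool should_run_read_write_image_tests(unsigned major,
                                       unsigned max_rw_image_args)
{
    if (major < 2) return false; // 1.x: no read-write images, skip
    if (major == 2) return true; // 2.x: support is required, run
    return max_rw_image_args > 0; // 3.0 and later: optional feature
}
```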
Fixes issue #894
* Fix formatting in case of Read-Write image checks.
Fix formatting in case of Read-write image checks.
Also, combine two ifs into one in case of
kernel_read_write tests
* Fix some more formatting for RW-image checks
Remove unnecessary spaces at various places.
Also, fix lengthy lines.
* Fix malloc-size calculation in test imagedim
unsigned char size is silently assumed to be 1
in imagedim test of test_basic.
Pass sizeof(type) in malloc size calculation.
Also, change loop variable from signed to unsigned.
Add checks for null pointer for malloced memory.
* Fix command-line function range for bruteforce
Running "test_bruteforce N M" is expected to skip
first N functions and test M functions after it.
When N is 0, the test currently skips M functions
and runs all functions thereafter.
Fix the test to honor semantics of these
command-line options to correctly test
first M functions when N is 0.
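The expected semantics can be written down as a small selection
function. `select_range` is a hypothetical name, and clamping the range
to the number of available functions is an assumption of this sketch.

```cpp
#include <cstddef>
#include <vector>

// Returns the indices of the functions to test: skip the first 'skip'
// functions, then take at most 'count' functions (clamped to 'total').
std::vector<size_t> select_range(size_t total, size_t skip, size_t count)
{
    std::vector<size_t> selected;
    for (size_t i = skip; i < total && i - skip < count; ++i)
        selected.push_back(i);
    return selected;
}
```

With N equal to 0 this selects the first M functions, which is the case
the fix restores.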
* Using helper functions for clCreateKernel
Uses of clCreateKernel following create program helper
functions, have been incorporated into
create_single_kernel_helper when suitable.
Contributes #31
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
* Skip tests using clCompileProgram in offline mode
Contributes #31
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
* Using type wrappers when using kernel helper functions
Also includes fix for windows build
Fixes #31
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
* Remove clReleaseKernel for wrapped kernel
Fixes #31
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
* Permit half overflow within allowable ULP
Modify the algorithm for calculating half precision ULP error so
that it duplicates the behaviour of the single precision ULP algorithm,
in regards to allowing overflow within the defined ULP error.
In the case where the test value is infinity, but the reference is
finite, pretend the test value is 65536.0 and calculate the ULP error
against that.
Encountered this while testing half precision `hypot()` in PR !529,
for inputs `hypot(-48864.0, 43648.0)` which has reference
`65519.755799`. With RTE rounding this only just rounds to `65504` as half,
and returning INF currently counts as an infinite ULP error. With the
leniency introduced by this change, however, the error is `~0.5`, within
the `2` ULP bounds defined by the spec.
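The arithmetic for this case can be checked with a simplified sketch.
It is not the harness implementation: it only handles references in the
normal half range, the function name is hypothetical, and it
substitutes 65536.0 (2^16, the first power of two above the largest
finite half value 65504) for an infinite test value.

```cpp
#include <cmath>

// Simplified half-precision ULP error against a double reference.
// Assumption: |reference| is a normal half-range value, so one ULP is
// 2^(exponent - 10) for the 10 explicit mantissa bits of half.
double half_ulp_error(double test, double reference)
{
    if (std::isinf(test) && std::isfinite(reference))
        test = std::copysign(65536.0, test); // treat overflow as 2^16
    const int exponent = std::ilogb(reference);
    const double ulp = std::ldexp(1.0, exponent - 10);
    return (test - reference) / ulp;
}
```

For the reference `65519.755799` one ULP is 32, so an INF result is
scored as `(65536 - 65519.755799) / 32`, roughly 0.51 ULP.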
* Run clang-format over changes
Code now conforms to style guidelines and allows `check-format.sh` to pass.
* OpenCL versions before 2.0 do not have precision requirements for
reduced precision math.
* Skip reduced precision testing for devices with
versions < 2.0.
* The global variable `gTestFastRelaxed` has state which is used to
control the behaviour of the compiler flag `-cl-fast-relaxed-math` and
the precision testing of relaxed, fp32 and fp64 types. This is confusing
since the global variable is being set and read in different translation
units, making it very difficult to reason about the logic of the brute
force framework. It is particularly difficult to follow since the global
variable is cached and then turned off in the case of fp32 and fp64 in
order to use the same code path as relaxed testing, after which it is
turned back on.
* Remove uses of the global variable outside of `main.cpp` (the global
variable remains in use within `main.cpp` since it is a command line
option and used to turn off relaxed testing completely). Replace all uses
of the global variable with the boolean `relaxedMode`, which is passed as
a function parameter but replaces `gTestFastRelaxed` semantically.
* Enable -Werror for GCC/Clang builds
Fixes many of the errors this produces, and disables a handful that
didn't have solutions that were obvious (to me).
* Check for `-W*` flags empirically
* Remove cl_APPLE_fp64_basic_ops support
* Undo NAN conversion fix
* Add comments to warning override flags
* Remove unneeded STRINGIFY definition
* Fix tautological compare issue in basic
* Use ABS_ERROR macro in image tests
* Use fabs for ABS_ERROR macro
* Move ABS_ERROR definition to common header
(Patch2)
A number of tests have their own code for checking the presence of
extensions. This change replaces that code with the
`is_extension_available` function.
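For context, the kind of per-test check being replaced typically looks
something like the stand-in below. This is not the harness's
`is_extension_available` implementation; `extension_listed` is a
hypothetical function that matches whole names in a space-separated
extension string.

```cpp
#include <sstream>
#include <string>

// Stand-in check: true when 'name' appears as a complete token in the
// space-separated extension string. Whole-token comparison avoids false
// positives from names that are prefixes of other names.
bool extension_listed(const std::string& extensions, const std::string& name)
{
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token)
    {
        if (token == name) return true;
    }
    return false;
}
```

Having one shared helper means every test applies the same matching
rule.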
Contributes to #627
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
Change-Id: I17e007e5ad009e522c5006c42537bf1170550a6f