Copyright:	(c) 2009-2013 by Apple Inc. All Rights Reserved.

math_brute_force test                                                   Feb 24, 2009
=====================

Usage:

        Please run the executable with --help for usage information.
	


System Requirements:

        This test requires support for correctly rounded single and double precision arithmetic.
The current version also requires a reasonably accurate operating system math library to 
be present. The OpenCL implementation must be able to compile kernels online. The test assumes
that the host system stores its floating point data according to the IEEE-754 binary single and 
double precision floating point formats. 


Test Completion Time:

        This test takes a while. Modern desktop systems can usually finish it in 1-3
days. Engineers doing OpenCL math library software development may find wimpy mode (-w)
a useful screen to quickly look for problems in a new implementation, before committing
to a lengthy test run. Likewise, it is possible to run just a range of tests, or specific
tests. See Usage above.


Test Design:

        This test is designed to do a somewhat exhaustive examination of the single
and double precision math library functions in OpenCL, for all vector lengths. Math 
library functions are compared against results from a higher precision reference 
function to determine correctness. All possible inputs are  examined for unary 
single precision functions.  Other functions are tested against a table of difficult 
values, followed by a few billion random values. If an error is found in a function,
the test for that function terminates early, reports an error, and moves on to the 
next test, if any.

The test currently doesn't support the half precision math functions covered in section 
9 of the OpenCL 1.0 specification, but does cover the half_func functions covered in 
section 6. It also doesn't test the native_<funcname> functions, for which any result 
is conformant.  

For the OpenCL 1.0 time frame, the reference library shall be the operating system 
math library, as modified by the test itself to conform to the OpenCL specification. 
That will help ensure that all devices on a particular operating system are returning 
similar results.  Going forward to future OpenCL releases, it is planned to gradually 
introduce a reference math library directly into the test, so as to reduce inter-
platform variance between OpenCL implementations. 

Generally speaking, this test will consider a result correct if it is one of the following:

        1) bitwise identical to the output of the reference function, 
                rounded to the appropriate precision

        2) within the allowed ulp error tolerance of the infinitely precise
                result (as estimated by the reference function)

        3) If the reference result is a NaN, then any NaN is deemed correct.

        4) if the device is running in FTZ mode, then the result is also correct
                if the infinitely precise result (as estimated by the reference
                function) is subnormal, and the returned result is a zero.

        5) if the device is running in FTZ mode, then we also calculate the 
                estimate of the infinitely precise result with the reference function 
                with subnormal inputs flushed to +- zero.  If any of those results 
                are within the error tolerance of the returned result, then it is 
                deemed correct.

        6) half_func functions may flush per 4&5 above, even if the device is not
                in FTZ mode.

        7) Functions are allowed to prematurely overflow to infinity, so long as 
                the estimated infinitely precise result is within the stated ulp 
                error limit of the maximum finite representable value of appropriate 
                sign

        8) Functions are allowed to prematurely underflow (and if in FTZ mode, 
                have behavior covered by 4&5 above), so long as the estimated
                infinitely precise result is within the stated ulp error limit
                of the minimum normal representable value of appropriate sign

        9) Some functions have limited range. Results of inputs outside that range
                are considered correct, so long as a result is returned.

        10) Some functions have infinite error bounds. Results of these functions
                are considered correct, so long as a result is returned.

        11) The test currently does not discriminate based on the sign of zero.
                We anticipate a later test will.

        12) The test currently does not check to make sure that edge cases called 
                out in the standard (e.g. pow(1.0, any) = 1.0) are exactly correct.
                We anticipate a later test will.

        13) The test doesn't check IEEE flags or exceptions. See section 7.3 of the 
                OpenCL standard.
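
A minimal sketch of how items 1 through 4 above could be combined for a single precision
result is given below. It is not the test's own code: the names ulp_error_f32 and
result_is_acceptable are hypothetical, the ulp computation is simplified (it uses the
float spacing in the binade of the reference and does not treat exact powers of two
specially), and items 5 through 10 are not modelled. It assumes an IEEE-754 host, as the
System Requirements section already does.

```c
#include <float.h>
#include <math.h>

/* Signed error of `got` relative to `ref`, measured in float ulps.
 * Assumption: `ref` is finite. */
static double ulp_error_f32(float got, double ref)
{
    int e = -125;                  /* exponent used when ref is zero */
    if (ref != 0.0) {
        (void)frexp(ref, &e);      /* ref = m * 2^e with 0.5 <= |m| < 1 */
        if (e < -125)
            e = -125;              /* subnormal floats are spaced 2^-149 apart */
    }
    return ((double)got - ref) / ldexp(1.0, e - 24);
}

/* Returns 1 if `got` would be accepted against reference `ref`, else 0.
 * max_ulps is the tolerance for the function; ftz is nonzero when the
 * device flushes subnormals to zero. */
static int result_is_acceptable(float got, double ref, double max_ulps, int ftz)
{
    /* Item 3: a NaN reference accepts any NaN. */
    if (isnan(ref))
        return isnan(got);
    if (isnan(got))
        return 0;

    /* Item 1: matches the reference rounded to float. A value comparison is
     * used, so the sign of zero is ignored (item 11). */
    if (got == (float)ref)
        return 1;

    /* Premature overflow (item 7) is not modelled in this sketch. */
    if (isinf(ref) || isinf(got))
        return 0;

    /* Item 2: within the allowed ulp tolerance. */
    if (fabs(ulp_error_f32(got, ref)) <= max_ulps)
        return 1;

    /* Item 4: in FTZ mode a zero is accepted for a subnormal reference. */
    if (ftz && got == 0.0f && fabs(ref) < FLT_MIN)
        return 1;

    return 0;
}
```

For example, result_is_acceptable(0.0f, 1e-40, 0.5, 1) accepts a flushed zero, while the
same call with ftz set to 0 rejects it.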



Performance Measurement:

        There is also some optional timing code available, currently turned off by default. 
It may be useful for tracking internal performance regressions, but it is not required to 
be part of the conformance submission.


If the test is believed to be in error:

The above correctness heuristics shall not be construed to be an alternative to the correctness 
criteria established by the OpenCL standard. An implementation shall be judged correct
or not on appeal based on whether it is within prescribed error bounds of the infinitely 
precise result. (The ulp is defined in section 7.4 of the spec.) If the input value corresponds
to an edge case listed in the OpenCL specification sections covering edge case behavior, or 
similar sections in the C99 TC2 standard (sections F.9 and G.6), then the function shall return
exactly that result, and the sign of a zero result shall be correct. In the event that the test 
is found to be faulty, resulting in a spurious failure result, the committee shall make a reasonable 
attempt to fix the test. If no practical and timely remedy can be found, then the implementation 
shall be granted a waiver.


Guidelines for reference function error tolerances:

        Errors are measured in ulps, and stored in a single precision representation. So as
to avoid introducing error into the error measurement due to error in the reference function
itself, the reference function should attempt to deliver 24 bits more precision than the test 
function return type. (All functions are currently either required to be correctly rounded or 
may have >= 1 ulp of error. This places the 1's bit at the LSB of the result, with 23 bits of 
sub-ulp accuracy. One more bit is required to avoid accrual of extra error due to round-to-
nearest behavior. If we start to require sub-ulp precision, then the accuracy requirements 
for reference functions increase.) Therefore reference functions for single precision should 
have 24+24=48 bits of accuracy, and reference functions for double precision should ideally 
have 53+24 = 77 bits of accuracy. 

A double precision system math library function should be sufficient to safely verify a single 
precision OpenCL math library function.  A long double precision math library function may or 
may not be sufficient to verify a double precision OpenCL math library function, depending on 
the precision of the long double type. A later version of these tests is expected to replace 
long double with a head+tail double double representation that can represent sufficient precision,
on all platforms that support double. 
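
The bit counts above can be checked on a given host with the significand widths from
<float.h>. The sketch below is only an illustration of that arithmetic; the helper names
are hypothetical, and it assumes FLT_RADIX is 2, which holds for IEEE-754 binary formats.

```c
#include <float.h>

/* Extra bits the guideline asks of a reference beyond the result type. */
#define REFERENCE_GUARD_BITS 24

/* Bits of accuracy a reference function should deliver for a result type
 * with the given significand width (24 for float, 53 for double). */
static int required_reference_bits(int result_mant_digits)
{
    return result_mant_digits + REFERENCE_GUARD_BITS;
}

/* Nonzero if double carries enough bits to verify float results (53 >= 48). */
static int double_can_verify_float(void)
{
    return DBL_MANT_DIG >= required_reference_bits(FLT_MANT_DIG);
}

/* Nonzero if this platform's long double carries the 77 bits wanted for
 * verifying double results. This varies by platform. */
static int long_double_can_verify_double(void)
{
    return LDBL_MANT_DIG >= required_reference_bits(DBL_MANT_DIG);
}
```

A long double with a 64-bit significand, as in x87 extended precision, falls short of the
77 bits, whereas a 113-bit quad precision long double meets it.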


Revision history:

 Feb 24, 2009                IRO        Created README
                                        Added some reference functions so the test will run on Windows.