OpenCL-CTS

mirror of https://github.com/KhronosGroup/OpenCL-CTS.git synced 2026-03-21 06:49:02 +00:00

Ben Ashbaugh 620c689919 update fp16 staging branch from main (#1903)

* allocations: Move results array from stack to heap (#1857)

* allocations: Fix stack overflow

* check format fixes

* Fix windows stack overflow. (#1839)

* thread_dimensions: Avoid combinations of very small LWS and very large GWS (#1856)

Modify the existing condition to include extremely small LWS like
1x1 on large GWS values

* c11_atomics: Reduce the loopcounter for sequential consistency tests (#1853)

Reduce the loop from 1000000 to 500000 since the former value
makes the test run too long and cause system issues on certain
platforms

* Limit individual allocation size using the global memory size (#1835)

Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>

* geometrics: fix Wsign-compare warnings (#1855)

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* integer_ops: fix -Wformat warnings (#1860)

The main sources of warnings were:

 * Printing of a `size_t` which requires the `%zu` specifier.

 * Printing of `cl_long`/`cl_ulong` which is now done using the
   `PRI*64` macros to ensure portability across 32 and 64-bit builds.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Replace OBSOLETE_FORAMT with OBSOLETE_FORMAT (#1776)

* Replace OBSOLETE_FORAMT with OBSOLETE_FORMAT

In imageHelpers.cpp and a few other places in image tests, OBSOLETE_FORMAT is misspelled as OBSOLETE_FORAMT.
Fix the misspelling by replacing it with OBSOLETE_FORMAT.

Fixes #1769

* Remove code guarded by OBSOLETE_FORMAT

Remove code guarded by OBSOLETE_FORMAT
as suggested by review comments

Fixes #1769

* Fix formatting issues for OBSOLETE_FORMAT changes

Fix formatting issues observed in files while removing
code guarded by OBSOLETE_FORMAT

Fixes #1769

* Some more formatting fixes

Some more formatting fixes to get CI clean

Fixes #1769

* Final formatting fixes

Final formatting fixes for #1769

* Enhancement: Thread dimensions user parameters (#1384)

* Fix format in the test scope

* Add user params to limit testing

Add parameters to reduce amount of testing.
Helpful for debugging or for machines with lower performance.

* Restore default value

* Print info only if testing params bigger than 0.

* [NFC] conversions: reenable Wunused-but-set-variable (#1845)

Remove an assigned-to but unused variable.

Reenable the Wunused-but-set-variable warning for the conversions
suite, as it now compiles cleanly with this warning enabled.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Fix bug of conversion from long to double (#1847)

* Fix bug of conversion from long to double

If the input is of long type, it should be loaded as long, not ulong.

* update long2float

* math_brute_force: fix exp/exp2 rlx ULP calculation (#1848)

Fix the ULP error calculation for the `exp` and `exp2` builtins in
relaxed math mode for the full profile.

Previously, the `ulps` value kept being added to while verifying the
result buffer in a loop.  `ulps` could even become a `NaN` when the
input argument being tested was a `NaN`.

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Enable LARGEADDRESSAWARE for 32 bit compilation (#1858)

* Enable LARGEADDRESSAWARE for 32 bit compilation

32-bit executables built with MSVC linker have only 2GB virtual memory
address space by default, which might not be sufficient for some tests.

Enable LARGEADDRESSAWARE linker flag for 32-bit targets to allow tests
to handle addresses larger than 2 gigabytes.

https://learn.microsoft.com/en-us/cpp/build/reference/largeaddressaware-handle-large-addresses?view=msvc-170

Signed-off-by: Guo, Yilong <yilong.guo@intel.com>

* Apply suggestion

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

---------

Signed-off-by: Guo, Yilong <yilong.guo@intel.com>
Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

* fix return code when readwrite image is not supported (#1873)

This function (do_test) starts by testing write and read individually.
Both of them can have errors.

When readwrite image is not supported, the function returns
TEST_SKIPPED_ITSELF potentially masking errors leading to the test
returning EXIT_SUCCESS even with errors along the way.

* fix macos builds by avoiding double compilation of function_list.cpp for test_spir (#1866)

* modernize CMakeLists for test_spir

* add the operating system release to the sccache key

* include the math brute force function list vs. building it twice

* fix the license header on the spirv-new tests (#1865)

The source files for the spirv-new tests were using the older Khronos
license instead of the proper Apache license.  Fixed the license in
all source files.

* compiler: fix grammar in error message (#1877)

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Updated semaphore tests to use clSemaphoreReImportSyncFdKHR. (#1854)

* Updated semaphore tests to use clSemaphoreReImportSyncFdKHR.

Additionally updated common semaphore code to handle spec updates
that restrict simultaneous importing/exporting of handles.

* Fix build issues on CI

* gcc build issues

* Make clReImportSemaphoreSyncFdKHR a required API
call if cl_khr_external_semaphore_sync_fd is present.

* Implement signal and wait for all semaphore types.

* subgroups: fix for testing too large WG sizes (#1620)

It seemed to be a typo; the comment says that it
tries to fetch local size for a subgroup count with
above max WG size, but it just used the previous
subgroup count.

The test on purpose sets a SG count to be a larger
number than the max work-items in the work group.
Given the minimum SG size is 1 WI, it means that there
can be a maximum of maximum work-group size of SGs (of
1 WI of size). Thus, if we request a number of SGs that
exceeds the local size, the query should fail as expected.

* add SPIR-V version testing (#1861)

* basic SPIR-V 1.3 testing support

* updated script to compile for more SPIR-V versions

* switch to general SPIR-V versions test

* update copyright text and fix license

* improve output while test is running

* check for higher SPIR-V versions first

* fix formatting

* fix the reported platform information for math brute force (#1884)

When the math brute force test printed the platform version it always
printed information for the first platform in the system, which could
be different than the platform for the passed-in device.  Fixed by
querying the platform from the passed-in device instead.

* api tests fix: Use MTdataHolder in test_get_image_info (#1871)

* Minor fixes in mutable dispatch tests. (#1829)

* Minor fixes in mutable dispatch tests.

* Fix size of newWrapper in MutableDispatchSVMArguments.
* Fix erroneous clCommandNDRangeKernelKHR call.

Signed-off-by: John Kesapides <john.kesapides@arm.com>

* * Set the row_pitch for imageInfo in MutableDispatchImage1DArguments
and MutableDispatchImage2DArguments. The row_pitch is
used by get_image_size() to calculate the size of
the host pointers by generate_random_image_data.

Signed-off-by: John Kesapides <john.kesapides@arm.com>

---------

Signed-off-by: John Kesapides <john.kesapides@arm.com>

* add test for cl_khr_spirv_linkonce_odr (#1226)

* initial version of the test with placeholders for linkonce_odr linkage

* add OpExtension SPV_KHR_linkonce_odr extension

* add check for extension

* switch to actual LinkOnceODR linkage

* fix formatting

* add a test case to ensure a function with linkonce_odr is exported

* add back the extension check

* fix formatting

* undo compiler optimization and actually add the call to function a

* [NFC] subgroups: remove unnecessary extern keywords (#1892)

In C and C++ all functions have external linkage by default.

Also remove the unused `gMTdata` and `test_pipe_functions`
declarations.

Fixes https://github.com/KhronosGroup/OpenCL-CTS/issues/1137

Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>

* Added cl_khr_fp16 extension support for test_decorate from spirv_new (#1770)

* Added cl_khr_fp16 extension support for test_decorate from spirv_new, work in progress

* Complemented test_decorate saturation test to support cl_khr_fp16 extension (issue #142)

* Fixed clang format

* scope of modifications:

-changed naming convention of saturation .spvasm files related to
test_decorate of spirv_new
-restored float to char/uchar saturation tests
-few minor corrections

* fix ranges for half testing

* fix formatting

* one more formatting fix

* remove unused function

* use isnan instead of std::isnan

isnan is currently implemented as a macro, not as a function, so
we can't use std::isnan.

* fix Clang warning about inexact conversion

---------

Co-authored-by: Ben Ashbaugh <ben.ashbaugh@intel.com>

* add support for custom devices (#1891)

enable the CTS to run on custom devices

---------

Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Signed-off-by: Guo, Yilong <yilong.guo@intel.com>
Signed-off-by: John Kesapides <john.kesapides@arm.com>
Co-authored-by: Sreelakshmi Haridas Maruthur <sharidas@quicinc.com>
Co-authored-by: Haonan Yang <haonan.yang@intel.com>
Co-authored-by: Ahmed Hesham <117350656+ahesham-arm@users.noreply.github.com>
Co-authored-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Co-authored-by: niranjanjoshi121 <43807392+niranjanjoshi121@users.noreply.github.com>
Co-authored-by: Grzegorz Wawiorko <grzegorz.wawiorko@intel.com>
Co-authored-by: Wenwan Xing <wenwan.xing@intel.com>
Co-authored-by: Yilong Guo <yilong.guo@intel.com>
Co-authored-by: Romaric Jodin <89833130+rjodinchr@users.noreply.github.com>
Co-authored-by: joshqti <127994991+joshqti@users.noreply.github.com>
Co-authored-by: Pekka Jääskeläinen <pekka.jaaskelainen@tuni.fi>
Co-authored-by: imilenkovic00 <155085410+imilenkovic00@users.noreply.github.com>
Co-authored-by: John Kesapides <46718829+JohnKesapidesARM@users.noreply.github.com>
Co-authored-by: Marcin Hajder <marcin.hajder@gmail.com>
Co-authored-by: Aharon Abramson <aharon.abramson@mobileye.com>

2024-03-02 16:48:45 -08:00

File                                 Last updated                Last commit
binary_double.cpp                    2023-02-07 09:01:07 -08:00  math_brute_force: Remove unnecessary gotos (#1605)
binary_float.cpp                     2023-02-07 09:01:07 -08:00  math_brute_force: Remove unnecessary gotos (#1605)
binary_half.cpp                      2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
binary_i_double.cpp                  2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
binary_i_float.cpp                   2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
binary_i_half.cpp                    2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
binary_operator_double.cpp           2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
binary_operator_float.cpp            2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
binary_operator_half.cpp             2024-01-16 09:50:07 -08:00  bruteforce: Remove unnecessary half to float conversion (#1874)
binary_two_results_i_double.cpp      2023-03-20 09:44:25 +00:00  math_brute_force: Drop BuildKernelInfo2 (#1634)
binary_two_results_i_float.cpp       2023-03-20 09:44:25 +00:00  math_brute_force: Drop BuildKernelInfo2 (#1634)
binary_two_results_i_half.cpp        2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
CMakeLists.txt                       2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
common.cpp                           2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
common.h                             2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
function_list.cpp                    2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
function_list.h                      2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
i_unary_double.cpp                   2023-10-03 09:36:01 -07:00  math_brute_force: don't set/restore FTZ mode twice (#1808)
i_unary_float.cpp                    2023-10-03 09:36:01 -07:00  math_brute_force: don't set/restore FTZ mode twice (#1808)
i_unary_half.cpp                     2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
macro_binary_double.cpp              2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
macro_binary_float.cpp               2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
macro_binary_half.cpp                2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
macro_unary_double.cpp               2023-02-07 09:01:07 -08:00  math_brute_force: Remove unnecessary gotos (#1605)
macro_unary_float.cpp                2023-09-06 13:32:19 +01:00  math_brute_force: remove gotos in macro_unary_float (#1725)
macro_unary_half.cpp                 2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
mad_double.cpp                       2023-03-20 09:44:25 +00:00  math_brute_force: Drop BuildKernelInfo2 (#1634)
mad_float.cpp                        2023-03-20 09:44:25 +00:00  math_brute_force: Drop BuildKernelInfo2 (#1634)
mad_half.cpp                         2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
main.cpp                             2024-03-02 16:48:45 -08:00  update fp16 staging branch from main (#1903)
README.txt                           2017-05-16 18:44:33 +05:30  Initial open source release of OpenCL 2.2 CTS.
reference_math.cpp                   2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
reference_math.h                     2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
run_math_brute_force_in_parallel.py  2019-03-05 16:24:50 +00:00  Synchronise with Khronos-private Gitlab branch
sleep.cpp                            2021-02-18 10:06:37 +00:00  Regroup vtbl definitions to one translation unit (#1167)
sleep.h                              2021-02-17 17:05:09 +00:00  Rename files for consistency (#1166)
ternary_double.cpp                   2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
ternary_float.cpp                    2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
ternary_half.cpp                     2024-02-06 09:25:31 -08:00  Fix testing of half-precision fma. (#1882)
test_functions.h                     2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
unary_double.cpp                     2023-02-07 09:01:07 -08:00  math_brute_force: Remove unnecessary gotos (#1605)
unary_float.cpp                      2024-03-02 16:48:45 -08:00  update fp16 staging branch from main (#1903)
unary_half.cpp                       2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
unary_two_results_double.cpp         2023-03-20 09:44:25 +00:00  math_brute_force: Drop BuildKernelInfo2 (#1634)
unary_two_results_float.cpp          2023-08-07 13:51:29 +01:00  math_brute_force: always initialize oldMode (#1796)
unary_two_results_half.cpp           2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
unary_two_results_i_double.cpp       2023-03-20 09:44:25 +00:00  math_brute_force: Drop BuildKernelInfo2 (#1634)
unary_two_results_i_float.cpp        2023-03-20 09:44:25 +00:00  math_brute_force: Drop BuildKernelInfo2 (#1634)
unary_two_results_i_half.cpp         2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
unary_u_double.cpp                   2023-03-20 09:44:25 +00:00  math_brute_force: Drop BuildKernelInfo2 (#1634)
unary_u_float.cpp                    2023-03-20 09:44:25 +00:00  math_brute_force: Drop BuildKernelInfo2 (#1634)
unary_u_half.cpp                     2023-12-18 10:15:31 -08:00  Fp16 math bruteforce staging (#1863)
utility.cpp                          2021-02-18 10:06:37 +00:00  Regroup vtbl definitions to one translation unit (#1167)
utility.h                            2024-02-06 09:25:31 -08:00  Fix testing of half-precision fma. (#1882)

README.txt

Copyright:	(c) 2009-2013 by Apple Inc. All Rights Reserved.

math_brute_force test                                                   Feb 24, 2009
=====================

Usage:

        Please run the executable with --help for usage information.
	


System Requirements:

        This test requires support for correctly rounded single and double precision arithmetic.
The current version also requires a reasonably accurate operating system math library to 
be present. The OpenCL implementation must be able to compile kernels online. The test assumes
that the host system stores its floating point data according to the IEEE-754 binary single and 
double precision floating point formats. 


Test Completion Time:

        This test takes a while. Modern desktop systems can usually finish it in 1-3
days. Engineers doing OpenCL math library software development may find wimpy mode (-w)
a useful screen to quickly look for problems in a new implementation, before committing
to a lengthy test run. Likewise, it is possible to run just a range of tests, or specific
tests. See Usage above.


Test Design:

        This test is designed to do a somewhat exhaustive examination of the single
and double precision math library functions in OpenCL, for all vector lengths. Math 
library functions are compared against results from a higher precision reference 
function to determine correctness. All possible inputs are examined for unary
single precision functions.  Other functions are tested against a table of difficult 
values, followed by a few billion random values. If an error is found in a function,
the test for that function terminates early, reports an error, and moves on to the 
next test, if any.
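
As a rough illustration of what "all possible inputs" means for a unary single
precision function, the sketch below walks the 2^32 binary32 bit patterns and
reinterprets each one as a float. It is not taken from the test sources: the names
float32_from_bits and all_float32_inputs are hypothetical, and the stride parameter
is an assumption added only so that the example finishes quickly.

```python
import math
import struct


def float32_from_bits(bits):
    """Reinterpret a 32-bit unsigned integer as an IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def all_float32_inputs(stride=1):
    """Yield the binary32 value of every `stride`-th bit pattern in [0, 2**32).

    stride=1 visits every representable input, including both zeros,
    subnormals, infinities and NaNs. Larger strides are a hypothetical
    shortcut for quick experiments only.
    """
    for bits in range(0, 1 << 32, stride):
        yield float32_from_bits(bits)


if __name__ == "__main__":
    # Sample every 2**20-th pattern and count how many of the samples are NaN.
    sampled = list(all_float32_inputs(stride=1 << 20))
    nan_count = sum(1 for x in sampled if math.isnan(x))
    print(len(sampled), "patterns sampled,", nan_count, "of them NaN")
```

A full pass with stride=1 is far too slow in pure Python to be practical; the point
is only that the input space of a unary single precision function is small enough
to enumerate completely, which is not true for double precision or for functions
of several arguments.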

The test currently doesn't support the half precision math functions covered in section
9 of the OpenCL 1.0 specification, but does cover the half_func functions covered in
section 6. It also doesn't test the native_<funcname> functions, for which any result
is conformant.

For the OpenCL 1.0 time frame, the reference library shall be the operating system 
math library, as modified by the test itself to conform to the OpenCL specification. 
That will help ensure that all devices on a particular operating system are returning 
similar results.  Going forward to future OpenCL releases, it is planned to gradually 
introduce a reference math library directly into the test, so as to reduce inter-
platform variance between OpenCL implementations. 

Generally speaking, this test will consider a result correct if it is one of the following:

        1) bitwise identical to the output of the reference function, 
                rounded to the appropriate precision

        2) within the allowed ulp error tolerance of the infinitely precise
                result (as estimated by the reference function)

        3) If the reference result is a NaN, then any NaN is deemed correct.

        4) if the device is running in FTZ mode, then the result is also correct
                if the infinitely precise result (as estimated by the reference
                function) is subnormal, and the returned result is a zero
        
        5) if the device is running in FTZ mode, then we also calculate the
                estimate of the infinitely precise result with the reference function 
                with subnormal inputs flushed to +- zero.  If any of those results 
                are within the error tolerance of the returned result, then it is 
                deemed correct

        6) half_func functions may flush per 4&5 above, even if the device is not
                in FTZ mode.

        7) Functions are allowed to prematurely overflow to infinity, so long as 
                the estimated infinitely precise result is within the stated ulp 
                error limit of the maximum finite representable value of appropriate 
                sign

        8) Functions are allowed to prematurely underflow (and if in FTZ mode, 
                have behavior covered by 4&5 above), so long as the estimated
                infinitely precise result is within the stated ulp error limit
                of the minimum normal representable value of appropriate sign

        9) Some functions have limited range. Results of inputs outside that range
                are considered correct, so long as a result is returned.

        10) Some functions have infinite error bounds. Results of these functions
                are considered correct, so long as a result is returned.

        11) The test currently does not discriminate based on the sign of zero.
                We anticipate a later test will.

        12) The test currently does not check to make sure that edge cases called 
                out in the standard (e.g. pow(1.0, any) = 1.0) are exactly correct.
                We anticipate a later test will.

        13) The test doesn't check IEEE flags or exceptions. See section 7.3 of the 
                OpenCL standard.
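
Criteria 1 through 4 above can be sketched for single precision roughly as follows.
This is a simplified illustration, not the test's actual implementation. The helper
names (to_float32, ulp_error_f32, is_result_acceptable) are made up for this example,
a Python float (binary64) stands in for the higher precision reference result, and
the ulp is approximated from the exponent of the reference value rather than
following the specification's definition in every corner case.

```python
import math
import struct

FLT_MANT_BITS = 24                    # binary32 significand bits, implicit bit included
FLT_MIN_EXP = -126                    # exponent of the smallest normal binary32 value
FLT_MIN_NORMAL = 2.0 ** FLT_MIN_EXP


def to_float32(x):
    """Round a binary64 value to the nearest binary32 value."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:             # magnitude too large for binary32
        return math.copysign(math.inf, x)


def ulp_error_f32(test, reference):
    """Approximate signed error of `test` against `reference`, in binary32 ulps."""
    if test == reference:
        return 0.0
    if reference == 0.0:
        exponent = FLT_MIN_EXP
    else:
        # frexp gives reference = m * 2**e with 0.5 <= |m| < 1.
        _, e = math.frexp(reference)
        # Clamp so that subnormal references use the smallest binary32 ulp.
        exponent = max(e - 1, FLT_MIN_EXP)
    ulp = 2.0 ** (exponent - (FLT_MANT_BITS - 1))
    return (test - reference) / ulp


def is_result_acceptable(test, reference, max_ulps, ftz=False):
    """Apply simplified versions of criteria 1 to 4 to one result."""
    # Criterion 3: a NaN reference accepts any NaN, and nothing else.
    if math.isnan(reference):
        return math.isnan(test)
    # Criterion 1: identical to the reference rounded to the result precision.
    # The == comparison treats +0 and -0 as equal, in line with criterion 11.
    if test == to_float32(reference):
        return True
    # Criterion 2: within the allowed ulp tolerance.
    if abs(ulp_error_f32(test, reference)) <= max_ulps:
        return True
    # Criterion 4: in FTZ mode a subnormal reference may be returned as zero.
    if ftz and test == 0.0 and abs(reference) < FLT_MIN_NORMAL:
        return True
    return False


if __name__ == "__main__":
    one_ulp_above = 1.0 + 2.0 ** -23
    print(is_result_acceptable(one_ulp_above, 1.0, max_ulps=1.0))
    print(is_result_acceptable(0.0, 2.0 ** -130, max_ulps=0.0, ftz=True))
```

Criteria 5 through 10 are deliberately left out. They need the reference function
itself (to re-evaluate it with flushed inputs) or per-function range information,
neither of which this sketch models.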



Performance Measurement:

        There is also some optional timing code available, currently turned off by default.
It may be useful for tracking internal performance regressions, but it is not required to
be part of the conformance submission.


If the test is believed to be in error:

The above correctness heuristics shall not be construed to be an alternative to the correctness 
criteria established by the OpenCL standard. An implementation shall be judged correct
or not on appeal based on whether it is within prescribed error bounds of the infinitely 
precise result. (The ulp is defined in section 7.4 of the spec.) If the input value corresponds
to an edge case listed in the OpenCL specification sections covering edge case behavior, or in
similar sections of the C99 TC2 standard (sections F.9 and G.6), then the function shall return
exactly that result, and the sign of a zero result shall be correct. In the event that the test 
is found to be faulty, resulting in a spurious failure result, the committee shall make a reasonable 
attempt to fix the test. If no practical and timely remedy can be found, then the implementation 
shall be granted a waiver.


Guidelines for reference function error tolerances:

        Errors are measured in ulps, and stored in a single precision representation. So as
to avoid introducing error into the error measurement due to error in the reference function
itself, the reference function should attempt to deliver 24 bits more precision than the test 
function return type. (All functions are currently either required to be correctly rounded or 
may have >= 1 ulp of error. This places the 1's bit at the LSB of the result, with 23 bits of 
sub-ulp accuracy. One more bit is required to avoid accrual of extra error due to round-to-
nearest behavior. If we start to require sub-ulp precision, then the accuracy requirements 
for reference functions increase.) Therefore reference functions for single precision should 
have 24+24=48 bits of accuracy, and reference functions for double precision should ideally 
have 53+24 = 77 bits of accuracy. 
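
The bit budget above can be checked with a few lines of arithmetic. The sketch below
only restates the numbers from this section; the names GUARD_BITS and
required_reference_bits are hypothetical, and the host is assumed to use IEEE-754
binary64 for Python floats.

```python
import sys

GUARD_BITS = 24    # extra precision asked of the reference function in this section
SINGLE_BITS = 24   # binary32 significand bits, implicit bit included
DOUBLE_BITS = sys.float_info.mant_dig  # 53 on IEEE-754 hosts


def required_reference_bits(result_bits):
    """Significand bits a reference function should deliver for a given result type."""
    return result_bits + GUARD_BITS


if __name__ == "__main__":
    for name, bits in (("single", SINGLE_BITS), ("double", DOUBLE_BITS)):
        needed = required_reference_bits(bits)
        enough = "enough" if DOUBLE_BITS >= needed else "not enough"
        print(f"{name}: needs {needed} bits, a double reference is {enough}")
```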

A double precision system math library function should be sufficient to safely verify a single 
precision OpenCL math library function.  A long double precision math library function may or 
may not be sufficient to verify a double precision OpenCL math library function, depending on 
the precision of the long double type. A later version of these tests is expected to replace 
long double with a head+tail double double representation that can represent sufficient precision,
on all platforms that support double. 
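
To illustrate the head+tail idea, the sketch below uses Knuth's TwoSum, a standard
error-free transformation, to represent a sum as a double "head" plus a double "tail"
whose exact sum equals the true result. This is only one plausible way to build such
a representation; the function names two_sum, dd_add and dd_exact are hypothetical
and nothing here is taken from the test sources.

```python
from fractions import Fraction


def two_sum(a, b):
    """Return (s, err) with s = round(a + b) and s + err == a + b exactly."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    err = (a - a_virtual) + (b - b_virtual)
    return s, err


def dd_add(x, y):
    """Add two (head, tail) pairs using a simple, non-rigorous double-double sum."""
    head, tail = two_sum(x[0], y[0])
    tail += x[1] + y[1]
    return two_sum(head, tail)   # renormalise so that the head carries the leading bits


def dd_exact(x):
    """Exact rational value of a (head, tail) pair, for checking purposes only."""
    return Fraction(x[0]) + Fraction(x[1])


if __name__ == "__main__":
    # 1 + 2**-80 needs 81 significand bits, more than the 53 a single double holds.
    tiny = 2.0 ** -80
    print("plain double keeps the small term:", 1.0 + tiny != 1.0)
    total = dd_add((1.0, 0.0), (tiny, 0.0))
    print("head+tail keeps the small term:  ", dd_exact(total) == 1 + Fraction(1, 2 ** 80))
```

A head+tail pair of doubles carries roughly 106 significand bits, which is above the
77 bits discussed earlier. By comparison, the x87 extended format behind long double
on many x86 platforms has a 64-bit significand, while IEEE-754 binary128 has 113
bits, which is why long double alone may or may not be sufficient.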


Revision history:

 Feb 24, 2009                IRO        Created README
                                        Added some reference functions so the test will run on Windows.