Copyright: (c) 2009-2013 by Apple Inc. All Rights Reserved.

math_brute_force test                                                  Feb 24, 2009
=====================

Usage:

Please run the executable with --help for usage information.


System Requirements:

This test requires support for correctly rounded single and double precision arithmetic.
The current version also requires a reasonably accurate operating system math library to
be present. The OpenCL implementation must be able to compile kernels online. The test assumes
that the host system stores its floating point data according to the IEEE-754 binary single and
double precision floating point formats.


Test Completion Time:

This test takes a while. Modern desktop systems can usually finish it in 1-3
days. Engineers doing OpenCL math library software development may find wimpy mode (-w)
a useful screen to quickly look for problems in a new implementation, before committing
to a lengthy test run. Likewise, it is possible to run just a range of tests, or specific
tests. See Usage above.


Test Design:

This test is designed to do a somewhat exhaustive examination of the single
and double precision math library functions in OpenCL, for all vector lengths. Math
library functions are compared against results from a higher precision reference
function to determine correctness. All possible inputs are examined for unary
single precision functions. Other functions are tested against a table of difficult
values, followed by a few billion random values. If an error is found in a function,
the test for that function terminates early, reports an error, and moves on to the
next test, if any.
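
The exhaustive sweep over unary single precision inputs can be pictured as a walk over all
2^32 bit patterns. The Python sketch below is only an illustration and is not taken from the
test source: the names float32_from_bits and unary_inputs, and the step parameter used to
thin the sweep, are hypothetical.

```python
import struct


def float32_from_bits(bits):
    """Reinterpret a 32-bit unsigned integer as an IEEE-754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def unary_inputs(step=1):
    """Yield every step-th binary32 value, walking the 2**32 bit patterns in order."""
    for bits in range(0, 1 << 32, step):
        yield float32_from_bits(bits)


# A very coarse pass that touches only four of the 2**32 patterns.
for x in unary_inputs(step=1 << 30):
    print(x)
```

With step=1 the generator visits every pattern, including both zeros, the subnormals, the
infinities and the NaNs. A larger step gives a quick, coarse pass over the same range.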

The test currently doesn't support the half precision math functions covered in section
9 of the OpenCL 1.0 specification, but does cover the half_func functions described in
section 6. It also doesn't test the native_<funcname> functions, for which any result
is conformant.

For the OpenCL 1.0 time frame, the reference library shall be the operating system
math library, as modified by the test itself to conform to the OpenCL specification.
That will help ensure that all devices on a particular operating system are returning
similar results. Going forward to future OpenCL releases, it is planned to gradually
introduce a reference math library directly into the test, so as to reduce
inter-platform variance between OpenCL implementations.

Generally speaking, this test will consider a result correct if it is one of the following:

1) bitwise identical to the output of the reference function,
   rounded to the appropriate precision

2) within the allowed ulp error tolerance of the infinitely precise
   result (as estimated by the reference function)

3) If the reference result is a NaN, then any NaN is deemed correct.

4) if the device is running in FTZ mode, then the result is also correct
   if the infinitely precise result (as estimated by the reference
   function) is subnormal, and the returned result is a zero

5) if the device is running in FTZ mode, then we also calculate the
   estimate of the infinitely precise result with the reference function
   with subnormal inputs flushed to +- zero. If any of those results
   are within the error tolerance of the returned result, then it is
   deemed correct

6) half_func functions may flush per 4 and 5 above, even if the device is not
   in FTZ mode.

7) Functions are allowed to prematurely overflow to infinity, so long as
   the estimated infinitely precise result is within the stated ulp
   error limit of the maximum finite representable value of appropriate
   sign

8) Functions are allowed to prematurely underflow (and if in FTZ mode,
   have behavior covered by 4 and 5 above), so long as the estimated
   infinitely precise result is within the stated ulp error limit
   of the minimum normal representable value of appropriate sign

9) Some functions have limited range. Results of inputs outside that range
   are considered correct, so long as a result is returned.

10) Some functions have infinite error bounds. Results of these functions
    are considered correct, so long as a result is returned.

11) The test currently does not discriminate based on the sign of zero.
    We anticipate a later test will.

12) The test currently does not check to make sure that edge cases called
    out in the standard (e.g. pow(1.0, any) = 1.0) are exactly correct.
    We anticipate a later test will.

13) The test doesn't check IEEE flags or exceptions. See section 7.3 of the
    OpenCL standard.
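
Rules 1 through 4 above can be sketched for a single precision result as follows. This is a
simplified Python illustration, not the test's own code. The names to_float32, ulp32 and
is_acceptable are hypothetical, the reference is assumed to be a double precision estimate of
the infinitely precise result, and ulp32 is a simplified spacing based on the exponent of the
reference rather than the exact definition in the OpenCL specification. Rule 5 is left out
because it needs the reference function itself.

```python
import math
import struct

FLT_MIN = 2.0 ** -126   # smallest normal binary32 magnitude
MANT_BITS = 24          # binary32 precision, hidden bit included


def to_float32(x):
    """Round a binary64 Python float to the nearest binary32 value."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:   # finite, but too large for binary32
        return math.copysign(math.inf, x)


def ulp32(reference):
    """Spacing between binary32 values at the magnitude of the reference (simplified)."""
    mag = max(abs(reference), FLT_MIN)   # subnormals share the spacing at FLT_MIN
    exponent = math.frexp(mag)[1] - 1    # mag == m * 2**exponent with 1 <= m < 2
    return math.ldexp(1.0, exponent - (MANT_BITS - 1))


def is_acceptable(result, reference, max_ulps, ftz=False):
    """Apply rules 1-4 to one binary32 result."""
    if math.isnan(reference):             # rule 3: any NaN matches a NaN reference
        return math.isnan(result)
    if math.isnan(result):
        return False
    if result == to_float32(reference):   # rule 1 (== ignores the sign of zero, see 11)
        return True
    if abs(result - reference) <= max_ulps * ulp32(reference):   # rule 2
        return True
    # rule 4: an FTZ device may return zero when the reference is subnormal
    return ftz and abs(reference) < FLT_MIN and result == 0.0
```

For example, is_acceptable(0.0, 1e-40, 0.5) is rejected, while the same call with ftz=True is
accepted, because 1e-40 is subnormal in single precision.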


Performance Measurement:

There is also some optional timing code available, currently turned off by default.
It may be useful for tracking internal performance regressions, but it is not required to
be part of the conformance submission.


If the test is believed to be in error:

The above correctness heuristics shall not be construed to be an alternative to the correctness
criteria established by the OpenCL standard. An implementation shall be judged correct
or not on appeal based on whether it is within prescribed error bounds of the infinitely
precise result. (The ulp is defined in section 7.4 of the spec.) If the input value corresponds
to an edge case listed in OpenCL specification sections covering edge case behavior, or
similar sections in the C99 TC2 standard (sections F.9 and G.6), then the function shall return
exactly that result, and the sign of a zero result shall be correct. In the event that the test
is found to be faulty, resulting in a spurious failure result, the committee shall make a reasonable
attempt to fix the test. If no practical and timely remedy can be found, then the implementation
shall be granted a waiver.


Guidelines for reference function error tolerances:

Errors are measured in ulps, and stored in a single precision representation. So as
to avoid introducing error into the error measurement due to error in the reference function
itself, the reference function should attempt to deliver 24 bits more precision than the test
function return type. (All functions are currently either required to be correctly rounded or
may have >= 1 ulp of error. This places the 1's bit at the LSB of the result, with 23 bits of
sub-ulp accuracy. One more bit is required to avoid accrual of extra error due to
round-to-nearest behavior. If we start to require sub-ulp precision, then the accuracy requirements
for reference functions increase.) Therefore reference functions for single precision should
have 24+24 = 48 bits of accuracy, and reference functions for double precision should ideally
have 53+24 = 77 bits of accuracy.
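
The bit budget above can be written out as a small calculation. The Python sketch below is
illustrative only. The function names are hypothetical, and the 64-bit and 113-bit significand
widths stand for two common long double formats (x87 extended and IEEE-754 binary128), which
are assumptions about the platform and are not stated in this README.

```python
SUB_ULP_BITS = 23   # fractional ulp bits kept when the error is stored in single precision
GUARD_BITS = 1      # one extra bit so round-to-nearest adds no error of its own


def reference_bits_needed(result_bits):
    """Significand bits a reference function should deliver for a given result type."""
    return result_bits + SUB_ULP_BITS + GUARD_BITS


def is_sufficient(reference_bits, result_bits):
    """True if a reference type with reference_bits can verify a result_bits type."""
    return reference_bits >= reference_bits_needed(result_bits)


SINGLE, DOUBLE = 24, 53             # IEEE-754 binary32 / binary64 significand bits
X87_EXTENDED, BINARY128 = 64, 113   # assumed long double formats

print(reference_bits_needed(SINGLE))        # 48
print(reference_bits_needed(DOUBLE))        # 77
print(is_sufficient(DOUBLE, SINGLE))        # True
print(is_sufficient(X87_EXTENDED, DOUBLE))  # False
print(is_sufficient(BINARY128, DOUBLE))     # True
```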

A double precision system math library function should be sufficient to safely verify a single
precision OpenCL math library function. A long double precision math library function may or
may not be sufficient to verify a double precision OpenCL math library function, depending on
the precision of the long double type. A later version of these tests is expected to replace
long double with a head+tail double-double representation that can represent sufficient precision
on all platforms that support double.
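
To illustrate what a head+tail pair buys, the sketch below uses the classic two-sum step, a
standard building block of double-double arithmetic. It is an assumption about how such a
representation could work, not the test's implementation, and the name two_sum is chosen
here for illustration.

```python
from fractions import Fraction


def two_sum(a, b):
    """Error-free addition: head is the rounded sum, tail is the rounding error."""
    head = a + b
    b_seen = head - a        # the part of b that made it into head
    a_seen = head - b_seen   # the part of a that made it into head
    tail = (a - a_seen) + (b - b_seen)
    return head, tail


head, tail = two_sum(1.0, 2.0 ** -60)
# A single double cannot hold 1 + 2**-60, but the (head, tail) pair holds it exactly.
assert head == 1.0 and tail == 2.0 ** -60
assert Fraction(head) + Fraction(tail) == Fraction(1) + Fraction(2) ** -60
```

Because the tail carries the bits that the head had to round away, the pair can hold roughly
twice the precision of one double. That is comfortably more than the 77 bits asked for above.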


Revision history:

Feb 24, 2009    IRO    Created README
                       Added some reference functions so the test will run on Windows.