fix correctly rounded behavior for math bruteforce tests (#2397)

fixes #2387 

Corrects the "correctly rounded" behavior for the math bruteforce tests.
Specifically:

* Only applies the `-cl-fp32-correctly-rounded-divide-sqrt` build option
for the `divide_cr` and `sqrt_cr` tests. The other tests do not receive
this build option. This means that there is a difference in the behavior
of the `divide` and `divide_cr` tests and the `sqrt` and `sqrt_cr`
tests, and the "correctly rounded" build option is not applied to the
fp16 or fp64 tests.
* Removes the build option to toggle testing the correctly rounded
divide and square root tests since it is no longer needed. Instead, the
test names can be used to choose whether to test the correctly rounded
functions or the non-correctly rounded functions.
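
As a rough illustration of the per-test option selection described above, here is a minimal, self-contained sketch. `BuildInfoSketch` and `BuildOptionsSketch` are hypothetical stand-ins for the harness's `BuildKernelInfo` and `GetBuildOptions`; only the two flags touched by this change are modeled, and the other build options are omitted.

```cpp
#include <sstream>
#include <string>

// Hypothetical, reduced stand-in for the harness's BuildKernelInfo.
// Only the two fields read by the option logic are modeled here.
struct BuildInfoSketch
{
    bool relaxedMode = false;
    bool correctlyRounded = false;
};

// Emits the correctly rounded flag only when the test asks for it,
// instead of keying off the device's reported capabilities.
std::string BuildOptionsSketch(const BuildInfoSketch &info)
{
    std::ostringstream options;
    if (info.relaxedMode)
    {
        options << " -cl-fast-relaxed-math";
    }
    if (info.correctlyRounded)
    {
        options << " -cl-fp32-correctly-rounded-divide-sqrt";
    }
    return options.str();
}
```

Under this sketch, a `divide_cr`-style test would set `correctlyRounded = true` and receive the flag, while a plain `divide`-style test would leave it `false` and build without it.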

Additionally:

* Relaxes the fp16 sqrt accuracy requirements to 1 ULP. This is needed
to pass this test on some of our devices. This part is still under
discussion, so I will keep this PR as a draft until it is settled.
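
For readers less familiar with ULP bounds, the following is a simplified sketch of how an error in fp16 ULPs can be measured against a higher-precision reference. `HalfUlp` and `HalfUlpError` are hypothetical helpers, not the functions the test suite uses, and the sketch assumes a finite, nonzero reference value.

```cpp
#include <cmath>

// Size of one ULP at `reference` for IEEE 754 binary16, which stores
// 10 fraction bits. Assumes a finite, nonzero reference. Exponents
// below the smallest normal exponent (-14) are clamped, so subnormal
// values share the spacing of the smallest normal binade.
double HalfUlp(double reference)
{
    int exponent = std::ilogb(reference); // floor(log2(|reference|))
    if (exponent < -14) exponent = -14;
    return std::ldexp(1.0, exponent - 10);
}

// Signed error of `result` relative to `reference`, in fp16 ULPs.
double HalfUlpError(double result, double reference)
{
    return (result - reference) / HalfUlp(reference);
}
```

For example, with a reference of `std::sqrt(2.0)`, the two neighboring fp16 values 1448/1024 and 1449/1024 have errors of roughly -0.15 and 0.85 ULP, so both fall within a 1 ULP bound.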
This commit is contained in:
Ben Ashbaugh
2025-07-15 09:01:19 -07:00
committed by GitHub
parent 933874f070
commit 8d4a870059
6 changed files with 21 additions and 26 deletions

@@ -102,7 +102,7 @@ void EmitEnableExtension(std::ostringstream &kernel,
     if (needsFp16) kernel << "#pragma OPENCL EXTENSION cl_khr_fp16 : enable\n";
 }
-std::string GetBuildOptions(bool relaxed_mode)
+std::string GetBuildOptions(const BuildKernelInfo &info)
 {
     std::ostringstream options;
@@ -111,16 +111,16 @@ std::string GetBuildOptions(bool relaxed_mode)
         options << " -cl-denorms-are-zero";
     }
-    if (gFloatCapabilities & CL_FP_CORRECTLY_ROUNDED_DIVIDE_SQRT)
-    {
-        options << " -cl-fp32-correctly-rounded-divide-sqrt";
-    }
-    if (relaxed_mode)
+    if (info.relaxedMode)
     {
         options << " -cl-fast-relaxed-math";
     }
+    if (info.correctlyRounded)
+    {
+        options << " -cl-fp32-correctly-rounded-divide-sqrt";
+    }
     return options.str();
 }
@@ -581,7 +581,7 @@ cl_int BuildKernels(BuildKernelInfo &info, cl_uint job_id,
     // Create the program.
     clProgramWrapper &program = info.programs[vector_size_index];
-    auto options = GetBuildOptions(info.relaxedMode);
+    auto options = GetBuildOptions(info);
     int error =
         create_single_kernel_helper(gContext, &program, nullptr, sources.size(),
                                     sources.data(), nullptr, options.c_str());