* Don't recalculate image parameters repeatedly in `test_read_image()`
We've already done this in the loop. There's no need to recalculate
those parameters over and over again in `sample_image_pixel*()` and
`read_image_pixel*()`. This should save some work during the image
streams test.
This only affects the 3D tests for now, but my time profiles indicate
this is where we spend the most time anyway.
* Vectorize read_image_pixel_float() and sample_image_pixel_float() for SSE/AVX
This shortens the image streams test time from 45 minutes without it to
37 minutes. Unfortunately, most of the time is now spent waiting for
memory, particularly in the 3D tests, because the 3D image doesn't
neatly fit in the cache, especially in the linear sampling case, where
pixels from two 2D slices must be sampled. Software prefetching won't
help; it only helps when execution time is dominated by operations, but
this is dominated by memory access. Randomized offsets are likely a
factor, because they throw off the hardware prefetcher.
One possible further optimization is, in the linear sampling case, to
load two sampled pixels at once. This is easy to do using AVX, which
extends SSE with 256-bit vectors.
Obviously, this only applies to x86 CPUs with SSE2. The greatest
performance gains, however, are seen with SSE4.1. Most modern x86 CPus
have SSE4. Work is needed to support other CPUs' vector units--ARM
Advanced SIMD/NEON is probably the most important one. Another
possibility is arranging the code so that the compiler's
autovectorization will kick in and do what I did here manually.
Move functions in .h files to .cpp files where appropriate; align
prototypes and definitions; and remove functions that are not used.
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
So we get finer grain reporting and better parallelisation in the future.
Signed-off-by: Kévin Petit <kpet@free.fr>
Signed-off-by: Kévin Petit <kpet@free.fr>
Also use log_info instead of printf for skipped tests, for consistency.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
Fix a few instances where an incorrect number of arguments was
supplied when calling (v)log_error.
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
* Fix enqueue_flags test to use correct barrier type.
Currently, enqueue_flags test uses CLK_LOCAL_MEM_FENCE.
Use CLK_GLOBAL_MEM_FENCE instead as all threads across work-groups
need to wait here.
* Add check for support for Read-Wrie images
Read-Write images have required OpenCL 2.x.
Read-Write image tests are already being skipped
for 1.x devices.
With OpenCL 3.0, read-write images being optional,
the tests should be run or skipped
depending on the implementation support.
Add a check to decide if Read-Write images are
supported or required to be supported depending
on OpenCL version and decide if the tests should
be run on skipped.
Fixes issue #894
* Fix formatting in case of Read-Write image checks.
Fix formatting in case of Read-write image checks.
Also, combine two ifs into one in case of
kerne_read_write tests
* Fix some more formatting for RW-image checks
Remove unnecessary spaces at various places.
Also, fix lengthy lines.
* Fix malloc-size calculation in test imagedim
unsigned char size is silently assumed to be 1
in imagedim test of test_basic.
Pass sizeof(type) in malloc size calculation.
Also, change loop variable from signed to unsigned.
Add checks for null pointer for malloced memory.
* Cap CL_DEVICE_MAX_MEM_ALLOC_SIZE to SIZE_MAX
Cap CL_DEVICE_MAX_MEM_ALLOC_SIZE to SIZE_MAX
when CL_DEVICE_GLOBAL_MEM_SIZE is capped with SIZE_MAX.
test_allocation caps the value of GLOBAL_MEM_SIZE to SIZE_MAX
if it exceeds the value of SIZE_MAX(value depends on platform bitness),
but doesn’t modify MAX_ALLOC_SIZE the same way.
Due to this MAX_ALLOC_SIZE becomes greater than GLOBAL_MEM_SIZE
and the test fails.
Modify MAX_MEM_ALLOC_SIZE as GLOBAL_MEM_SIZE when it exceeds SIZE_MAX
OpenCL-CTS #1022
* Make InitFloatCoords suitable for all image types
Contributes #616
* Create common functions neutral for image types
Remove 3D specific code from common test_read_image so using
it for other image types is simpler in following patches
Contributes #616
* Removing unused code
Tidying commented out or unnecessary code
Contributes #616
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
* Restoring 'lod' variable name
Contributes #616
* Default cases to handle unsupported image types
Contributes #616
* Resolving build issues
Contributes #616
* Fix formatting
Contributes #616
* Using TEST_FAIL as an error code.
Contributes #616
* Add static keyword, improve error handling
Contributes #616
* Fix build errors with least disruption
Contributes #616
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
Factor out a macro to set module-specific compilation flags for
GNU-like compilers. This simplifies setting compilation flags per
test.
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Remove unused variables throughout the code base and enable the
`-Wunused-variable` warning flag globally to prevent new unused
variable issues being introduced in the future.
This is mostly a non-functional change, with one exception:
- In `test_conformance/api/test_kernel_arg_info.cpp`, an error check
of the clGetDeviceInfo return value was added.
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
The slice pitch/padding calculation assumed that the 'height' variable contained the pixel height of the image, which it doesn't for IMAGE1D_ARRAY.
Fixes#1257
* images: Stop checking gDeviceType != CL_DEVICE_TYPE_GPU
If the device type also advertises CL_DEVICE_TYPE_DEFAULT (which should
be valid), this causes it to be considered a CPU device and the tests
enforce different precision and rounding expectations.
* Fix clang-format
* Drop redundant NORM_OFFSET checks
* Minor fixes for CL_UNORM_SHORT_565, CL_UNORM_SHORT_555
* Fix verification for undefined bit
* Relax current infinitely precision requirement for these formats
and move check in common function.
* Add proper debug output.
Signed-off-by: John Kesapides <john.kesapides@arm.com>
* Minor Formating fix.
Signed-off-by: John Kesapides <john.kesapides@arm.com>
The CL_UNORM_SHORT_555 and CL_UNORM_INT_101010 formats contain padding
bits which need to be ignored in clCopyImage and clFillImage testing.
For clFillImage tests, padding was not ignored for the CL_UNORM_SHORT_555
format, and was ignored for CL_UNORM_INT_101010 by modifying actual and
reference data. For clCopyImage tests, padding was not ignored, both for
CL_UNORM_SHORT_555 and for CL_UNORM_INT_101010.
Fix this by adding a new compare_scanlines() function, which is used for
both of these formats, and does not modify the actual or reference data.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
* Fix possible size_t overflow in 32-bit builds.
Use cl_ulong temporary values for row/slice_pitch.
Signed-off-by: John Kesapides <john.kesapides@arm.com>
* Remove redundant casts
Signed-off-by: John Kesapides <john.kesapides@arm.com>
clCopyImage and clFillImage contain near-duplicate code for logging of
pixel difference errors. Move this into imageHelpers.
Signed-off-by: Stuart Brady <stuart.brady@arm.com>
* Move common global variable and functions to header
InitFloatCoords for 3D read images has also been renamed
so it can later be used as a common function
Contributes #616
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
* Set-up for using 3D functions as a base
test_read_image_3D had been moved to common.cpp (and renamed
test_read_image) along with corresponding
determine_validation_error_offset and InitFloatCoords.
Only function names and the formatting have been changed.
Contributes #616
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
* Minor fixes for clCopyImage.
Signed-off-by: John Kesapides <john.kesapides@arm.com>
Change-Id: I63c47570e45580e5e29716a46929cb1127711c6b
* Convert comment to CTS style.
Return error code instead of -1 in clFinish.
* Use std::vector for format lists in images suite
Avoids memory deallocation issues and generally simplifies the code.
* Fixup formatting with git-clang-format
The test was trying to create read_write image with read_only formats.
Make it use common image formats for both read_only and read_write
flags when creating images.
Fixes issue #329
Signed-off-by: Radek Szymanski <radek.szymanski@arm.com>
Signed-off-by: James Morrissey <james.morrissey@arm.com>
Co-authored-by: Radek Szymanski <radek.szymanski@arm.com>
Test was querying for supported images with CL_MEM_WRITE_ONLY flag, but
always used CL_MEM_READ_ONLY to create images.
Fixes issue #328
Signed-off-by: Radek Szymanski <radek.szymanski@arm.com>
Signed-off-by: James Morrissey <james.morrissey@arm.com>
Co-authored-by: Radek Szymanski <radek.szymanski@arm.com>
* Tests requiring image support use runTestHarnessWithCheck
Removing special case for images in runTestHarness.
Fixes#710
* Remove imageSupportRequired argument
Tests which require image support now specify this while
calling runTestHarnessWithCheck.
Fixes#710
Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
Pass the right flags to the tests that create images to ensure valid
usage.
Fixes issue #330
Signed-off-by: James Morrissey <james.morrissey@arm.com>
Co-authored-by: Alessandro Navone <alessandro.navone@arm.com>
These declarations either aren't used or aren't needed, as testBase.h
already declares them.
Some definitions got moved to test_common.h, as these are duplicated
across few files. There's further opportunity to improve code reuse
via test_common.h, but that's for future patch.
Signed-off-by: Radek Szymanski <radek.szymanski@arm.com>
* Remove unused code in clCopyImage
Some of the declarations that are still used now moved to testBase.h
as these are reused throughout the image tests.
Signed-off-by: Radek Szymanski <radek.szymanski@arm.com>
* Add missing file
Signed-off-by: Radek Szymanski <radek.szymanski@arm.com>
Fix validate_float/half_write_results so that when nan/inf is
encountered on a channel, the rest of the channel values are still
considered for correctness.
Signed-off-by: John Kesapides <john.kesapides@arm.com>
Signed-off-by: James Morrissey <james.morrissey@arm.com>
Co-authored-by: John Kesapides <john.kesapides@arm.com>
A previously created Image was not being released leading to a
memory leak.
Change-Id: I9715b5c4193a8e89c6c182b443959009ce1f1129
Signed-off-by: Chetankumar Mistry <chetan.mistry@arm.com>
* Fix enqueue_flags test to use correct barrier type.
Currently, enqueue_flags test uses CLK_LOCAL_MEM_FENCE.
Use CLK_GLOBAL_MEM_FENCE instead as all threads across work-groups
need to wait here.
* Add check for support for Read-Wrie images
Read-Write images have required OpenCL 2.x.
Read-Write image tests are already being skipped
for 1.x devices.
With OpenCL 3.0, read-write images being optional,
the tests should be run or skipped
depending on the implementation support.
Add a check to decide if Read-Write images are
supported or required to be supported depending
on OpenCL version and decide if the tests should
be run on skipped.
Fixes issue #894
* Fix formatting in case of Read-Write image checks.
Fix formatting in case of Read-write image checks.
Also, combine two ifs into one in case of
kerne_read_write tests
* Fix some more formatting for RW-image checks
Remove unnecessary spaces at various places.
Also, fix lengthy lines.
* skip test cases rather than fail without cl_khr_3d_image_writes
cl_khr_3d_image_writes is required for OpenCL 2.x devices, but is not
required for OpenCL 1.x or OpenCL 3.0 devices. A check for the presence
of the extension on OpenCL 2.x devices already exists in
test_min_max_device_version, so we don't need any failure conditions
here, and can just skip tests if the extension is not supported.
* clang-format changes
* Enable -Werror for GCC/Clang builds
Fixes many of the errors this produces, and disables a handful that
didn't have solutions that were obvious (to me).
* Check for `-W*` flags empirically
* Remove cl_APPLE_fp64_basic_ops support
* Undo NAN conversion fix
* Add comments to warning override flags
* Remove unneeded STRINGIFY definition
* Fix tautological compare issue in basic
* Use ABS_ERROR macro in image tests
* Use fabs for ABS_ERROR macro
* Move ABS_ERROR definition to common header
This function is duplicated across clCopyImages files, so just keep one
definition in common file, and get rid of the duplicated ones.
Signed-off-by: Radek Szymanski <radek.szymanski@arm.com>