11 Commits

Author SHA1 Message Date
gorazd-sumkovski-arm
e678277c93 Add testing of CL_UNORM_INT_2_101010_EXT (#2112)
All existing tests in `test_image_streams` that are capable of testing
image formats using the `CL_UNORM_INT_2_101010_EXT` data type now do so.

Signed-off-by: Gorazd Sumkovski <gorazd.sumkovski@arm.com>
2024-10-22 09:54:46 -07:00
gorazd-sumkovski-arm
bcfd8f82cd Extend testing of CL_UNORM_INT_101010_2 (#2031)
All existing tests in `test_image_streams` that are capable of testing
image formats using the `CL_UNORM_INT_101010_2` data type now do so.
2024-09-16 08:45:16 -07:00
ellnor01
13848621f1 Unduplicate kernel_read_write image tests (read) (#1552)
The kernel_read_write tests have a lot of duplicate code. These are the
next steps to reducing the duplication, by using the functions in
test_common.cpp as common for 1D, 1D array and 2D array.

---------

Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
Co-authored-by: Ahmed Hesham <117350656+ahesham-arm@users.noreply.github.com>
2024-08-06 09:28:39 -07:00
Ben Ashbaugh
e71a7bce68 Revert "Image streams optimization (#1616)" (#1638)
This reverts commit b73c3149ad.
2023-02-28 09:06:34 -08:00
Chip Davis
b73c3149ad Image streams optimization (#1616)
* Don't recalculate image parameters repeatedly in `test_read_image()`

We've already done this in the loop. There's no need to recalculate
those parameters over and over again in `sample_image_pixel*()` and
`read_image_pixel*()`. This should save some work during the image
streams test.

This only affects the 3D tests for now, but my time profiles indicate
this is where we spend the most time anyway.

* Vectorize read_image_pixel_float() and sample_image_pixel_float() for SSE/AVX

This shortens the image streams test time from 45 minutes without it to
37 minutes. Unfortunately, most of the time is now spent waiting for
memory, particularly in the 3D tests, because the 3D image doesn't
neatly fit in the cache, especially in the linear sampling case, where
pixels from two 2D slices must be sampled. Software prefetching won't
help; it only helps when execution time is dominated by operations, but
this is dominated by memory access. Randomized offsets are likely a
factor, because they throw off the hardware prefetcher.

One possible further optimization is, in the linear sampling case, to
load two sampled pixels at once. This is easy to do using AVX, which
extends SSE with 256-bit vectors.

Obviously, this only applies to x86 CPUs with SSE2. The greatest
performance gains, however, are seen with SSE4.1. Most modern x86 CPUs
have SSE4. Work is needed to support other CPUs' vector units--ARM
Advanced SIMD/NEON is probably the most important one. Another
possibility is arranging the code so that the compiler's
autovectorization will kick in and do what I did here manually.
2023-02-07 08:46:15 -08:00
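
As a rough illustration of the kind of manual vectorization described above, here is a minimal sketch that converts one RGBA pixel with 8-bit UNORM channels to four floats, using SSE2 when it is available and a scalar loop otherwise. The function name `unorm8_pixel_to_float` is hypothetical and the code is not taken from the commit; it only shows the general shape such a conversion might take.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper (not from the commit): convert one RGBA pixel with
// four 8-bit UNORM channels to four floats in [0, 1].
std::array<float, 4> unorm8_pixel_to_float(const std::uint8_t *pixel)
{
    std::array<float, 4> out{};
#if defined(__SSE2__)
    // Load the four channel bytes into the low 32 bits of a vector.
    std::int32_t packed;
    std::memcpy(&packed, pixel, sizeof(packed));
    const __m128i zero = _mm_setzero_si128();
    __m128i bytes = _mm_cvtsi32_si128(packed);
    // Zero-extend 8-bit -> 16-bit -> 32-bit lanes.
    __m128i words = _mm_unpacklo_epi8(bytes, zero);
    __m128i dwords = _mm_unpacklo_epi16(words, zero);
    // Convert all four channels to float and normalize in one step each.
    __m128 values = _mm_cvtepi32_ps(dwords);
    values = _mm_div_ps(values, _mm_set1_ps(255.0f));
    _mm_storeu_ps(out.data(), values);
#else
    // Scalar fallback for CPUs without SSE2.
    for (int c = 0; c < 4; ++c) out[c] = pixel[c] / 255.0f;
#endif
    return out;
}
```

Dividing rather than multiplying by a reciprocal keeps the vector path bit-identical to the scalar fallback, which matters when the result feeds a reference comparison.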
ellnor01
c014122742 Creating common functions for image/kernel_read_write read tests (#1141)
* Make InitFloatCoords suitable for all image types

Contributes #616

* Create common functions neutral for image types

Remove 3D specific code from common test_read_image so using
it for other image types is simpler in following patches

Contributes #616

* Removing unused code

Tidying commented out or unnecessary code

Contributes #616

Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>

* Restoring 'lod' variable name

Contributes #616

* Default cases to handle unsupported image types

Contributes #616

* Resolving build issues

Contributes #616

* Fix formatting

Contributes #616

* Using TEST_FAIL as an error code.

Contributes #616

* Add static keyword, improve error handling

Contributes #616

* Fix build errors with least disruption

Contributes #616

Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
2022-09-26 12:57:42 +01:00
Jason Ekstrand
6e6249fb48 images: Stop checking gDeviceType != CL_DEVICE_TYPE_GPU (#1418)
* images: Stop checking gDeviceType != CL_DEVICE_TYPE_GPU

If the device type also advertises CL_DEVICE_TYPE_DEFAULT (which should
be valid), this causes it to be considered a CPU device and the tests
enforce different precision and rounding expectations.

* Fix clang-format

* Drop redundant NORM_OFFSET checks
2022-05-17 08:51:53 -07:00
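
The misclassification described above follows from the device type being a bitfield. The sketch below uses local stand-in constants that mirror the `CL_DEVICE_TYPE_*` bit values instead of including the OpenCL headers; the helper names are hypothetical. The commit itself removes the checks, so this only illustrates why an exact comparison goes wrong, not what the tests do now.

```cpp
#include <cstdint>

// Local stand-ins for cl_device_type and the CL_DEVICE_TYPE_* bits
// (assumed here so the sketch does not need the OpenCL headers).
using device_type = std::uint64_t;
constexpr device_type kTypeDefault = 1u << 0; // CL_DEVICE_TYPE_DEFAULT
constexpr device_type kTypeCpu = 1u << 1;     // CL_DEVICE_TYPE_CPU
constexpr device_type kTypeGpu = 1u << 2;     // CL_DEVICE_TYPE_GPU

// Exact comparison: false for a GPU that also reports the DEFAULT bit.
constexpr bool is_gpu_by_equality(device_type t) { return t == kTypeGpu; }

// Mask test: true whenever the GPU bit is set, whatever else is set.
constexpr bool is_gpu_by_mask(device_type t) { return (t & kTypeGpu) != 0; }
```

For a device reporting `kTypeGpu | kTypeDefault`, `is_gpu_by_equality` returns false while `is_gpu_by_mask` returns true.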
Ben Ashbaugh
02bf24d2b1 remove min max macros (#1310)
* remove the MIN and MAX macros and use the std versions instead

* fix formatting

* fix Arm build

* remove additional MIN and MAX macros from compat.h
2021-09-13 13:25:32 +01:00
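
The commit does not state its motivation. One commonly cited hazard of function-like `MIN`/`MAX` macros is that an argument can be evaluated twice. The sketch below shows that difference with a hypothetical macro `SKETCH_MIN` and a hypothetical `Counter` type; neither comes from the test suite.

```cpp
#include <algorithm>

// Hypothetical macro in the classic style; not the suite's definition.
#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

// Hypothetical type whose next() has a visible side effect.
struct Counter
{
    int calls = 0;
    int next() { return ++calls; }
};

// The macro expands c.next() twice: once in the comparison and once
// more in the branch that is selected.
int min_with_macro(Counter &c, int limit)
{
    return SKETCH_MIN(c.next(), limit);
}

// std::min is a function, so each argument is evaluated exactly once.
int min_with_std(Counter &c, int limit)
{
    return std::min(c.next(), limit);
}
```

With a fresh `Counter` and a limit of 10, the macro version advances the counter twice and returns 2, while the `std::min` version advances it once and returns 1.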
John Kesapides
80a4a833be Minor fixes for CL_UNORM_SHORT_565, CL_UNORM_SHORT_555 (#1129)
* Minor fixes for CL_UNORM_SHORT_565, CL_UNORM_SHORT_555

* Fix verification for undefined bit
* Relax the current infinite-precision requirement for these formats
  and move the check into a common function.
* Add proper debug output.

Signed-off-by: John Kesapides <john.kesapides@arm.com>

* Minor formatting fix.

Signed-off-by: John Kesapides <john.kesapides@arm.com>
2021-06-11 09:44:16 +01:00
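
To make the "undefined bit" point concrete: a `CL_UNORM_SHORT_555` pixel uses an x-5-5-5 layout, so bit 15 carries no defined value and should not take part in a comparison. The sketch below is one way such a check might look; `kDefinedBits555` and `pixels_match_555` are hypothetical names, and this is not the verification code from the commit.

```cpp
#include <cstdint>

// Bits 0..14 hold the three 5-bit channels; bit 15 is undefined.
constexpr std::uint16_t kDefinedBits555 = 0x7FFF;

// Hypothetical helper: compare two 555 pixels while ignoring bit 15.
constexpr bool pixels_match_555(std::uint16_t expected, std::uint16_t actual)
{
    return (expected & kDefinedBits555) == (actual & kDefinedBits555);
}
```

Two pixels that differ only in bit 15, such as `0x1234` and `0x9234`, compare equal under this check, while any difference in the lower 15 bits is still reported.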
ellnor01
ca673af488 First steps in tidying image/kernel_read_write tests (#1121)
* Move common global variable and functions to header

InitFloatCoords for 3D read images has also been renamed
so it can later be used as a common function

Contributes #616

Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>

* Set-up for using 3D functions as a base

test_read_image_3D has been moved to common.cpp (and renamed
test_read_image) along with corresponding
determine_validation_error_offset and InitFloatCoords.

Only function names and the formatting have been changed.

Contributes #616

Signed-off-by: Ellen Norris-Thompson <ellen.norris-thompson@arm.com>
2021-02-02 08:56:00 +00:00
Kévin Petit
ec3959bfd1 Make it possible to run kernel_read_write on a 1.x implementation (#615)
Signed-off-by: Kévin Petit <kpet@free.fr>
2020-02-24 10:29:30 +00:00