OpenCL-CTS/test_conformance at b73c3149ad7930048d00a0c9d2fab046e41d227c

ahmed/OpenCL-CTS

mirror of https://github.com/KhronosGroup/OpenCL-CTS.git synced 2026-03-19 06:09:01 +00:00

Chip Davis b73c3149ad Image streams optimization (#1616)

* Don't recalculate image parameters repeatedly in `test_read_image()`

We've already done this in the loop. There's no need to recalculate
those parameters over and over again in `sample_image_pixel*()` and
`read_image_pixel*()`. This should save some work during the image
streams test.

This only affects the 3D tests for now, but my time profiles indicate
this is where we spend the most time anyway.

* Vectorize read_image_pixel_float() and sample_image_pixel_float() for SSE/AVX

This shortens the image streams test time from 45 minutes to 37
minutes. Unfortunately, most of the time is now spent waiting for
memory, particularly in the 3D tests, because the 3D image doesn't
neatly fit in the cache, especially in the linear sampling case, where
pixels from two 2D slices must be sampled. Software prefetching won't
help; it only helps when execution time is dominated by computation, but
this is dominated by memory access. Randomized offsets are likely a
factor, because they throw off the hardware prefetcher.

One possible further optimization is, in the linear sampling case, to
load two sampled pixels at once. This is easy to do using AVX, which
extends SSE with 256-bit vectors.

Obviously, this only applies to x86 CPUs with SSE2. The greatest
performance gains, however, are seen with SSE4.1. Most modern x86 CPUs
have SSE4. Work is needed to support other CPUs' vector units; ARM
Advanced SIMD/NEON is probably the most important one. Another
possibility is arranging the code so that the compiler's
autovectorization will kick in and do what I did here manually.
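
As a rough illustration of the kind of vectorization described above, here
is a minimal sketch, not the code from this commit. The helper names
`unorm8_rgba_to_float` and `lerp_float4` and the macro `SKETCH_USE_SSE2`
are hypothetical and do not exist in the CTS. The sketch assumes an RGBA
pixel with 8-bit unsigned normalized channels, converts all four channels
to float in one SSE2 pass, and blends two pixels the way a linear filter
would. It falls back to scalar code when SSE2 is not available.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64) \
    || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h> // SSE2 intrinsics
#define SKETCH_USE_SSE2 1
#endif

// Hypothetical helper (not a CTS function): convert one RGBA pixel with
// 8-bit unsigned normalized channels to four floats in [0, 1].
inline void unorm8_rgba_to_float(const uint8_t *src, float out[4])
{
    assert(src != nullptr && out != nullptr);
#ifdef SKETCH_USE_SSE2
    int32_t packed;
    std::memcpy(&packed, src, sizeof(packed)); // alignment-safe 4-byte load
    const __m128i zero = _mm_setzero_si128();
    __m128i v = _mm_cvtsi32_si128(packed);
    v = _mm_unpacklo_epi8(v, zero); // widen 8-bit lanes to 16 bits
    v = _mm_unpacklo_epi16(v, zero); // widen 16-bit lanes to 32 bits
    // With SSE4.1, _mm_cvtepu8_epi32 performs this widening in one step.
    const __m128 f = _mm_div_ps(_mm_cvtepi32_ps(v), _mm_set1_ps(255.0f));
    _mm_storeu_ps(out, f);
#else
    for (int i = 0; i < 4; ++i) out[i] = src[i] / 255.0f; // scalar fallback
#endif
}

// Hypothetical helper: blend two float4 pixels, a + t * (b - a) per
// channel, as a linear filter does between two neighbouring samples.
inline void lerp_float4(const float a[4], const float b[4], float t,
                        float out[4])
{
    assert(a != nullptr && b != nullptr && out != nullptr);
#ifdef SKETCH_USE_SSE2
    const __m128 va = _mm_loadu_ps(a);
    const __m128 vb = _mm_loadu_ps(b);
    const __m128 vt = _mm_set1_ps(t);
    _mm_storeu_ps(out, _mm_add_ps(va, _mm_mul_ps(vt, _mm_sub_ps(vb, va))));
#else
    for (int i = 0; i < 4; ++i) out[i] = a[i] + t * (b[i] - a[i]);
#endif
}
```

Under the same assumptions, an AVX variant could keep two such pixels in
one 256-bit register, which is the "load two sampled pixels at once" idea
mentioned above. The scalar fallback is also the kind of simple loop a
compiler's autovectorizer may be able to handle on its own.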

2023-02-07 08:46:15 -08:00

Allocations fixes (#1245)

2021-05-18 18:12:55 +01:00

Fix unused-function warnings and enable -Wunused-function (#1576)

2022-12-13 09:47:48 -08:00

Get rid of threadTesting.h (#1604)

2023-01-14 15:18:27 +00:00

Avoid use of rand in test_rw_image_access_qualifier (#1322)

2023-01-31 09:42:45 -08:00

[NFCI] Remove unused variables and enable -Wunused-variable (#1483)

2022-09-08 12:54:36 +01:00

c11 atomic fence: relaxed requirements for an auxiliary atomic_store (#1603)

2023-01-31 09:47:47 -08:00

[NFC] commonfns: Remove unused values arrays (#1595)

2022-12-14 07:34:30 -08:00

Partial clean up of test_compiler_defines_for_extensions (#1577)

2022-12-13 09:48:35 -08:00

[NFCI] Remove unused variables and enable -Wunused-variable (#1483)

2022-09-08 12:54:36 +01:00

Remove __DATE__ and __TIME__ usage (#1506)

2022-09-23 17:29:18 +01:00

Conversions (#1555)

2023-01-24 08:53:18 -08:00

Using helper functions for clCreateKernel (#1064)

2021-01-07 11:34:42 +00:00

Using helper functions for clCreateKernel (#1064)

2021-01-07 11:34:42 +00:00

device_execution

remove min max macros (#1310)

2021-09-13 13:25:32 +01:00

device_partition

Remove imageSupportRequired parameter to runTestHarness (#1077)

2020-12-09 16:12:40 +00:00

Fix code format errors vs.3

2020-07-23 17:21:07 +01:00

events: Remove unused BufferAction::Setup parameter (#1586)

2022-12-13 09:53:11 -08:00

Command buffer queue substitution (#1584)

2023-01-24 09:44:16 -08:00

generic_address_space

Fix generic address space OpenCL 2.0 assumption (#1575)

2022-11-23 14:08:30 +00:00

Get rid of threadTesting.h (#1604)

2023-01-14 15:18:27 +00:00

[NFC] clang-format gl (#1612)

2023-02-06 15:09:04 +00:00

Get rid of threadTesting.h (#1604)

2023-01-14 15:18:27 +00:00

Half (#1554)

2023-01-24 08:48:53 -08:00

Image streams optimization (#1616)

2023-02-07 08:46:15 -08:00

Get rid of threadTesting.h (#1604)

2023-01-14 15:18:27 +00:00

math_brute_force

Enqueue fill buffer (#1561)

2023-01-24 08:51:00 -08:00

Fix -Woverloaded-virtual warnings (#1599)

2023-01-27 16:34:22 +00:00

multiple_device_context

Fix misleading indentation and enable -Wmisleading-indentation (#1458)

2022-08-02 18:16:03 +01:00

non_uniform_work_group

Change Behviour of non-uniform-work-group tests for OpenCL-3.0 (#877)

2020-09-01 08:16:17 -07:00

pipes: Fix readwrite verification function for fp64 (#1522)

2022-10-11 09:36:33 -07:00

[NFC] cmake: Remove redundant CMAKE_CXX_STANDARD (#1558)

2022-11-04 08:53:42 -07:00

Fix misleading indentation and enable -Wmisleading-indentation (#1458)

2022-08-02 18:16:03 +01:00

relationals: Use stringstream in print_hex_mem_dump (#1597)

2023-01-20 15:11:31 +00:00

select: Use MTdataHolder (#1609)

2023-01-31 09:50:21 -08:00

Fix -Woverloaded-virtual warnings (#1599)

2023-01-27 16:34:22 +00:00

Get rid of threadTesting.h (#1604)

2023-01-14 15:18:27 +00:00

Get rid of threadTesting.h (#1604)

2023-01-14 15:18:27 +00:00

[NFC] cmake: Remove redundant CMAKE_CXX_STANDARD (#1558)

2022-11-04 08:53:42 -07:00

thread_dimensions

Remove imageSupportRequired parameter to runTestHarness (#1077)

2020-12-09 16:12:40 +00:00

Get rid of threadTesting.h (#1604)

2023-01-14 15:18:27 +00:00

Fix vulkan test build issue on Intel compiler (#1464)

2022-11-01 13:05:37 -07:00

[NFC] workgroups: Remove unused array (#1572)

2022-11-23 14:04:31 +00:00

CMakeCommon.txt

CMake Build: Tidy up when -msse2 is passed to gcc (#622)

2020-02-25 08:56:54 +00:00

CMakeLists.txt

Make building the Vulkan interop tests optional (#1530)

2022-10-27 10:16:16 +01:00

generate_spirv_offline.py

Reimplement invocation of offline compilation program

2019-08-12 10:18:06 +01:00

opencl_conformance_tests_conversions.csv

Synchronise with Khronos-private Gitlab branch

2019-03-05 16:23:49 +00:00

opencl_conformance_tests_d3d.csv

Synchronise with Khronos-private Gitlab branch

2019-03-05 16:23:49 +00:00

opencl_conformance_tests_full_binary.csv

Add a binary compile mode CSV (#987)

2020-10-13 09:24:22 +01:00

opencl_conformance_tests_full_no_math_or_conversions.csv

Replace use of -ILPath with --spirv-binaries-path in CSV (#981)

2020-09-25 14:25:26 +01:00

opencl_conformance_tests_full_spirv.csv

Replace use of -ILPath with --spirv-binaries-path in CSV (#981)

2020-09-25 14:25:26 +01:00

opencl_conformance_tests_full.csv

Update CTS csv files. (#971)

2020-09-24 20:36:52 +01:00

opencl_conformance_tests_math.csv

Update CTS csv files. (#971)

2020-09-24 20:36:52 +01:00

opencl_conformance_tests_quick.csv

Update CTS csv files. (#971)

2020-09-24 20:36:52 +01:00

run_conformance.py

Add Python 3 support to run_conformance.py (#1470)

2022-10-03 14:26:43 +01:00

submission_details_template.txt

Sync submission_details with conformance doc v26 (#1389)

2022-02-22 16:49:35 +00:00