Chip Davis b7c9f6a656 Image streams optimization (#1616)
* Don't recalculate image parameters repeatedly in `test_read_image()`

We've already done this in the loop. There's no need to recalculate
those parameters over and over again in `sample_image_pixel*()` and
`read_image_pixel*()`. This should save some work during the image
streams test.

This only affects the 3D tests for now, but my time profiles indicate
this is where we spend the most time anyway.

* Vectorize read_image_pixel_float() and sample_image_pixel_float() for SSE/AVX

This shortens the image streams test time from 45 minutes without it to
37 minutes. Unfortunately, most of the time is now spent waiting for
memory, particularly in the 3D tests, because the 3D image doesn't
neatly fit in the cache, especially in the linear sampling case, where
pixels from two 2D slices must be sampled. Software prefetching won't
help; it only helps when execution time is dominated by operations, but
this is dominated by memory access. Randomized offsets are likely a
factor, because they throw off the hardware prefetcher.

One possible further optimization is, in the linear sampling case, to
load two sampled pixels at once. This is easy to do using AVX, which
extends SSE with 256-bit vectors.

Obviously, this only applies to x86 CPUs with SSE2. The greatest
performance gains, however, are seen with SSE4.1. Most modern x86 CPUs
have SSE4. Work is needed to support other CPUs' vector units--ARM
Advanced SIMD/NEON is probably the most important one. Another
possibility is arranging the code so that the compiler's
autovectorization will kick in and do what I did here manually.

OpenCL Conformance Test Suite (CTS)

This is the OpenCL CTS for all versions of the Khronos OpenCL standard.

Building the CTS

The CTS supports Linux, Windows, macOS, and Android platforms. In particular, GitHub Actions CI builds against Ubuntu 20.04, Windows-latest, and macos-latest.

Compiling the CTS requires the following CMake configuration options to be set:

  • CL_INCLUDE_DIR Points to the unified OpenCL-Headers.
  • CL_LIB_DIR Directory containing the OpenCL library to build against.
  • OPENCL_LIBRARIES Name of the OpenCL library to link.

It is advised to use the OpenCL ICD-Loader as the OpenCL library to build against. In that case, CL_LIB_DIR points to a build of the ICD loader and OPENCL_LIBRARIES is "OpenCL".

Example Build

The following steps, on a Linux platform, clone the dependencies from GitHub, configure a build, and compile.

git clone https://github.com/KhronosGroup/OpenCL-CTS.git
git clone https://github.com/KhronosGroup/OpenCL-Headers.git
git clone https://github.com/KhronosGroup/OpenCL-ICD-Loader.git

mkdir OpenCL-ICD-Loader/build
cmake -S OpenCL-ICD-Loader -B OpenCL-ICD-Loader/build \
      -DOPENCL_ICD_LOADER_HEADERS_DIR=$PWD/OpenCL-Headers
cmake --build ./OpenCL-ICD-Loader/build --config Release

mkdir OpenCL-CTS/build
cmake -S OpenCL-CTS -B OpenCL-CTS/build \
      -DCL_INCLUDE_DIR=$PWD/OpenCL-Headers \
      -DCL_LIB_DIR=$PWD/OpenCL-ICD-Loader/build \
      -DOPENCL_LIBRARIES=OpenCL
cmake --build OpenCL-CTS/build --config Release

Running the CTS

A build of the CTS contains multiple executables representing the directories in the test_conformance folder. Each of these executables contains sub-tests, and possibly smaller granularities of testing within the sub-tests.

See the --help output on each executable for the list of sub-tests available, as well as other options for configuring execution.
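As a rough sketch of how the available executables might be discovered, the helper below searches a build tree for executable files whose names start with test_. The function name list_cts_tests and the assumption that every test executable is named test_* (like test_basic) are illustrative, not something the CTS guarantees.

```shell
# list_cts_tests DIR: print every executable regular file named test_*
# found under DIR, one path per line, sorted.
# Assumption: CTS test executables follow the test_* naming pattern.
list_cts_tests() {
    find "$1" -type f -name 'test_*' -perm -u=x | sort
}

# Example use (the build path is hypothetical):
#   for t in $(list_cts_tests OpenCL-CTS/build); do "$t" --help; done
```

The loop in the comment assumes the paths contain no whitespace.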

If the OpenCL library built against is the ICD Loader, and the vendor library to be tested is not registered in the default ICD Loader location, then the OCL_ICD_FILENAMES environment variable must be set so that the ICD Loader can detect the OpenCL library to use at runtime. For example, to run the basic tests on a Linux platform:

OCL_ICD_FILENAMES=/path/to/vendor_lib.so ./test_basic
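When several executables need the same vendor library, the variable assignment can be wrapped in a small helper. This is only a sketch: run_with_icd is a hypothetical name, and the paths in the usage comment are placeholders.

```shell
# run_with_icd LIB CMD [ARGS...]: run CMD with OCL_ICD_FILENAMES set to LIB
# for that one command only, leaving the calling shell's environment as is.
run_with_icd() {
    icd_lib=$1
    shift
    OCL_ICD_FILENAMES="$icd_lib" "$@"
}

# Example use (placeholder paths):
#   run_with_icd /path/to/vendor_lib.so ./test_basic
```

Because the variable is set only for the invoked command, different vendor libraries can be tested back to back from the same shell.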

Offline Compilation

OpenCL drivers which do not have a runtime compiler can be tested using additional command line arguments, provided by the test harness, for tests which require compilation. These are:

  • --compilation-mode Selects whether OpenCL-C source code should be compiled using an external tool before being passed on to the OpenCL driver in that form for testing. online is the default mode; the option also accepts the values spir-v and binary.

  • --compilation-cache-mode Controls how the compiled OpenCL-C source code should be cached on disk.

  • --compilation-cache-path Accepts a path to a directory where the compiled binary cache should be stored on disk.

  • --compilation-program Accepts a path to an executable (default: cl_offline_compiler) invoked by the test harness to perform offline compilation of OpenCL-C source code. This executable must match the interface description.
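The options above might be combined as in the following sketch, which prints an invocation instead of running it so it can be tried without a CTS build. The helper offline_cmd and all three paths are hypothetical placeholders, and passing each value as a separate argument after its option is an assumption; check the executable's --help output for the exact syntax.

```shell
# Hypothetical placeholder paths; adjust for your own setup.
TEST_BIN=./test_basic
CACHE_DIR=/tmp/cts_cache
COMPILER=/path/to/cl_offline_compiler

# offline_cmd MODE: print (do not run) a harness invocation for MODE,
# where MODE is one of: online, spir-v, binary.
offline_cmd() {
    printf '%s --compilation-mode %s --compilation-cache-path %s --compilation-program %s\n' \
        "$TEST_BIN" "$1" "$CACHE_DIR" "$COMPILER"
}

offline_cmd spir-v
```

The sketch leaves out --compilation-cache-mode because the text above does not list its accepted values.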

Generating a Conformance Report

The Khronos Conformance Process Document details the steps required for a conformance submission. In this repository, opencl_conformance_tests_full.csv defines the full list of tests which must be run for conformance. The output log of these tests must be included alongside a filled-in submission details template.

The utility script run_conformance.py can be used to help generate the submission log, although it is not required.

Git tags are used to define the version of the repository conformance submissions are made against.

Contributing

Contributions to the project are welcome from Khronos members and non-members alike via GitHub Pull Requests (PR). Alternatively, if you've found a bug or have a question, please file an issue in the GitHub project. First-time contributors will be required to sign the Khronos Contributor License Agreement (CLA) before their PR can be merged.

PRs to the repository are required to be clang-format clean to pass CI. Developers can either use the git-clang-format tool locally to verify this before contributing, or update their PR based on the diff provided by a failing CI job.
