LunarG has changed the SDK file hosted for macOS v1.3.275.0 from a .zip
file to a .dmg file, with the old hyperlink transparently redirecting to
the new one.
The script expects a .zip archive, so it fails when it downloads a .dmg
file.
Bump the Vulkan SDK version to 1.4.309.0, which is the latest version
and is provided as a .zip archive.
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
The test uses CL_DEVICE_MAX_WORK_GROUP_SIZE as the first dimension of
the local work size. However, this value can be larger than the first
dimension of CL_DEVICE_MAX_WORK_ITEM_SIZES, which results in a failure.
This patch corrects the test to query and use the first dimension of
CL_DEVICE_MAX_WORK_ITEM_SIZES instead.
Signed-off-by: Xing Huang <xing.huang@arm.com>
Fix warnings such as:
test_vloadstore.cpp:330:49: error: format string is not a string literal
(potentially insecure)
There were no security issues here as the format string arguments do not
contain any conversion specifiers and are never written to. Make that
latter fact explicit to avoid the warnings.
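A minimal sketch of the idea, using hypothetical names that are not taken
from the test: declaring the format string as a `const` pointer to `const`
data with a literal initializer makes it explicit that the string is never
written to. Clang's format checker can generally treat such a variable like
a string literal; whether other compilers do the same is an assumption here.

```cpp
#include <cstdio>
#include <string>

// Hypothetical message with no conversion specifiers. Both the pointer and
// the pointed-to characters are const, so the string can never change.
static const char *const kBanner = "Testing vload/vstore\n";

// Formats the banner into a local buffer and returns it as a std::string.
std::string format_banner()
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), kBanner);
    return std::string(buf);
}
```

An alternative that silences the warning regardless of constness is to pass
the text as an argument, e.g. `std::snprintf(buf, sizeof(buf), "%s", text)`.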
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Reflects the changes to the specification:
https://github.com/KhronosGroup/OpenCL-Docs/pull/1318
Relaxed embedded exp and exp2 ulps will be 4 + floor(fabs(2 * x)).
Reciprocal and divide are unchanged because the code already handles the
embedded profile case, see unary_float.c and binary_operator_float.c.
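The new bound for exp and exp2 can be written as a small helper. This is
only a sketch of the formula quoted above; the function name is
hypothetical and does not correspond to any function in the test suite.

```cpp
#include <cmath>

// Allowed error, in ulps, for relaxed exp/exp2 on the embedded profile:
// 4 + floor(fabs(2 * x)).
float relaxed_embedded_exp_ulps(float x)
{
    return 4.0f + std::floor(std::fabs(2.0f * x));
}
```

For example, `x = 1.0f` gives 6 ulps and `x = -2.75f` gives 9 ulps, so the
tolerance grows with the magnitude of the input.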
---------
Co-authored-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
A recent improvement to the SPIR-V validator added checks to ensure the
**Float16** capability is declared when directly operating on fp16
values, which identified issues in one of our SPIR-V test files. This PR
fixes the SPIR-V files to add the missing capability.
Adds a basic test for the SPIR-V 1.6 UniformDecoration decorations.
Specifically:
* Tests both the Uniform and UniformId decorations.
* Tests the decorations on constants, function parameters, and
variables.
Both `REGISTER_TEST` and `REQUIRE_EXTENSION` expect a `cl_device_id`
variable, but the variable name differs between them, which makes the
two macros unusable together.
This change renames `deviceID` in `REQUIRE_EXTENSION` to `device` to be
consistent with `REGISTER_TEST`.
Signed-off-by: Michael Rizkalla <michael.rizkalla@arm.com>
The `extinst_printf_operands_scalar_int64` test could fail on 32-bit
platforms with `CL_INVALID_ARG_SIZE`, because the helper function was
not guaranteed to be instantiated using a 64-bit integer template type.
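The failure mode can be illustrated with a hypothetical helper (not the
one in the test) whose template type decides how many bytes are passed as
the kernel argument. When the type is deduced from a host `long`, the size
depends on the platform; naming a fixed-width 64-bit type removes that
dependency.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reports the byte size that would be handed to
// clSetKernelArg for an argument of type T.
template <typename T> std::size_t kernel_arg_size(T)
{
    return sizeof(T);
}

// T is deduced as the host `long`, which is only 4 bytes on many 32-bit
// platforms, while an OpenCL C `long` kernel argument is always 8 bytes.
std::size_t deduced_size() { return kernel_arg_size(4L); }

// T is named explicitly, so the argument is 8 bytes on every platform.
std::size_t explicit_size() { return kernel_arg_size<std::int64_t>(4L); }
```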
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Treat reciprocal as a unary function, instead of handling it through the
binary function testing mechanism and special-casing it there.
This addresses two shortcomings of the previous implementation:
- Testing took significantly longer as the entire input domain was
tested many times (e.g. fp16 reciprocal has only 2^16 possible input
values, but binary function testing iterates over 2^16 * 2^16 input
values).
- The reciprocal test kernel was identical to the divide kernel. Thus
the device compiler would see a regular divide operation instead of a
reciprocal operation and would be unlikely to emit a specialized
reciprocal sequence.
This reverts all of the changes in binary_operator*.cpp made by
bcfa1f7c2 ("Added corrections to re-enable reciprocal test in
math_brute_force suite for relaxed math mode (#2221)", 2025-02-04).
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
`semaphores_ooo_ops_cross_queue` uses two OOO command queues to run the
test. In one queue, a semaphore is signalled, and the semaphore is waited
on in the other queue.
The CL specification requires the application to synchronize the queues
if objects are shared.
Signed-off-by: Michael Rizkalla <michael.rizkalla@arm.com>
The project fails to build on systems with a kernel version older than
5.6.0 because of `-Wunused-function` combined with `-Werror`.
Expand the conditional compilation guard to include the offending code.
This adds support for allocating DMA buffers on systems that support it,
i.e. Linux and Android.
On mainline Linux, starting with version 5.6 (equivalent to Android 12),
there is a new kernel module framework available called [DMA-BUF
Heaps](https://github.com/torvalds/linux/blob/master/drivers/dma-buf/dma-heap.c).
The goal of this framework is to provide a standardised way for user
applications to allocate and share memory buffers between different
devices, subsystems, etc. The main feature of interest is that the
framework provides device-agnostic allocation; it abstracts away the
underlying hardware, and provides a single IOCTL,
`DMA_HEAP_IOCTL_ALLOC`. The mainline implementation provides two heaps
that act as character devices that can allocate DMA buffers: system,
which uses the buddy allocator, and cma, which uses the
[CMA](https://developer.toradex.com/software/linux-resources/linux-features/contiguous-memory-allocator-cma-linux/)
(Contiguous Memory Allocator). Both of these are [kernel configuration
options](https://github.com/torvalds/linux/blob/master/drivers/dma-buf/heaps/Kconfig)
that need to be enabled when building the Linux kernel. Generally, any
kernel module implementing this framework is made available under
`/dev/dma_heap/<heap_name>`, e.g. `/dev/dma_heap/system`.
The implementation currently supports only one type of DMA heap,
`system`, the default device path for which is `/dev/dma_heap/system`.
The path can be overridden at runtime using an environment variable,
`OCL_CTS_DMA_HEAP_PATH_SYSTEM`, if needed. Extending this in the future
should be trivial (subject to platform support), by adding an entry to
the enum `dma_buf_heap_type`, and an appropriate default path and
overriding environment variable name.
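The path lookup described above could be sketched as follows. The
function name is hypothetical, and treating an empty variable the same as
an unset one is an assumption of this sketch, not something stated by the
change.

```cpp
#include <cstdlib>
#include <string>

// Returns the device path for the `system` heap, preferring the
// OCL_CTS_DMA_HEAP_PATH_SYSTEM environment variable when it is set.
std::string system_heap_path()
{
    const char *override_path = std::getenv("OCL_CTS_DMA_HEAP_PATH_SYSTEM");
    // Assumption: an empty value is ignored and the default is used.
    if (override_path != nullptr && override_path[0] != '\0')
    {
        return std::string(override_path);
    }
    return std::string("/dev/dma_heap/system");
}
```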
The proposed implementation is compiled only when the conditions are met
(i.e. building for Linux or Android, using kernel headers >= 5.6.0).
Otherwise, it emits a compile-time warning and returns `-1` as the DMA
handle at runtime.
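As a rough sketch of what such an allocation looks like with the kernel
uAPI: the function name and the `SKETCH_HAVE_DMA_HEAP` macro are
hypothetical, and the `__has_include` check is a stand-in for the kernel
header version check mentioned above. It also omits the compile-time
warning.

```cpp
#include <cstddef>

// Stand-in guard (assumption): enable the real path only when building
// for Linux and the DMA-BUF heaps uAPI header is available.
#if defined(__linux__) && defined(__has_include)
#if __has_include(<linux/dma-heap.h>)
#define SKETCH_HAVE_DMA_HEAP 1
#endif
#endif

#ifdef SKETCH_HAVE_DMA_HEAP
#include <fcntl.h>
#include <linux/dma-heap.h>
#include <sys/ioctl.h>
#include <unistd.h>

// Allocates `size` bytes from the heap at `heap_path` and returns the
// DMA buffer file descriptor, or -1 on failure.
int allocate_dma_buf(const char *heap_path, std::size_t size)
{
    const int heap_fd = open(heap_path, O_RDONLY | O_CLOEXEC);
    if (heap_fd < 0) return -1;

    dma_heap_allocation_data request = {};
    request.len = size;
    request.fd_flags = O_RDWR | O_CLOEXEC;

    const int rc = ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &request);
    close(heap_fd); // The returned buffer fd outlives the heap fd.
    return rc < 0 ? -1 : static_cast<int>(request.fd);
}
#else
// Unsupported platform: always report failure.
int allocate_dma_buf(const char *, std::size_t) { return -1; }
#endif
```

The caller owns the returned descriptor and is expected to `close` it once
the CL buffer created from it is no longer needed.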
To demonstrate the functionality, a new test is added for the
`cl_khr_external_memory_dma_buf` extension. If the extension is
supported by the device, a DMA buffer will be allocated and used to
create a CL buffer, which is then used by a simple kernel.
This should provide a way forward for adding more tests that depend on
DMA buffers.
---------
Signed-off-by: Gorazd Sumkovski <gorazd.sumkovski@arm.com>
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Co-authored-by: Gorazd Sumkovski <gorazd.sumkovski@arm.com>
This PR fixes the validation logic for cases where the data type is not
half. Previously, because the variable `nan_test` was always false, types
such as float never triggered a validation failure.
1. Remove duplicate `create_image` code that is in both clFillImage and
clCopyImage test directories.
2. Unify how the pitch buffer's memory is deallocated. The buffer can be
allocated with either `malloc` or `align_malloc`, and the matching free
function is pre-set in `pitch_buffer_data`'s member variable `free_fn`
and used when the buffer is deallocated. With this, the change removes
the `is_aligned` conditional variable that was used to select the
appropriate free function.
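The pattern in item 2 can be sketched like this. All names below are
hypothetical, and both free functions simply wrap `std::free` as
stand-ins so the example stays self-contained; the counters exist only to
make the dispatch observable.

```cpp
#include <cstddef>
#include <cstdlib>

static int plain_free_calls = 0;
static int aligned_free_calls = 0;

// Stand-ins for the two deallocation routines.
static void plain_free(void *p) { ++plain_free_calls; std::free(p); }
static void aligned_free(void *p) { ++aligned_free_calls; std::free(p); }

// The buffer remembers how it must be freed, so no `is_aligned` flag is
// needed at release time.
struct buffer_data
{
    void *buf;
    void (*free_fn)(void *);
};

buffer_data allocate(std::size_t size, bool aligned)
{
    return buffer_data{ std::malloc(size),
                        aligned ? &aligned_free : &plain_free };
}

void release(buffer_data &data)
{
    if (data.buf != nullptr)
    {
        data.free_fn(data.buf);
        data.buf = nullptr;
    }
}
```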
Signed-off-by: Michael Rizkalla <michael.rizkalla@arm.com>
- fix clGetDeviceInfo(CL_DEVICE_MAX_WORK_ITEM_SIZES) by using the proper
size
- clamp localThreads[2] in the same way as localThreads[0] and
localThreads[1]
- clamp all localThreads elements with regard to
CL_DEVICE_MAX_WORK_GROUP_SIZE
- fix the size used to create/read the output buffer
Fix #2238
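One way to apply both limits is sketched below. The function name and the
halving policy are assumptions of this sketch; the test may shrink the
local size differently. The inputs correspond to the values returned for
CL_DEVICE_MAX_WORK_ITEM_SIZES and CL_DEVICE_MAX_WORK_GROUP_SIZE.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Clamps each dimension to its per-dimension limit, then shrinks the
// largest dimension until the total work-group size fits.
std::array<std::size_t, 3>
clamp_local_size(std::array<std::size_t, 3> local,
                 const std::array<std::size_t, 3> &max_item_sizes,
                 std::size_t max_work_group_size)
{
    for (std::size_t i = 0; i < 3; ++i)
    {
        local[i] = std::min(local[i], max_item_sizes[i]);
    }

    // A work-group always holds at least one work-item.
    const std::size_t budget = std::max<std::size_t>(1, max_work_group_size);
    while (local[0] * local[1] * local[2] > budget)
    {
        auto largest = std::max_element(local.begin(), local.end());
        *largest = std::max<std::size_t>(1, *largest / 2);
    }
    return local;
}
```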
On MSVC platforms, printf with %a or %A uses a default precision of 13.
This is in contrast to the OpenCL C specification for printf, which only
allows the exact number of digits required when the precision is not
specified.
Add installation rules for all the binary targets.
Targets are installed under `<CMAKE_INSTALL_PREFIX>/bin/<CONFIG>` where
`<CONFIG>` is `CMAKE_BUILD_TYPE` for single-config generators, e.g. Unix
Makefiles and Ninja, or the build configuration for multi-config
generators, e.g. Ninja Multi-Config and Visual Studio.
This creates the target `install` on Unix and `INSTALL` on Windows.
Print the build log when building the program in `get_program_with_il`
fails, to make it easier to investigate spirv_new test failures.
Factor out a global helper function `OutputBuildLog` for printing the
build log for a single device.
Signed-off-by: Sven van Haastregt <sven.vanhaastregt@arm.com>
Fixes several compile issues I am seeing with my version of Visual Studio
related to an ambiguous call to `fpclassify`, which is called by `isnan`
and other similar functions, specifically for the `cl_half` type:
```
19>C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt\corecrt_math.h(401,1): error C2668: 'fpclassify': ambiguous call to overloaded function
19>C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt\corecrt_math.h(298,31): message : could be 'int fpclassify(long double) throw()'
19>C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt\corecrt_math.h(293,31): message : or 'int fpclassify(double) throw()'
19>C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt\corecrt_math.h(288,31): message : or 'int fpclassify(float) throw()'
19>C:\Program Files (x86)\Windows Kits\10\Include\10.0.19041.0\ucrt\corecrt_math.h(401,1): message : while trying to match the argument list '(_Ty)'
```
Some of these issues seem like differences in compiler behavior, but at
least one appears to have identified a legitimate bug.
Specifically, this change:
* Removes the special-case checks for finite half numbers for commonfns,
since this is already handled by `UlpFn`. (test with: `test_commonfns
degrees radians`)
* Assigns to temporary variables to eliminate the ambiguous function
call for relationals. (test with: `test_relationals relational*`)
* Properly converts from half to float when checking for NaNs for
select. This is the one that seems like a legitimate bug. (test with:
`test_select select_half_ushort select_half_short`)
* Uses `std::enable_if` to disambiguate a function call for spirv_new.
(test with: `test_spirv_new decorate_saturated*`)
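To illustrate why the select case is a real bug rather than only a
compiler difference: `cl_half` stores the raw 16-bit pattern, so checking
it for NaN without converting it first inspects an ordinary integer value.
The sketch below uses a small hand-written converter as a stand-in so that
it is self-contained; the function names are hypothetical, and real code
would more likely use `cl_half_to_float` from the OpenCL headers.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Stand-in converter from an IEEE 754 binary16 bit pattern to float.
float half_bits_to_float(std::uint16_t h)
{
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    const std::uint32_t exponent = (h >> 10) & 0x1Fu;
    const std::uint32_t mantissa = h & 0x3FFu;

    if (exponent == 0)
    {
        // Zero or subnormal: value is mantissa * 2^-24.
        const float magnitude = std::ldexp(static_cast<float>(mantissa), -24);
        return sign ? -magnitude : magnitude;
    }

    std::uint32_t bits;
    if (exponent == 0x1Fu)
    {
        bits = sign | 0x7F800000u | (mantissa << 13); // Inf or NaN.
    }
    else
    {
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13); // Rebias.
    }
    float result;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}

// Checks for NaN on the converted value, not on the raw bit pattern.
bool half_is_nan(std::uint16_t h)
{
    return std::isnan(half_bits_to_float(h));
}
```

By contrast, `std::isnan(static_cast<float>(h))` on the raw pattern
`0x7E00` is false, because the integer 32256 is a perfectly finite float.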
If it's helpful, my specific Visual Studio version is:
```
Microsoft Visual Studio Professional 2019
Version 16.11.20
VisualStudio.16.Release/16.11.20+32929.386
```
I also have the Windows Software Development Kit 10.0.19041.685
installed.