Synchronise with Khronos-private GitLab branch

The maintenance of the conformance tests is moving to GitHub.

This commit contains all the changes made in GitLab since the
first public release of the conformance tests.

Signed-off-by: Kevin Petit <kevin.petit@arm.com>
Kevin Petit
2019-02-20 16:36:05 +00:00
committed by Kévin Petit
parent 95196e7fb4
commit d8733efc0f
576 changed files with 212486 additions and 191776 deletions

@@ -1,95 +1,95 @@
2008-09-04 - created by David Black-Schaffer
2008-09-31 - updated for reorganization
==============================================================
*** Where to put tests:
==============================================================
test_apps - complete applications used for testing
test_common - frameworks used across multiple tests
test_conformance - conformance tests
test_development - tests used for development or being developed
test_internal - tests for private functionality
test_performance - performance tests
Tests placed in other locations will be moved without warning.
==============================================================
*** How to set up tests:
==============================================================
To create a new test to run through OATS, you need to:
1) write the test
2) use ATF to report errors, info, and performance numbers
3) make a Makefile that correctly builds with ATF for OATS and builds fat
4) add the test to the local Makefile (e.g., test_conformance/Makefile)
5) add the test to OATS
6) add the test to the appropriate test suite on OATS
7) add the test to the run_tests_local.py script so it can be run locally
8) if you want the tests distributed, add them to the zip_tests_for_drops.py script
---------------------------------------------------------
Use ATF (OATS's Automated Test Framework)
---------------------------------------------------------
ATF is the only way to report errors to OATS. If you don't use it, OATS will have no way of knowing whether a test failed or passed. You must use ATF for all output and you should not use any printfs.
1) Make sure your Makefile for the test builds correctly with ATF. You need to include the ATF framework whenever the BUILD_WITH_ATF environment variable is set. This can be done as:
    ifdef BUILD_WITH_ATF
    ATF = -framework ATF
    USE_ATF = -DUSE_ATF
    endif
    ...
    CFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%)
2) Make sure you use ATF for logging. This means no printf() output. All errors should be output with log_error and info with log_info. This can be done with:
    #if USE_ATF
    #include <ATF/ATF.h>
    #define test_start() ATFTestStart()
    #define log_perf(_number, _higherBetter, _numType, _format, ...) ATFLogPerformanceNumber(_number, _higherBetter, _numType, _format,##__VA_ARGS__)
    #define log_info ATFLogInfo
    #define log_error ATFLogError
    #define test_finish() ATFTestFinish()
    #else
    #define test_start()
    #define log_perf(_number, _higherBetter, _numType, _format, ...) printf("Performance Number " _format " (in %s, %s): %g\n",##__VA_ARGS__, _numType, _higherBetter?"higher is better":"lower is better" , _number)
    #define log_info printf
    #define log_error printf
    #define test_finish()
    #endif
3) All performance information should be output with log_perf(). You need to specify the value, whether bigger is better, the units, and a name.
4) You need to call test_start() and test_finish() exactly once each in each test. That is, if you have a test that may bail on a failure condition, you need to be sure to call test_finish() at each of those points.
---------------------------------------------------------
Building 32- and 64-bit
---------------------------------------------------------
1) Make sure your Makefile passes RC_CFLAGS into the compiler. For example:
    CFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%)
2) If you are using C++ code with g++ you also need to set:
    CXXFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%)
and you may need to pass in RC_CFLAGS to the linker:
    $(TARGET): $(OBJECTS)
        $(CC) $(RC_CFLAGS) $(OBJECTS) -o $@ $(LIBPATH) $(LIBRARIES)
3) Verify that this works by building fat (make RC_CFLAGS="-arch i386 -arch x86_64") and then running file on the output binary. You should see:
    blackschaffer:test_basic dbs$ file test_basic
    test_basic: Mach-O universal binary with 2 architectures
    test_basic (for architecture i386): Mach-O executable i386
    test_basic (for architecture x86_64): Mach-O 64-bit executable x86_64
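Step 3 above can also be checked from a script. The following is a minimal sketch, assuming Python 3.7 or later; the function names `architectures` and `is_fat_binary` are hypothetical and are not part of the test suite. It parses the output of `file`, using the sample output shown above as input.

```python
import re
import subprocess


def architectures(file_output):
    """Return the architecture names listed in the output of `file`."""
    return re.findall(r"\(for architecture ([^)]+)\)", file_output)


def is_fat_binary(path, expected=("i386", "x86_64")):
    """Run `file` on path and check that every expected architecture is listed."""
    result = subprocess.run(["file", path], capture_output=True, text=True, check=True)
    return set(expected) <= set(architectures(result.stdout))


# Sample output copied from the README text above.
sample = (
    "test_basic: Mach-O universal binary with 2 architectures\n"
    "test_basic (for architecture i386): Mach-O executable i386\n"
    "test_basic (for architecture x86_64): Mach-O 64-bit executable x86_64\n"
)
print(architectures(sample))  # prints ['i386', 'x86_64']
```

`is_fat_binary` is only defined here, not called, because it needs a real binary and the `file` utility on the path.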
---------------------------------------------------------
Setting up the test for OATS and adding it
---------------------------------------------------------
1) Make one or more run_subtest scripts in the directory that run the particular tests. Try to group the sub-tests together in logical units to make it easier to see them in OATS. (E.g., "run_step" runs step, stepf, smoothstep, and smoothstepf.) Note that each of these scripts can call only one executable because OATS accepts only one test_start()/test_finish() pair per test.
2) Add the test to OATS in the Test Admin page. Name the test "CL Test - subtest" and put in the path to the tests. (E.g., "CL Common Functions - step" points to "OpenCL_Tests/test_conformance/commonfns/run_step".)
3) Add the test to the test suite (e.g., either OpenCL Tests or OpenCL Long Tests). Set the test run order such that basic functionality tests have a lower value (run first) and performance/application tests have a higher value (run last).
4) Add the test directory to the appropriate Makefile, and verify that it builds. This file is used when the tests are built for OATS. You should just need to add the test directory to the list of directories at the top.
---------------------------------------------------------
Setting up the test for running it locally
---------------------------------------------------------
1) Add the test run script to the list of tests in run_tests_local.py if the test is a short test.

clean_tests.py Executable file

@@ -0,0 +1,104 @@
#!/usr/bin/python

import sys, os, re
from subprocess import Popen, PIPE
from optparse import OptionParser

# trail_spaces: This method removes the trailing whitespaces and trailing tabs
def trail_spaces(line):
    newline=line
    carreturn = 0
    if re.search("\r\n",line):
        carreturn = 1
    status = re.search("\s+$",line)
    if status:
        if carreturn:
            newline = re.sub("\s+$","\r\n",line)
        else:
            newline = re.sub("\s+$","\n",line)
    status = re.search("\t+$",newline)
    if status:
        newline = re.sub("\t+$","",newline)
    return newline

#convert_tabs: This methos converts tabs to 4 spaces
def convert_tabs(line):
    newline=line
    status = re.search("\t",line)
    if status:
        newline = re.sub("\t","    ",line)
    return newline

#convert_lineends: This method converts lineendings from DOS to Unix
def convert_lineends(line):
    newline=line
    status = re.search("\r\n",line)
    if status:
        newline = re.sub("\r\n","\n",line)
    return newline

#processfile: This method processes each file passed to it depending
# on the flags passed
def processfile(file,tabs, lineends,trails,verbose):
    processed_data = []
    if verbose:
        print "processing file: "+file
    with open(file,'r') as fr:
        data = fr.readlines()
        for line in data:
            if tabs:
                line = convert_tabs(line)
            if lineends:
                line = convert_lineends(line)
            if trails:
                line = trail_spaces(line)
            processed_data.append(line)
    with open(file,'w') as fw:
        fw.writelines(processed_data)

#findfiles: This method finds all the code files present in current
# directory and subdirectories.
def findfiles(tabs,lineends,trails,verbose):
    testfiles = []
    for root, dirs, files in os.walk("./"):
        for file in files:
            for extn in ('.c','.cpp','.h','.hpp'):
                if file.endswith(extn):
                    testfiles.append(os.path.join(root, file))
    for file in testfiles:
        processfile(file,tabs,lineends,trails,verbose)

# Main function
def main():
    parser = OptionParser()
    parser.add_option("--notabs", dest="tabs", action="store_false", default=True, help="Disable converting tabs to 4 spaces.")
    parser.add_option("--notrails", dest="trails", action="store_false", default=True, help="Disable removing trailing whitespaces and trailing tabs.")
    parser.add_option("--nolineends", dest="lineends", action="store_false", default=True, help=" Disable converting line endings to Unix from DOS.")
    parser.add_option("--verbose", dest="verbose", action="store_true", default=False, help="Prints out the files being processed.")
    parser.add_option("--git", dest="SHA1", default="", help="Processes only the files present in the particular <SHA1> commit.")
    parser.add_option('-o', action="store", default=True, help="Default: All the code files (.c,.cpp,.h,.hpp) in the current directory and subdirectories will be processed")
    (options, args) = parser.parse_args()
    if options.SHA1:
        pl = Popen(["git","show", "--pretty=format:", "--name-only",options.SHA1], stdout=PIPE)
        cmdout = pl.communicate()[0]
        gitout=cmdout.split("\n")
        for file in gitout:
            print file
            if file:
                processfile(file,options.tabs,options.lineends,options.trails,options.verbose)
    if not options.SHA1:
        findfiles(options.tabs,options.lineends,options.trails,options.verbose)

# start the process by calling main
main()
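The script above is written for Python 2 (it uses `print` statements). As a minimal sketch, assuming Python 3, the three per-line clean-ups it describes (tabs to 4 spaces, DOS to Unix line endings, trailing whitespace removal) can be combined in one function. The name `normalize_line` and its keyword arguments are hypothetical and do not appear in the commit; the behaviour is an approximation of the original functions, not a drop-in replacement.

```python
def normalize_line(line, tabs=True, lineends=True, trails=True):
    """Apply the tab, line-ending and trailing-whitespace clean-ups to one line."""
    # Split off the line ending so it can be handled separately from the text.
    if line.endswith("\r\n"):
        body, ending = line[:-2], "\r\n"
    elif line.endswith("\n"):
        body, ending = line[:-1], "\n"
    else:
        body, ending = line, ""
    if tabs:
        # Replace every tab with 4 spaces.
        body = body.replace("\t", "    ")
    if lineends and ending == "\r\n":
        # Convert a DOS line ending to a Unix one.
        ending = "\n"
    if trails:
        # Drop trailing spaces and tabs, keeping the line ending.
        body = body.rstrip(" \t")
    return body + ending


print(repr(normalize_line("a\tb  \r\n")))  # prints 'a    b\n'
```

When reading files in Python 3, open them with `newline=""` so that `\r\n` is not translated to `\n` before the function sees it.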

@@ -1,26 +1,26 @@
PRODUCTS = harness/\
# utils/

TOP=$(shell pwd)

all: $(PRODUCTS)

clean:
	@for testdir in $(dir $(PRODUCTS)) ; \
	do ( \
	echo "==================================================================================" ; \
	echo "Cleaning $$testdir" ; \
	echo "==================================================================================" ; \
	cd $$testdir && make clean \
	); \
	done \

$(PRODUCTS):
	@echo "==================================================================================" ;
	@echo "(`date "+%H:%M:%S"`) Make $@" ;
	@echo "==================================================================================" ;
	cd $(dir $@) && make

.PHONY: clean $(PRODUCTS) all

@@ -1,52 +1,52 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#ifndef _gl_headers_h
#define _gl_headers_h

#if defined( __APPLE__ )
    #include <OpenGL/OpenGL.h>
#if defined(CGL_VERSION_1_3)
    #include <OpenGL/gl3.h>
    #include <OpenGL/gl3ext.h>
#else
    #include <OpenGL/gl.h>
    #include <OpenGL/glext.h>
#endif
    #include <GLUT/glut.h>
#else
#ifdef _WIN32
    #include <windows.h>
#endif
    #include <GL/glew.h>
    #include <GL/gl.h>
    #include <GL/glext.h>
#ifdef _WIN32
    #include <GL/glut.h>
#else
    #include <GL/freeglut.h>
#endif
#endif

#ifdef _WIN32
    GLboolean gluCheckExtension(const GLubyte *extName, const GLubyte *extString);
    // No glutGetProcAddress in the standard glut v3.7.
    #define glutGetProcAddress(procName) wglGetProcAddress(procName)
#endif

#endif // _gl_headers_h

File diff suppressed because it is too large

@@ -1,288 +1,288 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#ifndef _helpers_h
#define _helpers_h

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <string.h>

#if !defined(_WIN32)
#include <stdbool.h>
#endif

#include <sys/types.h>
#include <sys/stat.h>

#if !defined (__APPLE__)
#include <CL/cl.h>
#include "gl_headers.h"
#include <CL/cl_gl.h>
#else
#include "gl_headers.h"
#endif

#include "../../test_common/harness/errorHelpers.h"
#include "../../test_common/harness/kernelHelpers.h"
#include "../../test_common/harness/threadTesting.h"
#include "../../test_common/harness/typeWrappers.h"
#include "../../test_common/harness/conversions.h"
#include "../../test_common/harness/mt19937.h"
typedef cl_mem
(CL_API_CALL *clCreateFromGLBuffer_fn)(cl_context context,
                                       cl_mem_flags flags,
                                       GLuint bufobj,
                                       int * errcode_ret);

typedef cl_mem
(CL_API_CALL *clCreateFromGLTexture_fn)(cl_context context,
                                        cl_mem_flags flags,
                                        GLenum target,
                                        GLint miplevel,
                                        GLuint texture,
                                        cl_int * errcode_ret);

typedef cl_mem
(CL_API_CALL *clCreateFromGLTexture2D_fn)(cl_context context,
                                          cl_mem_flags flags,
                                          GLenum target,
                                          GLint miplevel,
                                          GLuint texture,
                                          cl_int * errcode_ret);

typedef cl_mem
(CL_API_CALL *clCreateFromGLTexture3D_fn)(cl_context context,
                                          cl_mem_flags flags,
                                          GLenum target,
                                          GLint miplevel,
                                          GLuint texture,
                                          cl_int * errcode_ret);

typedef cl_mem
(CL_API_CALL *clCreateFromGLRenderbuffer_fn)(cl_context context,
                                             cl_mem_flags flags,
                                             GLuint renderbuffer,
                                             cl_int * errcode_ret);

typedef cl_int
(CL_API_CALL *clGetGLObjectInfo_fn)(cl_mem memobj,
                                    cl_gl_object_type * gl_object_type,
                                    GLuint * gl_object_name);

typedef cl_int
(CL_API_CALL *clGetGLTextureInfo_fn)(cl_mem memobj,
                                     cl_gl_texture_info param_name,
                                     size_t param_value_size,
                                     void * param_value,
                                     size_t * param_value_size_ret);

typedef cl_int
(CL_API_CALL *clEnqueueAcquireGLObjects_fn)(cl_command_queue command_queue,
                                            cl_uint num_objects,
                                            const cl_mem * mem_objects,
                                            cl_uint num_events_in_wait_list,
                                            const cl_event * event_wait_list,
                                            cl_event * event);

typedef cl_int
(CL_API_CALL *clEnqueueReleaseGLObjects_fn)(cl_command_queue command_queue,
                                            cl_uint num_objects,
                                            const cl_mem * mem_objects,
                                            cl_uint num_events_in_wait_list,
                                            const cl_event * event_wait_list,
                                            cl_event * event);

extern clCreateFromGLBuffer_fn clCreateFromGLBuffer_ptr;
extern clCreateFromGLTexture_fn clCreateFromGLTexture_ptr;
extern clCreateFromGLTexture2D_fn clCreateFromGLTexture2D_ptr;
extern clCreateFromGLTexture3D_fn clCreateFromGLTexture3D_ptr;
extern clCreateFromGLRenderbuffer_fn clCreateFromGLRenderbuffer_ptr;
extern clGetGLObjectInfo_fn clGetGLObjectInfo_ptr;
extern clGetGLTextureInfo_fn clGetGLTextureInfo_ptr;
extern clEnqueueAcquireGLObjects_fn clEnqueueAcquireGLObjects_ptr;
extern clEnqueueReleaseGLObjects_fn clEnqueueReleaseGLObjects_ptr;
class glBufferWrapper
{
    public:
        glBufferWrapper() { mBuffer = 0; }
        glBufferWrapper( GLuint b ) { mBuffer = b; }
        ~glBufferWrapper() { if( mBuffer != 0 ) glDeleteBuffers( 1, &mBuffer ); }

        glBufferWrapper & operator=( const GLuint &rhs ) { mBuffer = rhs; return *this; }
        operator GLuint() { return mBuffer; }
        operator GLuint *() { return &mBuffer; }

        GLuint * operator&() { return &mBuffer; }

        bool operator==( GLuint rhs ) { return mBuffer == rhs; }

    protected:
        GLuint mBuffer;
};

class glTextureWrapper
{
    public:
        glTextureWrapper() { mHandle = 0; }
        glTextureWrapper( GLuint b ) { mHandle = b; }
        ~glTextureWrapper() {
            if( mHandle != 0 ) glDeleteTextures( 1, &mHandle );
        }

        glTextureWrapper & operator=( const GLuint &rhs ) { mHandle = rhs; return *this; }
        operator GLuint() { return mHandle; }
        operator GLuint *() { return &mHandle; }

        GLuint * operator&() { return &mHandle; }

        bool operator==( GLuint rhs ) { return mHandle == rhs; }

    protected:
        // The texture handle.
        GLuint mHandle;
};

class glRenderbufferWrapper
{
    public:
        glRenderbufferWrapper() { mBuffer = 0; }
        glRenderbufferWrapper( GLuint b ) { mBuffer = b; }
        ~glRenderbufferWrapper() { if( mBuffer != 0 ) glDeleteRenderbuffersEXT( 1, &mBuffer ); }

        glRenderbufferWrapper & operator=( const GLuint &rhs ) { mBuffer = rhs; return *this; }
        operator GLuint() { return mBuffer; }
        operator GLuint *() { return &mBuffer; }

        GLuint * operator&() { return &mBuffer; }

        bool operator==( GLuint rhs ) { return mBuffer == rhs; }

    protected:
        GLuint mBuffer;
};

class glFramebufferWrapper
{
    public:
        glFramebufferWrapper() { mBuffer = 0; }
        glFramebufferWrapper( GLuint b ) { mBuffer = b; }
        ~glFramebufferWrapper() { if( mBuffer != 0 ) glDeleteFramebuffersEXT( 1, &mBuffer ); }

        glFramebufferWrapper & operator=( const GLuint &rhs ) { mBuffer = rhs; return *this; }
        operator GLuint() { return mBuffer; }
        operator GLuint *() { return &mBuffer; }

        GLuint * operator&() { return &mBuffer; }

        bool operator==( GLuint rhs ) { return mBuffer == rhs; }

    protected:
        GLuint mBuffer;
};
// Helper functions (defined in helpers.cpp)

extern void * CreateGLTexture1DArray( size_t width, size_t length,
    GLenum target, GLenum glFormat, GLenum internalFormat, GLenum glType,
    ExplicitType type, GLuint *outTextureID, int *outError,
    bool allocateMem, MTdata d);

extern void * CreateGLTexture2DArray( size_t width, size_t height, size_t length,
    GLenum target, GLenum glFormat, GLenum internalFormat, GLenum glType,
    ExplicitType type, GLuint *outTextureID, int *outError,
    bool allocateMem, MTdata d);

extern void * CreateGLTextureBuffer( size_t width,
    GLenum target, GLenum glFormat, GLenum internalFormat, GLenum glType,
    ExplicitType type, GLuint *outTex, GLuint *outBuf, int *outError,
    bool allocateMem, MTdata d);

extern void * CreateGLTexture1D(size_t width,
    GLenum target, GLenum glFormat,
    GLenum internalFormat, GLenum glType,
    ExplicitType type, GLuint *outTextureID,
    int *outError, bool allocateMem, MTdata d );

extern void * CreateGLTexture2D( size_t width, size_t height,
    GLenum target, GLenum glFormat,
    GLenum internalFormat, GLenum glType,
    ExplicitType type, GLuint *outTextureID,
    int *outError, bool allocateMem, MTdata d );

extern void * CreateGLTexture3D( size_t width, size_t height, size_t depth,
    GLenum target, GLenum glFormat,
    GLenum internalFormat, GLenum glType,
    ExplicitType type, GLuint *outTextureID,
    int *outError, MTdata d, bool allocateMem = true );

extern void * ReadGLTexture( GLenum glTarget, GLuint glTexture, GLuint glBuf, GLint width,
    GLenum glFormat, GLenum glInternalFormat,
    GLenum glType, ExplicitType typeToReadAs,
    size_t outWidth, size_t outHeight );

extern int CreateGLRenderbufferRaw( GLsizei width, GLsizei height,
    GLenum target, GLenum glFormat,
    GLenum internalFormat, GLenum glType,
    GLuint *outFramebuffer,
    GLuint *outRenderbuffer );

extern void * CreateGLRenderbuffer( GLsizei width, GLsizei height,
    GLenum target, GLenum glFormat,
    GLenum internalFormat, GLenum glType,
    ExplicitType type,
    GLuint *outFramebuffer,
    GLuint *outRenderbuffer,
    int *outError, MTdata d, bool allocateMem );

extern void * ReadGLRenderbuffer( GLuint glFramebuffer, GLuint glRenderbuffer,
    GLenum attachment, GLenum glFormat,
    GLenum glInternalFormat, GLenum glType,
    ExplicitType typeToReadAs,
    size_t outWidth, size_t outHeight );

extern void DumpGLBuffer(GLenum type, size_t width, size_t height, void* buffer);
extern const char *GetGLTypeName( GLenum type ); extern const char *GetGLTypeName( GLenum type );
extern const char *GetGLAttachmentName( GLenum att ); extern const char *GetGLAttachmentName( GLenum att );
extern const char *GetGLTargetName( GLenum tgt ); extern const char *GetGLTargetName( GLenum tgt );
extern const char *GetGLBaseFormatName( GLenum baseformat ); extern const char *GetGLBaseFormatName( GLenum baseformat );
extern const char *GetGLFormatName( GLenum format ); extern const char *GetGLFormatName( GLenum format );
extern void* CreateRandomData( ExplicitType type, size_t count, MTdata d ); extern void* CreateRandomData( ExplicitType type, size_t count, MTdata d );
extern GLenum GetGLFormat(GLenum internalFormat); extern GLenum GetGLFormat(GLenum internalFormat);
extern GLenum GetGLTypeForExplicitType(ExplicitType type); extern GLenum GetGLTypeForExplicitType(ExplicitType type);
extern size_t GetGLTypeSize(GLenum type); extern size_t GetGLTypeSize(GLenum type);
extern ExplicitType GetExplicitTypeForGLType(GLenum type); extern ExplicitType GetExplicitTypeForGLType(GLenum type);
extern GLenum get_base_gl_target( GLenum target ); extern GLenum get_base_gl_target( GLenum target );
extern int init_clgl_ext( void ); extern int init_clgl_ext( void );
#endif // _helpers_h #endif // _helpers_h


@@ -1,48 +1,48 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _setup_h #ifndef _setup_h
#define _setup_h #define _setup_h
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
#include <string.h> #include <string.h>
#include "gl_headers.h" #include "gl_headers.h"
#ifdef __APPLE__ #ifdef __APPLE__
#include <OpenCL/opencl.h> #include <OpenCL/opencl.h>
#else #else
#include <CL/opencl.h> #include <CL/opencl.h>
#endif #endif
// Note: the idea here is to have every platform define their own setup.cpp file that implements a GLEnvironment // Note: the idea here is to have every platform define their own setup.cpp file that implements a GLEnvironment
// subclass internally, then return it as a definition for GLEnvironment::Create // subclass internally, then return it as a definition for GLEnvironment::Create
class GLEnvironment class GLEnvironment
{ {
public: public:
GLEnvironment() {} GLEnvironment() {}
virtual ~GLEnvironment() {} virtual ~GLEnvironment() {}
virtual int Init( int *argc, char **argv, int use_opengl_32 ) = 0; virtual int Init( int *argc, char **argv, int use_opengl_32 ) = 0;
virtual cl_context CreateCLContext( void ) = 0; virtual cl_context CreateCLContext( void ) = 0;
virtual int SupportsCLGLInterop( cl_device_type device_type) = 0; virtual int SupportsCLGLInterop( cl_device_type device_type) = 0;
static GLEnvironment * Instance( void ); static GLEnvironment * Instance( void );
}; };
#endif // _setup_h #endif // _setup_h
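
The note in this header describes a per-platform back end behind a single `GLEnvironment::Instance()` entry point. A minimal sketch of that pattern follows. It assumes placeholder typedefs for `cl_context` and `cl_device_type` so that it compiles without the OpenCL headers, and `StubGLEnvironment` is a hypothetical back end that is not part of this commit.

```cpp
#include <stddef.h>
#include <stdint.h>

// Placeholder types (assumption): the real header takes cl_context and
// cl_device_type from the OpenCL headers.
typedef void *cl_context;
typedef uint64_t cl_device_type;

// Same interface as the header above.
class GLEnvironment
{
public:
    GLEnvironment() {}
    virtual ~GLEnvironment() {}
    virtual int Init( int *argc, char **argv, int use_opengl_32 ) = 0;
    virtual cl_context CreateCLContext( void ) = 0;
    virtual int SupportsCLGLInterop( cl_device_type device_type ) = 0;
    static GLEnvironment * Instance( void );
};

// Hypothetical back end for a platform with no window system and no interop.
class StubGLEnvironment : public GLEnvironment
{
public:
    virtual int Init( int *, char **, int ) { return 0; }
    virtual cl_context CreateCLContext( void ) { return NULL; }
    virtual int SupportsCLGLInterop( cl_device_type ) { return 0; }
};

// Each platform's setup.cpp supplies the one definition of Instance().
GLEnvironment * GLEnvironment::Instance( void )
{
    static StubGLEnvironment env;   // constructed on first call
    return &env;
}
```

The sketch uses a function-local static object rather than the `new`-on-first-use pointer seen in the platform files of this commit; either way, callers only ever see the base-class pointer.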


@@ -1,156 +1,156 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "setup.h" #include "setup.h"
#include "../../test_common/harness/errorHelpers.h" #include "../../test_common/harness/errorHelpers.h"
#include <OpenGL/CGLDevice.h> #include <OpenGL/CGLDevice.h>
class OSXGLEnvironment : public GLEnvironment class OSXGLEnvironment : public GLEnvironment
{ {
public: public:
OSXGLEnvironment() OSXGLEnvironment()
{ {
mCGLContext = NULL; mCGLContext = NULL;
} }
virtual int Init( int *argc, char **argv, int use_opengl_32 ) virtual int Init( int *argc, char **argv, int use_opengl_32 )
{ {
if (!use_opengl_32) { if (!use_opengl_32) {
// Create a GLUT window to render into // Create a GLUT window to render into
glutInit( argc, argv ); glutInit( argc, argv );
glutInitWindowSize( 512, 512 ); glutInitWindowSize( 512, 512 );
glutInitDisplayMode( GLUT_RGB | GLUT_DOUBLE ); glutInitDisplayMode( GLUT_RGB | GLUT_DOUBLE );
glutCreateWindow( "OpenCL <-> OpenGL Test" ); glutCreateWindow( "OpenCL <-> OpenGL Test" );
} }
else { else {
CGLPixelFormatAttribute attribs[] = { CGLPixelFormatAttribute attribs[] = {
kCGLPFAOpenGLProfile, (CGLPixelFormatAttribute)kCGLOGLPVersion_3_2_Core, kCGLPFAOpenGLProfile, (CGLPixelFormatAttribute)kCGLOGLPVersion_3_2_Core,
kCGLPFAAllowOfflineRenderers, kCGLPFAAllowOfflineRenderers,
kCGLPFANoRecovery, kCGLPFANoRecovery,
kCGLPFAAccelerated, kCGLPFAAccelerated,
kCGLPFADoubleBuffer, kCGLPFADoubleBuffer,
(CGLPixelFormatAttribute)0 (CGLPixelFormatAttribute)0
}; };
CGLError err; CGLError err;
CGLPixelFormatObj pix; CGLPixelFormatObj pix;
GLint npix; GLint npix;
err = CGLChoosePixelFormat (attribs, &pix, &npix); err = CGLChoosePixelFormat (attribs, &pix, &npix);
if(err != kCGLNoError) if(err != kCGLNoError)
{ {
log_error("Failed to choose pixel format\n"); log_error("Failed to choose pixel format\n");
return -1; return -1;
} }
err = CGLCreateContext(pix, NULL, &mCGLContext); err = CGLCreateContext(pix, NULL, &mCGLContext);
if(err != kCGLNoError) if(err != kCGLNoError)
{ {
log_error("Failed to create GL context\n"); log_error("Failed to create GL context\n");
return -1; return -1;
} }
CGLSetCurrentContext(mCGLContext); CGLSetCurrentContext(mCGLContext);
} }
return 0; return 0;
} }
virtual cl_context CreateCLContext( void ) virtual cl_context CreateCLContext( void )
{ {
int error; int error;
if( mCGLContext == NULL ) if( mCGLContext == NULL )
mCGLContext = CGLGetCurrentContext(); mCGLContext = CGLGetCurrentContext();
CGLShareGroupObj share_group = CGLGetShareGroup(mCGLContext); CGLShareGroupObj share_group = CGLGetShareGroup(mCGLContext);
cl_context_properties properties[] = { CL_CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE, (cl_context_properties)share_group, 0 }; cl_context_properties properties[] = { CL_CONTEXT_PROPERTY_USE_CGL_SHAREGROUP_APPLE, (cl_context_properties)share_group, 0 };
cl_context context = clCreateContext(properties, 0, 0, 0, 0, &error); cl_context context = clCreateContext(properties, 0, 0, 0, 0, &error);
if (error) { if (error) {
print_error(error, "clCreateContext failed"); print_error(error, "clCreateContext failed");
return NULL; return NULL;
} }
// Verify that all devices in the context support the required extension // Verify that all devices in the context support the required extension
cl_device_id devices[64]; cl_device_id devices[64];
size_t size_out; size_t size_out;
error = clGetContextInfo(context, CL_CONTEXT_DEVICES, sizeof(devices), devices, &size_out); error = clGetContextInfo(context, CL_CONTEXT_DEVICES, sizeof(devices), devices, &size_out);
if (error) { if (error) {
print_error(error, "clGetContextInfo failed"); print_error(error, "clGetContextInfo failed");
return NULL; return NULL;
} }
char extensions[8192]; char extensions[8192];
for (int i=0; i<(int)(size_out/sizeof(cl_device_id)); i++) { for (int i=0; i<(int)(size_out/sizeof(cl_device_id)); i++) {
error = clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, sizeof(extensions), extensions, NULL); error = clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, sizeof(extensions), extensions, NULL);
if (error) { if (error) {
print_error(error, "clGetDeviceInfo failed"); print_error(error, "clGetDeviceInfo failed");
return NULL; return NULL;
} }
if (strstr(extensions, "cl_APPLE_gl_sharing") == NULL) { if (strstr(extensions, "cl_APPLE_gl_sharing") == NULL) {
log_error("Device %d does not support required extension cl_APPLE_gl_sharing.\n", i); log_error("Device %d does not support required extension cl_APPLE_gl_sharing.\n", i);
return NULL; return NULL;
} }
} }
return context; return context;
} }
virtual int SupportsCLGLInterop( cl_device_type device_type ) virtual int SupportsCLGLInterop( cl_device_type device_type )
{ {
int found_valid_device = 0; int found_valid_device = 0;
cl_device_id devices[64]; cl_device_id devices[64];
cl_uint num_of_devices; cl_uint num_of_devices;
int error; int error;
error = clGetDeviceIDs(NULL, device_type, 64, devices, &num_of_devices); error = clGetDeviceIDs(NULL, device_type, 64, devices, &num_of_devices);
if (error) { if (error) {
print_error(error, "clGetDeviceIDs failed"); print_error(error, "clGetDeviceIDs failed");
return -1; return -1;
} }
char extensions[8192]; char extensions[8192];
for (int i=0; i<(int)num_of_devices; i++) { for (int i=0; i<(int)num_of_devices; i++) {
error = clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, sizeof(extensions), extensions, NULL); error = clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, sizeof(extensions), extensions, NULL);
if (error) { if (error) {
print_error(error, "clGetDeviceInfo failed"); print_error(error, "clGetDeviceInfo failed");
return -1; return -1;
} }
if (strstr(extensions, "cl_APPLE_gl_sharing") == NULL) { if (strstr(extensions, "cl_APPLE_gl_sharing") == NULL) {
log_info("Device %d of %d does not support required extension cl_APPLE_gl_sharing.\n", i, num_of_devices); log_info("Device %d of %d does not support required extension cl_APPLE_gl_sharing.\n", i, num_of_devices);
} else { } else {
log_info("Device %d of %d does support required extension cl_APPLE_gl_sharing.\n", i, num_of_devices); log_info("Device %d of %d does support required extension cl_APPLE_gl_sharing.\n", i, num_of_devices);
found_valid_device = 1; found_valid_device = 1;
} }
} }
return found_valid_device; return found_valid_device;
} }
virtual ~OSXGLEnvironment() virtual ~OSXGLEnvironment()
{ {
CGLDestroyContext( mCGLContext ); CGLDestroyContext( mCGLContext );
} }
CGLContextObj mCGLContext; CGLContextObj mCGLContext;
}; };
GLEnvironment * GLEnvironment::Instance( void ) GLEnvironment * GLEnvironment::Instance( void )
{ {
static OSXGLEnvironment * env = NULL; static OSXGLEnvironment * env = NULL;
if( env == NULL ) if( env == NULL )
env = new OSXGLEnvironment(); env = new OSXGLEnvironment();
return env; return env;
} }


@@ -1,204 +1,204 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#define GL_GLEXT_PROTOTYPES #define GL_GLEXT_PROTOTYPES
#include "setup.h" #include "setup.h"
#include "testBase.h" #include "testBase.h"
#include "../../test_common/harness/errorHelpers.h" #include "../../test_common/harness/errorHelpers.h"
#include <GL/gl.h> #include <GL/gl.h>
#include <GL/glut.h> #include <GL/glut.h>
#include <GL/glext.h> #include <GL/glext.h>
#include <GL/glut.h> #include <GL/glut.h>
#include <CL/cl_ext.h> #include <CL/cl_ext.h>
typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetGLContextInfoKHR_fn)( typedef CL_API_ENTRY cl_int (CL_API_CALL *clGetGLContextInfoKHR_fn)(
const cl_context_properties *properties, const cl_context_properties *properties,
cl_gl_context_info param_name, cl_gl_context_info param_name,
size_t param_value_size, size_t param_value_size,
void *param_value, void *param_value,
size_t *param_value_size_ret); size_t *param_value_size_ret);
// Rename references to this dynamically linked function to avoid // Rename references to this dynamically linked function to avoid
// collision with static link version // collision with static link version
#define clGetGLContextInfoKHR clGetGLContextInfoKHR_proc #define clGetGLContextInfoKHR clGetGLContextInfoKHR_proc
static clGetGLContextInfoKHR_fn clGetGLContextInfoKHR; static clGetGLContextInfoKHR_fn clGetGLContextInfoKHR;
#define MAX_DEVICES 32 #define MAX_DEVICES 32
class WGLEnvironment : public GLEnvironment class WGLEnvironment : public GLEnvironment
{ {
private: private:
cl_device_id m_devices[MAX_DEVICES]; cl_device_id m_devices[MAX_DEVICES];
int m_device_count; int m_device_count;
cl_platform_id m_platform; cl_platform_id m_platform;
public: public:
WGLEnvironment() WGLEnvironment()
{ {
m_device_count = 0; m_device_count = 0;
m_platform = 0; m_platform = 0;
} }
virtual int Init( int *argc, char **argv, int use_opengl_32 ) virtual int Init( int *argc, char **argv, int use_opengl_32 )
{ {
// Create a GLUT window to render into // Create a GLUT window to render into
glutInit( argc, argv ); glutInit( argc, argv );
glutInitWindowSize( 512, 512 ); glutInitWindowSize( 512, 512 );
glutInitDisplayMode( GLUT_RGB | GLUT_DOUBLE ); glutInitDisplayMode( GLUT_RGB | GLUT_DOUBLE );
glutCreateWindow( "OpenCL <-> OpenGL Test" ); glutCreateWindow( "OpenCL <-> OpenGL Test" );
glewInit(); glewInit();
return 0; return 0;
} }
virtual cl_context CreateCLContext( void ) virtual cl_context CreateCLContext( void )
{ {
HGLRC hGLRC = wglGetCurrentContext(); HGLRC hGLRC = wglGetCurrentContext();
HDC hDC = wglGetCurrentDC(); HDC hDC = wglGetCurrentDC();
cl_context_properties properties[] = { cl_context_properties properties[] = {
CL_CONTEXT_PLATFORM, (cl_context_properties) m_platform, CL_CONTEXT_PLATFORM, (cl_context_properties) m_platform,
CL_GL_CONTEXT_KHR, (cl_context_properties) hGLRC, CL_GL_CONTEXT_KHR, (cl_context_properties) hGLRC,
CL_WGL_HDC_KHR, (cl_context_properties) hDC, CL_WGL_HDC_KHR, (cl_context_properties) hDC,
0 0
}; };
cl_device_id devices[MAX_DEVICES]; cl_device_id devices[MAX_DEVICES];
size_t dev_size; size_t dev_size;
cl_int status; cl_int status;
if (!hGLRC || !hDC) { if (!hGLRC || !hDC) {
print_error(CL_INVALID_CONTEXT, "No GL context bound"); print_error(CL_INVALID_CONTEXT, "No GL context bound");
return 0; return 0;
} }
if (!clGetGLContextInfoKHR) { if (!clGetGLContextInfoKHR) {
// Ask OpenCL for the platforms. Warn if more than one platform found, // Ask OpenCL for the platforms. Warn if more than one platform found,
// since this might not be the platform we want. By default, we simply // since this might not be the platform we want. By default, we simply
// use the first returned platform. // use the first returned platform.
cl_uint nplatforms; cl_uint nplatforms;
cl_platform_id platform; cl_platform_id platform;
clGetPlatformIDs(0, NULL, &nplatforms); clGetPlatformIDs(0, NULL, &nplatforms);
clGetPlatformIDs(1, &platform, NULL); clGetPlatformIDs(1, &platform, NULL);
if (nplatforms > 1) { if (nplatforms > 1) {
log_info("clGetPlatformIDs returned multiple values. This is not " log_info("clGetPlatformIDs returned multiple values. This is not "
"an error, but might result in obtaining incorrect function " "an error, but might result in obtaining incorrect function "
"pointers if you do not want the first returned platform.\n"); "pointers if you do not want the first returned platform.\n");
// Show them the platform name, in case it is a problem. // Show them the platform name, in case it is a problem.
size_t size; size_t size;
char *name; char *name;
clGetPlatformInfo(platform, CL_PLATFORM_NAME, 0, NULL, &size); clGetPlatformInfo(platform, CL_PLATFORM_NAME, 0, NULL, &size);
name = (char*)malloc(size); name = (char*)malloc(size);
clGetPlatformInfo(platform, CL_PLATFORM_NAME, size, name, NULL); clGetPlatformInfo(platform, CL_PLATFORM_NAME, size, name, NULL);
log_info("Using platform with name: %s \n", name); log_info("Using platform with name: %s \n", name);
free(name); free(name);
} }
clGetGLContextInfoKHR = (clGetGLContextInfoKHR_fn) clGetExtensionFunctionAddressForPlatform(platform, "clGetGLContextInfoKHR"); clGetGLContextInfoKHR = (clGetGLContextInfoKHR_fn) clGetExtensionFunctionAddressForPlatform(platform, "clGetGLContextInfoKHR");
if (!clGetGLContextInfoKHR) { if (!clGetGLContextInfoKHR) {
print_error(CL_INVALID_PLATFORM, "Failed to query proc address for clGetGLContextInfoKHR"); print_error(CL_INVALID_PLATFORM, "Failed to query proc address for clGetGLContextInfoKHR");
return 0; return 0;
} }
} }
status = clGetGLContextInfoKHR(properties, status = clGetGLContextInfoKHR(properties,
CL_DEVICES_FOR_GL_CONTEXT_KHR, CL_DEVICES_FOR_GL_CONTEXT_KHR,
sizeof(devices), sizeof(devices),
devices, devices,
&dev_size); &dev_size);
if (status != CL_SUCCESS) { if (status != CL_SUCCESS) {
print_error(status, "clGetGLContextInfoKHR failed"); print_error(status, "clGetGLContextInfoKHR failed");
return 0; return 0;
} }
dev_size /= sizeof(cl_device_id); dev_size /= sizeof(cl_device_id);
log_info("GL context supports %d compute devices\n", (int)dev_size); log_info("GL context supports %d compute devices\n", (int)dev_size);
status = clGetGLContextInfoKHR(properties, status = clGetGLContextInfoKHR(properties,
CL_CURRENT_DEVICE_FOR_GL_CONTEXT_KHR, CL_CURRENT_DEVICE_FOR_GL_CONTEXT_KHR,
sizeof(devices), sizeof(devices),
devices, devices,
&dev_size); &dev_size);
if (status != CL_SUCCESS) { if (status != CL_SUCCESS) {
print_error(status, "clGetGLContextInfoKHR failed"); print_error(status, "clGetGLContextInfoKHR failed");
return 0; return 0;
} }
cl_device_id ctxDevice = m_devices[0]; cl_device_id ctxDevice = m_devices[0];
if (dev_size > 0) { if (dev_size > 0) {
log_info("GL context current device: %p\n", (void *)devices[0]); log_info("GL context current device: %p\n", (void *)devices[0]);
for (int i = 0; i < m_device_count; i++) { for (int i = 0; i < m_device_count; i++) {
if (m_devices[i] == devices[0]) { if (m_devices[i] == devices[0]) {
ctxDevice = devices[0]; ctxDevice = devices[0];
break; break;
} }
} }
} else { } else {
log_info("GL context current device is not a CL device, using device %p.\n", (void *)ctxDevice); log_info("GL context current device is not a CL device, using device %p.\n", (void *)ctxDevice);
} }
return clCreateContext(properties, 1, &ctxDevice, NULL, NULL, &status); return clCreateContext(properties, 1, &ctxDevice, NULL, NULL, &status);
} }
virtual int SupportsCLGLInterop( cl_device_type device_type ) virtual int SupportsCLGLInterop( cl_device_type device_type )
{ {
cl_device_id devices[MAX_DEVICES]; cl_device_id devices[MAX_DEVICES];
cl_uint num_of_devices; cl_uint num_of_devices;
int error; int error;
error = clGetPlatformIDs(1, &m_platform, NULL); error = clGetPlatformIDs(1, &m_platform, NULL);
if (error) { if (error) {
print_error(error, "clGetPlatformIDs failed"); print_error(error, "clGetPlatformIDs failed");
return -1; return -1;
} }
error = clGetDeviceIDs(m_platform, device_type, MAX_DEVICES, devices, &num_of_devices); error = clGetDeviceIDs(m_platform, device_type, MAX_DEVICES, devices, &num_of_devices);
if (error) { if (error) {
print_error(error, "clGetDeviceIDs failed"); print_error(error, "clGetDeviceIDs failed");
return -1; return -1;
} }
// Check all devices, search for one that supports cl_khr_gl_sharing // Check all devices, search for one that supports cl_khr_gl_sharing
char extensions[8192]; char extensions[8192];
for (int i=0; i<(int)num_of_devices; i++) { for (int i=0; i<(int)num_of_devices; i++) {
error = clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, sizeof(extensions), extensions, NULL); error = clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, sizeof(extensions), extensions, NULL);
if (error) { if (error) {
print_error(error, "clGetDeviceInfo failed"); print_error(error, "clGetDeviceInfo failed");
return -1; return -1;
} }
if (strstr(extensions, "cl_khr_gl_sharing") == NULL) { if (strstr(extensions, "cl_khr_gl_sharing") == NULL) {
log_info("Device %d of %d does not support required extension cl_khr_gl_sharing.\n", i+1, num_of_devices); log_info("Device %d of %d does not support required extension cl_khr_gl_sharing.\n", i+1, num_of_devices);
} else { } else {
log_info("Device %d of %d supports required extension cl_khr_gl_sharing.\n", i+1, num_of_devices); log_info("Device %d of %d supports required extension cl_khr_gl_sharing.\n", i+1, num_of_devices);
m_devices[m_device_count++] = devices[i]; m_devices[m_device_count++] = devices[i];
} }
} }
return m_device_count > 0; return m_device_count > 0;
} }
virtual ~WGLEnvironment() virtual ~WGLEnvironment()
{ {
} }
}; };
GLEnvironment * GLEnvironment::Instance( void ) GLEnvironment * GLEnvironment::Instance( void )
{ {
static WGLEnvironment * env = NULL; static WGLEnvironment * env = NULL;
if( env == NULL ) if( env == NULL )
env = new WGLEnvironment(); env = new WGLEnvironment();
return env; return env;
} }
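
The device choice at the end of `CreateCLContext` above prefers the device currently driving the GL context when it is one of the devices recorded by `SupportsCLGLInterop`, and otherwise falls back to the first recorded device. A minimal sketch of that rule in isolation follows. `device_handle` is a placeholder for `cl_device_id`, and `pick_context_device` is a hypothetical helper, not a function in this commit.

```cpp
#include <stddef.h>

// Placeholder handle type (assumption; stands in for cl_device_id).
typedef void *device_handle;

// Hypothetical helper: return `current` when it is among the `count` entries
// of `supported`; otherwise return the first supported entry, or NULL when
// the list is empty.
static device_handle pick_context_device( const device_handle *supported,
                                          size_t count,
                                          device_handle current )
{
    if( count == 0 )
        return NULL;
    for( size_t i = 0; i < count; i++ )
    {
        if( supported[i] == current )
            return current;
    }
    return supported[0];
}
```

Unlike the code above, the sketch returns NULL for an empty list instead of reading the first array slot, so a caller can detect that no interop-capable device was recorded.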


@@ -1,122 +1,122 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#define GL_GLEXT_PROTOTYPES #define GL_GLEXT_PROTOTYPES
#include "setup.h" #include "setup.h"
#include "testBase.h" #include "testBase.h"
#include "../../test_common/harness/errorHelpers.h" #include "../../test_common/harness/errorHelpers.h"
#include <GL/gl.h> #include <GL/gl.h>
#include <GL/glut.h> #include <GL/glut.h>
#include <GL/glext.h> #include <GL/glext.h>
#include <GL/freeglut.h> #include <GL/freeglut.h>
#include <GL/glx.h> #include <GL/glx.h>
#include <CL/cl_ext.h> #include <CL/cl_ext.h>
class X11GLEnvironment : public GLEnvironment class X11GLEnvironment : public GLEnvironment
{ {
private: private:
cl_device_id m_devices[64]; cl_device_id m_devices[64];
cl_uint m_device_count; cl_uint m_device_count;
public: public:
X11GLEnvironment() X11GLEnvironment()
{ {
m_device_count = 0; m_device_count = 0;
} }
virtual int Init( int *argc, char **argv, int use_opengl_32 ) virtual int Init( int *argc, char **argv, int use_opengl_32 )
{ {
// Create a GLUT window to render into // Create a GLUT window to render into
glutInit( argc, argv ); glutInit( argc, argv );
glutInitWindowSize( 512, 512 ); glutInitWindowSize( 512, 512 );
glutInitDisplayMode( GLUT_RGB | GLUT_DOUBLE ); glutInitDisplayMode( GLUT_RGB | GLUT_DOUBLE );
glutCreateWindow( "OpenCL <-> OpenGL Test" ); glutCreateWindow( "OpenCL <-> OpenGL Test" );
glewInit(); glewInit();
return 0; return 0;
} }
virtual cl_context CreateCLContext( void ) virtual cl_context CreateCLContext( void )
{ {
GLXContext context = glXGetCurrentContext(); GLXContext context = glXGetCurrentContext();
Display *dpy = glXGetCurrentDisplay(); Display *dpy = glXGetCurrentDisplay();
cl_context_properties properties[] = { cl_context_properties properties[] = {
CL_GL_CONTEXT_KHR, (cl_context_properties) context, CL_GL_CONTEXT_KHR, (cl_context_properties) context,
CL_GLX_DISPLAY_KHR, (cl_context_properties) dpy, CL_GLX_DISPLAY_KHR, (cl_context_properties) dpy,
0 0
}; };
cl_int status; cl_int status;
if (!context || !dpy) { if (!context || !dpy) {
print_error(CL_INVALID_CONTEXT, "No GL context bound"); print_error(CL_INVALID_CONTEXT, "No GL context bound");
return 0; return 0;
} }
return clCreateContext(properties, 1, m_devices, NULL, NULL, &status); return clCreateContext(properties, 1, m_devices, NULL, NULL, &status);
} }
virtual int SupportsCLGLInterop( cl_device_type device_type ) virtual int SupportsCLGLInterop( cl_device_type device_type )
{ {
int found_valid_device = 0; int found_valid_device = 0;
cl_platform_id platform; cl_platform_id platform;
cl_device_id devices[64]; cl_device_id devices[64];
cl_uint num_of_devices; cl_uint num_of_devices;
int error; int error;
error = clGetPlatformIDs(1, &platform, NULL); error = clGetPlatformIDs(1, &platform, NULL);
if (error) { if (error) {
print_error(error, "clGetPlatformIDs failed"); print_error(error, "clGetPlatformIDs failed");
return -1; return -1;
} }
error = clGetDeviceIDs(platform, device_type, 64, devices, &num_of_devices); error = clGetDeviceIDs(platform, device_type, 64, devices, &num_of_devices);
// If this platform doesn't have any of the requested device_type (namely GPUs) then return 0 // If this platform doesn't have any of the requested device_type (namely GPUs) then return 0
if (error == CL_DEVICE_NOT_FOUND) if (error == CL_DEVICE_NOT_FOUND)
return 0; return 0;
if (error) { if (error) {
print_error(error, "clGetDeviceIDs failed"); print_error(error, "clGetDeviceIDs failed");
return -1; return -1;
} }
char extensions[8192]; char extensions[8192];
for (int i=0; i<(int)num_of_devices; i++) { for (int i=0; i<(int)num_of_devices; i++) {
error = clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, sizeof(extensions), extensions, NULL); error = clGetDeviceInfo(devices[i], CL_DEVICE_EXTENSIONS, sizeof(extensions), extensions, NULL);
if (error) { if (error) {
print_error(error, "clGetDeviceInfo failed"); print_error(error, "clGetDeviceInfo failed");
return -1; return -1;
} }
if (strstr(extensions, "cl_khr_gl_sharing ") == NULL) { if (strstr(extensions, "cl_khr_gl_sharing ") == NULL) {
log_info("Device %d of %d does not support required extension cl_khr_gl_sharing.\n", i+1, num_of_devices); log_info("Device %d of %d does not support required extension cl_khr_gl_sharing.\n", i+1, num_of_devices);
} else { } else {
log_info("Device %d of %d supports required extension cl_khr_gl_sharing.\n", i+1, num_of_devices); log_info("Device %d of %d supports required extension cl_khr_gl_sharing.\n", i+1, num_of_devices);
found_valid_device = 1; found_valid_device = 1;
m_devices[m_device_count++] = devices[i]; m_devices[m_device_count++] = devices[i];
} }
} }
return found_valid_device; return found_valid_device;
} }
virtual ~X11GLEnvironment() virtual ~X11GLEnvironment()
{ {
} }
}; };
GLEnvironment * GLEnvironment::Instance( void ) GLEnvironment * GLEnvironment::Instance( void )
{ {
static X11GLEnvironment * env = NULL; static X11GLEnvironment * env = NULL;
if( env == NULL ) if( env == NULL )
env = new X11GLEnvironment(); env = new X11GLEnvironment();
return env; return env;
} }
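
The three platform files test for the sharing extension with `strstr`. The X11 version searches for the name followed by a space, which would miss the extension when it is the last entry of a list that has no trailing space, while the other two would also match a longer name that merely starts with the same characters. A minimal sketch of a whole-token check follows. `has_extension` is a hypothetical helper, and it assumes the extension string is a space-separated list.

```cpp
#include <string.h>

// Hypothetical helper: true only if `name` appears in the space-separated
// list `extensions` as a complete token.
static bool has_extension( const char *extensions, const char *name )
{
    const size_t len = strlen( name );
    if( len == 0 )
        return false;

    for( const char *p = strstr( extensions, name ); p != NULL;
         p = strstr( p + 1, name ) )
    {
        // A match counts only if it is bounded by spaces or the string ends.
        const bool starts_token = ( p == extensions ) || ( p[-1] == ' ' );
        const bool ends_token = ( p[len] == ' ' ) || ( p[len] == '\0' );
        if( starts_token && ends_token )
            return true;
    }
    return false;
}
```

A call such as `has_extension( extensions, "cl_khr_gl_sharing" )` could then stand in for the `strstr` comparisons, with no trailing space needed in the name.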


@@ -1,18 +1,18 @@
project project
: requirements <include>. : requirements <include>.
<toolset>gcc:<cflags>"-xc++" <toolset>gcc:<cflags>"-xc++"
<toolset>msvc:<cflags>"/TP" <toolset>msvc:<cflags>"/TP"
<warnings-as-errors>off <warnings-as-errors>off
: usage-requirements <include>. : usage-requirements <include>.
; ;
local harness.objs ; local harness.objs ;
for source in [ glob *.c *.cpp ] for source in [ glob *.c *.cpp ]
{ {
harness.objs += [ obj $(source:B).obj : $(source) ] ; harness.objs += [ obj $(source:B).obj : $(source) ] ;
} }
alias harness : $(harness.objs) alias harness : $(harness.objs)
: <use>/Runtime//OpenCL.lib : : <use>/Runtime//OpenCL.lib :
: <library>/Runtime//OpenCL.lib : <library>/Runtime//OpenCL.lib
; ;


@@ -1,41 +1,41 @@
ifdef BUILD_WITH_ATF ifdef BUILD_WITH_ATF
ATF = -framework ATF ATF = -framework ATF
USE_ATF = -DUSE_ATF USE_ATF = -DUSE_ATF
endif endif
SRCS = conversions.c \ SRCS = conversions.c \
errorHelpers.c \ errorHelpers.c \
genericThread.cpp \ genericThread.cpp \
imageHelpers.cpp \ imageHelpers.cpp \
kernelHelpers.c \ kernelHelpers.c \
mt19937.c \ mt19937.c \
rounding_mode.c \ rounding_mode.c \
testHarness.c \ testHarness.c \
testHarness.cpp \ testHarness.cpp \
ThreadPool.c \ ThreadPool.c \
threadTesting.c \ threadTesting.c \
typeWrappers.cpp typeWrappers.cpp
DEFINES = DONT_TEST_GARBAGE_POINTERS DEFINES = DONT_TEST_GARBAGE_POINTERS
SOURCES = $(abspath $(SRCS)) SOURCES = $(abspath $(SRCS))
LIBPATH += -L/System/Library/Frameworks/OpenCL.framework/Libraries LIBPATH += -L/System/Library/Frameworks/OpenCL.framework/Libraries
LIBPATH += -L. LIBPATH += -L.
HEADERS = HEADERS =
INCLUDE = INCLUDE =
COMPILERFLAGS = -c -Wall -g -Wshorten-64-to-32 COMPILERFLAGS = -c -Wall -g -Wshorten-64-to-32
CC = c++ CC = c++
CFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE) CFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE)
CXXFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE) CXXFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE)
LIBRARIES = -framework OpenCL -framework OpenGL -framework GLUT -framework AppKit ${ATF} LIBRARIES = -framework OpenCL -framework OpenGL -framework GLUT -framework AppKit ${ATF}
OBJECTS := ${SOURCES:.c=.o} OBJECTS := ${SOURCES:.c=.o}
OBJECTS := ${OBJECTS:.cpp=.o} OBJECTS := ${OBJECTS:.cpp=.o}
all: $(OBJECTS) all: $(OBJECTS)
clean: clean:
rm -f $(OBJECTS) rm -f $(OBJECTS)
.DEFAULT: .DEFAULT:
@echo The target \"$@\" does not exist in Makefile. @echo The target \"$@\" does not exist in Makefile.

File diff suppressed because it is too large


@@ -1,76 +1,76 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef THREAD_POOL_H #ifndef THREAD_POOL_H
#define THREAD_POOL_H #define THREAD_POOL_H
#if defined( __APPLE__ ) #if defined( __APPLE__ )
#include <OpenCL/opencl.h> #include <OpenCL/opencl.h>
#else #else
#include <CL/cl.h> #include <CL/cl.h>
#endif #endif
#if defined(__cplusplus) #if defined(__cplusplus)
extern "C" { extern "C" {
#endif #endif
// //
// An atomic add operator // An atomic add operator
cl_int ThreadPool_AtomicAdd( volatile cl_int *a, cl_int b ); // returns old value cl_int ThreadPool_AtomicAdd( volatile cl_int *a, cl_int b ); // returns old value
// Your function prototype // Your function prototype
// //
// A function pointer to the function you want to execute in a multithreaded context. No // A function pointer to the function you want to execute in a multithreaded context. No
// synchronization primitives are provided, other than the atomic add above. You may not // synchronization primitives are provided, other than the atomic add above. You may not
// call ThreadPool_Do from your function. ThreadPool_AtomicAdd() and GetThreadCount() should // call ThreadPool_Do from your function. ThreadPool_AtomicAdd() and GetThreadCount() should
// work, however. // work, however.
// //
// job ids and thread ids are 0 based. If the number of jobs or threads is 8, they will be numbered 0 through 7. // job ids and thread ids are 0 based. If the number of jobs or threads is 8, they will be numbered 0 through 7.
// Note that while every job will be run, it is not guaranteed that every thread will wake up before // Note that while every job will be run, it is not guaranteed that every thread will wake up before
// the work is done. // the work is done.
typedef cl_int (*TPFuncPtr)( cl_uint /*job_id*/, cl_uint /* thread_id */, void *userInfo ); typedef cl_int (*TPFuncPtr)( cl_uint /*job_id*/, cl_uint /* thread_id */, void *userInfo );
// returns first non-zero result from func_ptr, or CL_SUCCESS if all are zero. // returns first non-zero result from func_ptr, or CL_SUCCESS if all are zero.
// Some workitems may not run if a non-zero result is returned from func_ptr(). // Some workitems may not run if a non-zero result is returned from func_ptr().
// This function may not be called from a TPFuncPtr. // This function may not be called from a TPFuncPtr.
cl_int ThreadPool_Do( TPFuncPtr func_ptr, cl_int ThreadPool_Do( TPFuncPtr func_ptr,
cl_uint count, cl_uint count,
void *userInfo ); void *userInfo );
// Returns the number of worker threads that underlie the threadpool. The value passed // Returns the number of worker threads that underlie the threadpool. The value passed
// as the TPFuncPtr's thread_id will be between 0 and this value less one, inclusive. // as the TPFuncPtr's thread_id will be between 0 and this value less one, inclusive.
// This is safe to call from a TPFuncPtr. // This is safe to call from a TPFuncPtr.
cl_uint GetThreadCount( void ); cl_uint GetThreadCount( void );
// SetThreadCount() may be used to artificially set the number of worker threads. // SetThreadCount() may be used to artificially set the number of worker threads.
// If the value is 0 (the default) the number of threads will be determined based on // If the value is 0 (the default) the number of threads will be determined based on
// the number of CPU cores. If it is a unicore machine, then 2 will be used, so // the number of CPU cores. If it is a unicore machine, then 2 will be used, so
// that we still get some testing for thread safety. // that we still get some testing for thread safety.
// //
// If count < 2 or the CL_TEST_SINGLE_THREADED environment variable is set then the // If count < 2 or the CL_TEST_SINGLE_THREADED environment variable is set then the
// code will run single threaded, but will report an error to indicate that the test // code will run single threaded, but will report an error to indicate that the test
// is invalid. This option is intended for debugging purposes only. It is suggested // is invalid. This option is intended for debugging purposes only. It is suggested
// as a convention that test apps set the thread count to 1 in response to the -m flag. // as a convention that test apps set the thread count to 1 in response to the -m flag.
// //
// SetThreadCount() must be called before the first call to GetThreadCount() or ThreadPool_Do(), // SetThreadCount() must be called before the first call to GetThreadCount() or ThreadPool_Do(),
// otherwise the behavior is undefined. It may not be called from a TPFuncPtr. // otherwise the behavior is undefined. It may not be called from a TPFuncPtr.
void SetThreadCount( int count ); void SetThreadCount( int count );
#ifdef __cplusplus #ifdef __cplusplus
} /* extern "C" */ } /* extern "C" */
#endif #endif
#endif /* THREAD_POOL_H */ #endif /* THREAD_POOL_H */
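
The header comments above describe the callback contract for `ThreadPool_Do()`: each job receives a 0-based job id and thread id, and the first non-zero result is returned to the caller. The following is a minimal sketch of that contract only, not the harness implementation. `serial_pool_do`, `ScaleInfo`, `scale_job` and `fail_on_job_2` are hypothetical names introduced for illustration, the `cl_int`/`cl_uint` typedefs are stand-ins so the sketch builds without the OpenCL headers, and the driver is a serial substitute for the real multithreaded pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the OpenCL scalar types (assumption: 32-bit, as in CL/cl.h). */
typedef int32_t  cl_int;
typedef uint32_t cl_uint;

/* Same shape as the TPFuncPtr typedef in ThreadPool.h. */
typedef cl_int (*TPFuncPtr)(cl_uint job_id, cl_uint thread_id, void *userInfo);

/* Hypothetical serial stand-in for ThreadPool_Do(): runs jobs 0..count-1 as
   "thread 0" and returns the first non-zero result, or 0 (CL_SUCCESS). */
static cl_int serial_pool_do(TPFuncPtr func_ptr, cl_uint count, void *userInfo)
{
    for (cl_uint job = 0; job < count; ++job) {
        cl_int err = func_ptr(job, 0, userInfo);
        if (err != 0)
            return err; /* later jobs do not run */
    }
    return 0;
}

/* Hypothetical per-call state handed to every job through userInfo. */
typedef struct {
    const float *in;
    float       *out;
    size_t       items_per_job;
} ScaleInfo;

/* Each job owns a disjoint slice selected by job_id, so no locking is needed. */
static cl_int scale_job(cl_uint job_id, cl_uint thread_id, void *userInfo)
{
    ScaleInfo *info = (ScaleInfo *)userInfo;
    assert(info != NULL);
    (void)thread_id; /* unused here; useful for per-thread scratch buffers */
    size_t first = (size_t)job_id * info->items_per_job;
    for (size_t i = 0; i < info->items_per_job; ++i)
        info->out[first + i] = 2.0f * info->in[first + i];
    return 0;
}

/* Counts how many jobs ran and reports an error from job 2. */
static cl_int fail_on_job_2(cl_uint job_id, cl_uint thread_id, void *userInfo)
{
    int *jobs_run = (int *)userInfo;
    (void)thread_id;
    ++*jobs_run;
    return job_id == 2 ? -5 : 0;
}
```

With the real pool the jobs run concurrently, so a job should only write to data selected by its own `job_id` (or use `ThreadPool_AtomicAdd()` for shared counters), and a callback must not call `ThreadPool_Do()` or `SetThreadCount()` itself.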


@@ -1,253 +1,253 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef test_conformance_clImageHelper_h #ifndef test_conformance_clImageHelper_h
#define test_conformance_clImageHelper_h #define test_conformance_clImageHelper_h
#ifdef __APPLE__ #ifdef __APPLE__
#include <OpenCL/opencl.h> #include <OpenCL/opencl.h>
#else #else
#include <CL/cl.h> #include <CL/cl.h>
#endif #endif
#include <stdio.h> #include <stdio.h>
#include "errorHelpers.h" #include "errorHelpers.h"
#ifdef __cplusplus #ifdef __cplusplus
extern "C" { extern "C" {
#endif #endif
// helper function to replace clCreateImage2D, to make the existing code use // helper function to replace clCreateImage2D, to make the existing code use
// the functions of version 1.2 and version 1.1 respectively // the functions of version 1.2 and version 1.1 respectively
inline cl_mem create_image_2d (cl_context context, inline cl_mem create_image_2d (cl_context context,
cl_mem_flags flags, cl_mem_flags flags,
const cl_image_format *image_format, const cl_image_format *image_format,
size_t image_width, size_t image_width,
size_t image_height, size_t image_height,
size_t image_row_pitch, size_t image_row_pitch,
void *host_ptr, void *host_ptr,
cl_int *errcode_ret) cl_int *errcode_ret)
{ {
cl_mem mImage = NULL; cl_mem mImage = NULL;
#ifdef CL_VERSION_1_2 #ifdef CL_VERSION_1_2
cl_image_desc image_desc_dest; cl_image_desc image_desc_dest;
image_desc_dest.image_type = CL_MEM_OBJECT_IMAGE2D; image_desc_dest.image_type = CL_MEM_OBJECT_IMAGE2D;
image_desc_dest.image_width = image_width; image_desc_dest.image_width = image_width;
image_desc_dest.image_height = image_height; image_desc_dest.image_height = image_height;
image_desc_dest.image_depth = 0;// not used for 2d image_desc_dest.image_depth = 0;// not used for 2d
image_desc_dest.image_array_size = 0;// not used for 2d image_desc_dest.image_array_size = 0;// not used for 2d
image_desc_dest.image_row_pitch = image_row_pitch; image_desc_dest.image_row_pitch = image_row_pitch;
image_desc_dest.image_slice_pitch = 0; image_desc_dest.image_slice_pitch = 0;
image_desc_dest.num_mip_levels = 0; image_desc_dest.num_mip_levels = 0;
image_desc_dest.num_samples = 0; image_desc_dest.num_samples = 0;
image_desc_dest.buffer = NULL;// no image type of CL_MEM_OBJECT_IMAGE1D_BUFFER in CL_VERSION_1_1, so always is NULL image_desc_dest.buffer = NULL;// no image type of CL_MEM_OBJECT_IMAGE1D_BUFFER in CL_VERSION_1_1, so always is NULL
mImage = clCreateImage( context, flags, image_format, &image_desc_dest, host_ptr, errcode_ret ); mImage = clCreateImage( context, flags, image_format, &image_desc_dest, host_ptr, errcode_ret );
if (errcode_ret && (*errcode_ret)) { if (errcode_ret && (*errcode_ret)) {
// Log an info message and rely on the calling function to produce an error // Log an info message and rely on the calling function to produce an error
// if necessary. // if necessary.
log_info("clCreateImage failed (%d)\n", *errcode_ret); log_info("clCreateImage failed (%d)\n", *errcode_ret);
} }
#else #else
mImage = clCreateImage2D( context, flags, image_format, image_width, image_height, image_row_pitch, host_ptr, errcode_ret ); mImage = clCreateImage2D( context, flags, image_format, image_width, image_height, image_row_pitch, host_ptr, errcode_ret );
if (errcode_ret && (*errcode_ret)) { if (errcode_ret && (*errcode_ret)) {
// Log an info message and rely on the calling function to produce an error // Log an info message and rely on the calling function to produce an error
// if necessary. // if necessary.
log_info("clCreateImage2D failed (%d)\n", *errcode_ret); log_info("clCreateImage2D failed (%d)\n", *errcode_ret);
} }
#endif #endif
return mImage; return mImage;
} }
inline cl_mem create_image_3d (cl_context context, inline cl_mem create_image_3d (cl_context context,
cl_mem_flags flags, cl_mem_flags flags,
const cl_image_format *image_format, const cl_image_format *image_format,
size_t image_width, size_t image_width,
size_t image_height, size_t image_height,
size_t image_depth, size_t image_depth,
size_t image_row_pitch, size_t image_row_pitch,
size_t image_slice_pitch, size_t image_slice_pitch,
void *host_ptr, void *host_ptr,
cl_int *errcode_ret) cl_int *errcode_ret)
{ {
cl_mem mImage; cl_mem mImage;
#ifdef CL_VERSION_1_2 #ifdef CL_VERSION_1_2
cl_image_desc image_desc; cl_image_desc image_desc;
image_desc.image_type = CL_MEM_OBJECT_IMAGE3D; image_desc.image_type = CL_MEM_OBJECT_IMAGE3D;
image_desc.image_width = image_width; image_desc.image_width = image_width;
image_desc.image_height = image_height; image_desc.image_height = image_height;
image_desc.image_depth = image_depth; image_desc.image_depth = image_depth;
image_desc.image_array_size = 0;// not used for one image image_desc.image_array_size = 0;// not used for one image
image_desc.image_row_pitch = image_row_pitch; image_desc.image_row_pitch = image_row_pitch;
image_desc.image_slice_pitch = image_slice_pitch; image_desc.image_slice_pitch = image_slice_pitch;
image_desc.num_mip_levels = 0; image_desc.num_mip_levels = 0;
image_desc.num_samples = 0; image_desc.num_samples = 0;
image_desc.buffer = NULL; // no image type of CL_MEM_OBJECT_IMAGE1D_BUFFER in CL_VERSION_1_1, so always is NULL image_desc.buffer = NULL; // no image type of CL_MEM_OBJECT_IMAGE1D_BUFFER in CL_VERSION_1_1, so always is NULL
mImage = clCreateImage( context, mImage = clCreateImage( context,
flags, flags,
image_format, image_format,
&image_desc, &image_desc,
host_ptr, host_ptr,
errcode_ret ); errcode_ret );
if (errcode_ret && (*errcode_ret)) { if (errcode_ret && (*errcode_ret)) {
// Log an info message and rely on the calling function to produce an error // Log an info message and rely on the calling function to produce an error
// if necessary. // if necessary.
log_info("clCreateImage failed (%d)\n", *errcode_ret); log_info("clCreateImage failed (%d)\n", *errcode_ret);
} }
#else #else
mImage = clCreateImage3D( context, mImage = clCreateImage3D( context,
flags, image_format, flags, image_format,
image_width, image_width,
image_height, image_height,
image_depth, image_depth,
image_row_pitch, image_row_pitch,
image_slice_pitch, image_slice_pitch,
host_ptr, host_ptr,
errcode_ret ); errcode_ret );
if (errcode_ret && (*errcode_ret)) { if (errcode_ret && (*errcode_ret)) {
// Log an info message and rely on the calling function to produce an error // Log an info message and rely on the calling function to produce an error
// if necessary. // if necessary.
log_info("clCreateImage3D failed (%d)\n", *errcode_ret); log_info("clCreateImage3D failed (%d)\n", *errcode_ret);
} }
#endif #endif
return mImage; return mImage;
} }
inline cl_mem create_image_2d_array (cl_context context, inline cl_mem create_image_2d_array (cl_context context,
cl_mem_flags flags, cl_mem_flags flags,
const cl_image_format *image_format, const cl_image_format *image_format,
size_t image_width, size_t image_width,
size_t image_height, size_t image_height,
size_t image_array_size, size_t image_array_size,
size_t image_row_pitch, size_t image_row_pitch,
size_t image_slice_pitch, size_t image_slice_pitch,
void *host_ptr, void *host_ptr,
cl_int *errcode_ret) cl_int *errcode_ret)
{ {
cl_mem mImage; cl_mem mImage;
cl_image_desc image_desc; cl_image_desc image_desc;
image_desc.image_type = CL_MEM_OBJECT_IMAGE2D_ARRAY; image_desc.image_type = CL_MEM_OBJECT_IMAGE2D_ARRAY;
image_desc.image_width = image_width; image_desc.image_width = image_width;
image_desc.image_height = image_height; image_desc.image_height = image_height;
image_desc.image_depth = 1; image_desc.image_depth = 1;
image_desc.image_array_size = image_array_size; image_desc.image_array_size = image_array_size;
image_desc.image_row_pitch = image_row_pitch; image_desc.image_row_pitch = image_row_pitch;
image_desc.image_slice_pitch = image_slice_pitch; image_desc.image_slice_pitch = image_slice_pitch;
image_desc.num_mip_levels = 0; image_desc.num_mip_levels = 0;
image_desc.num_samples = 0; image_desc.num_samples = 0;
image_desc.buffer = NULL; image_desc.buffer = NULL;
mImage = clCreateImage( context, mImage = clCreateImage( context,
flags, flags,
image_format, image_format,
&image_desc, &image_desc,
host_ptr, host_ptr,
errcode_ret ); errcode_ret );
if (errcode_ret && (*errcode_ret)) { if (errcode_ret && (*errcode_ret)) {
// Log an info message and rely on the calling function to produce an error // Log an info message and rely on the calling function to produce an error
// if necessary. // if necessary.
log_info("clCreateImage failed (%d)\n", *errcode_ret); log_info("clCreateImage failed (%d)\n", *errcode_ret);
} }
return mImage; return mImage;
} }
inline cl_mem create_image_1d_array (cl_context context, inline cl_mem create_image_1d_array (cl_context context,
cl_mem_flags flags, cl_mem_flags flags,
const cl_image_format *image_format, const cl_image_format *image_format,
size_t image_width, size_t image_width,
size_t image_array_size, size_t image_array_size,
size_t image_row_pitch, size_t image_row_pitch,
size_t image_slice_pitch, size_t image_slice_pitch,
void *host_ptr, void *host_ptr,
cl_int *errcode_ret) cl_int *errcode_ret)
{ {
cl_mem mImage; cl_mem mImage;
cl_image_desc image_desc; cl_image_desc image_desc;
image_desc.image_type = CL_MEM_OBJECT_IMAGE1D_ARRAY; image_desc.image_type = CL_MEM_OBJECT_IMAGE1D_ARRAY;
image_desc.image_width = image_width; image_desc.image_width = image_width;
image_desc.image_height = 1; image_desc.image_height = 1;
image_desc.image_depth = 1; image_desc.image_depth = 1;
image_desc.image_array_size = image_array_size; image_desc.image_array_size = image_array_size;
image_desc.image_row_pitch = image_row_pitch; image_desc.image_row_pitch = image_row_pitch;
image_desc.image_slice_pitch = image_slice_pitch; image_desc.image_slice_pitch = image_slice_pitch;
image_desc.num_mip_levels = 0; image_desc.num_mip_levels = 0;
image_desc.num_samples = 0; image_desc.num_samples = 0;
image_desc.buffer = NULL; image_desc.buffer = NULL;
mImage = clCreateImage( context, mImage = clCreateImage( context,
flags, flags,
image_format, image_format,
&image_desc, &image_desc,
host_ptr, host_ptr,
errcode_ret ); errcode_ret );
if (errcode_ret && (*errcode_ret)) { if (errcode_ret && (*errcode_ret)) {
// Log an info message and rely on the calling function to produce an error // Log an info message and rely on the calling function to produce an error
// if necessary. // if necessary.
log_info("clCreateImage failed (%d)\n", *errcode_ret); log_info("clCreateImage failed (%d)\n", *errcode_ret);
} }
return mImage; return mImage;
} }
inline cl_mem create_image_1d (cl_context context, inline cl_mem create_image_1d (cl_context context,
cl_mem_flags flags, cl_mem_flags flags,
const cl_image_format *image_format, const cl_image_format *image_format,
size_t image_width, size_t image_width,
size_t image_row_pitch, size_t image_row_pitch,
void *host_ptr, void *host_ptr,
cl_mem buffer, cl_mem buffer,
cl_int *errcode_ret) cl_int *errcode_ret)
{ {
cl_mem mImage; cl_mem mImage;
cl_image_desc image_desc; cl_image_desc image_desc;
image_desc.image_type = buffer ? CL_MEM_OBJECT_IMAGE1D_BUFFER: CL_MEM_OBJECT_IMAGE1D; image_desc.image_type = buffer ? CL_MEM_OBJECT_IMAGE1D_BUFFER: CL_MEM_OBJECT_IMAGE1D;
image_desc.image_width = image_width; image_desc.image_width = image_width;
image_desc.image_height = 1; image_desc.image_height = 1;
image_desc.image_depth = 1; image_desc.image_depth = 1;
image_desc.image_row_pitch = image_row_pitch; image_desc.image_row_pitch = image_row_pitch;
image_desc.image_slice_pitch = 0; image_desc.image_slice_pitch = 0;
image_desc.num_mip_levels = 0; image_desc.num_mip_levels = 0;
image_desc.num_samples = 0; image_desc.num_samples = 0;
image_desc.buffer = buffer; image_desc.buffer = buffer;
mImage = clCreateImage( context, mImage = clCreateImage( context,
flags, flags,
image_format, image_format,
&image_desc, &image_desc,
host_ptr, host_ptr,
errcode_ret ); errcode_ret );
if (errcode_ret && (*errcode_ret)) { if (errcode_ret && (*errcode_ret)) {
// Log an info message and rely on the calling function to produce an error // Log an info message and rely on the calling function to produce an error
// if necessary. // if necessary.
log_info("clCreateImage failed (%d)\n", *errcode_ret); log_info("clCreateImage failed (%d)\n", *errcode_ret);
} }
return mImage; return mImage;
} }
#ifdef __cplusplus #ifdef __cplusplus
} }
#endif #endif
#endif #endif


@@ -1,200 +1,393 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _COMPAT_H_ #ifndef _COMPAT_H_
#define _COMPAT_H_ #define _COMPAT_H_
#if defined(_WIN32) && defined (_MSC_VER) #if defined(_WIN32) && defined (_MSC_VER)
#include <Windows.h>
#include <Windows.h> #endif
#include <Winbase.h>
#include <CL/cl.h> #ifdef __cplusplus
#include <float.h> #define EXTERN_C extern "C"
#include <xmmintrin.h> #else
#define EXTERN_C
#define MAKE_HEX_FLOAT(x,y,z) ((float)ldexp( (float)(y), z)) #endif
#define MAKE_HEX_DOUBLE(x,y,z) ldexp( (double)(y), z)
#define MAKE_HEX_LONG(x,y,z) ((long double) ldexp( (long double)(y), z))
//
#define isfinite(x) _finite(x) // stdlib.h
//
#if !defined(__cplusplus)
typedef char bool; #include <stdlib.h> // On Windows, _MAX_PATH defined there.
#define inline
// llabs appeared in MS C v16 (VS 10/2010).
#else #if defined( _MSC_VER ) && _MSC_VER <= 1500
extern "C" { EXTERN_C inline long long llabs(long long __x) { return __x >= 0 ? __x : -__x; }
#endif #endif
typedef unsigned char uint8_t;
typedef char int8_t; //
typedef unsigned short uint16_t; // stdbool.h
typedef short int16_t; //
typedef unsigned int uint32_t;
typedef int int32_t; // stdbool.h appeared in MS C v18 (VS 12/2013).
typedef unsigned long long uint64_t; #if defined( _MSC_VER ) && _MSC_VER <= 1700
typedef long long int64_t; #if !defined(__cplusplus)
typedef char bool;
#define MAXPATHLEN MAX_PATH #define true 1
#define false 0
typedef unsigned short ushort; #endif
typedef unsigned int uint; #else
typedef unsigned long ulong; #include <stdbool.h>
#endif
#define INFINITY (FLT_MAX + FLT_MAX)
//#define NAN (INFINITY | 1)
//const static int PINFBITPATT_SP32 = INFINITY; //
// stdint.h
#ifndef M_PI //
#define M_PI 3.14159265358979323846264338327950288
#endif // stdint.h appeared in MS C v16 (VS 10/2010) and Intel C v12.
#if defined( _MSC_VER ) && ( ! defined( __INTEL_COMPILER ) && _MSC_VER <= 1500 || defined( __INTEL_COMPILER ) && __INTEL_COMPILER < 1200 )
typedef unsigned char uint8_t;
#define isnan( x ) ((x) != (x)) typedef char int8_t;
#define isinf( _x) ((_x) == INFINITY || (_x) == -INFINITY) typedef unsigned short uint16_t;
typedef short int16_t;
double rint( double x); typedef unsigned int uint32_t;
float rintf( float x); typedef int int32_t;
long double rintl( long double x); typedef unsigned long long uint64_t;
typedef long long int64_t;
float cbrtf( float ); #else
double cbrt( double ); #ifndef __STDC_LIMIT_MACROS
#define __STDC_LIMIT_MACROS
int ilogb( double x); #endif
int ilogbf (float x); #include <stdint.h>
int ilogbl(long double x); #endif
double fmax(double x, double y);
double fmin(double x, double y);
float fmaxf( float x, float y ); //
float fminf(float x, float y); // float.h
//
double log2(double x);
long double log2l(long double x); #include <float.h>
double exp2(double x);
long double exp2l(long double x);
//
double fdim(double x, double y); // fenv.h
float fdimf(float x, float y); //
long double fdiml(long double x, long double y);
// fenv.h appeared in MS C v18 (VS 12/2013).
double remquo( double x, double y, int *quo); #if defined( _MSC_VER ) && _MSC_VER <= 1700 && ! defined( __INTEL_COMPILER )
float remquof( float x, float y, int *quo); // reimplement fenv.h because windows doesn't have it
long double remquol( long double x, long double y, int *quo); #define FE_INEXACT 0x0020
#define FE_UNDERFLOW 0x0010
long double scalblnl(long double x, long n); #define FE_OVERFLOW 0x0008
#define FE_DIVBYZERO 0x0004
inline long long #define FE_INVALID 0x0001
llabs(long long __x) { return __x >= 0 ? __x : -__x; } #define FE_ALL_EXCEPT 0x003D
int fetestexcept(int excepts);
int feclearexcept(int excepts);
// end of math functions #else
#include <fenv.h>
uint64_t ReadTime( void ); #endif
double SubtractTime( uint64_t endTime, uint64_t startTime );
#define sleep(X) Sleep(1000*X) //
#define snprintf sprintf_s // math.h
//#define hypotl _hypot //
float make_nan(); #if defined( __INTEL_COMPILER )
float nanf( const char* str); #include <mathimf.h>
double nan( const char* str); #else
long double nanl( const char* str); #include <math.h>
#endif
//#if defined USE_BOOST
//#include <boost/math/tr1.hpp> #if defined( _MSC_VER )
//double hypot(double x, double y);
float hypotf(float x, float y); #ifdef __cplusplus
long double hypotl(long double x, long double y) ; extern "C" {
double lgamma(double x); #endif
float lgammaf(float x);
#ifndef M_PI
double trunc(double x); #define M_PI 3.14159265358979323846264338327950288
float truncf(float x); #endif
double log1p(double x); #if ! defined( __INTEL_COMPILER )
float log1pf(float x);
long double log1pl(long double x); #ifndef NAN
#define NAN (INFINITY - INFINITY)
double copysign(double x, double y); #endif
float copysignf(float x, float y); #ifndef HUGE_VALF
long double copysignl(long double x, long double y); #define HUGE_VALF (float)HUGE_VAL
#endif
long lround(double x); #ifndef INFINITY
long lroundf(float x); #define INFINITY (FLT_MAX + FLT_MAX)
//long lroundl(long double x) #endif
#ifndef isfinite
double round(double x); #define isfinite(x) _finite(x)
float roundf(float x); #endif
long double roundl(long double x); #ifndef isnan
#define isnan( x ) ((x) != (x))
int signbit(double x); #endif
int signbitf(float x); #ifndef isinf
#define isinf( _x) ((_x) == INFINITY || (_x) == -INFINITY)
//bool signbitl(long double x) { return boost::math::tr1::signbit<long double>(x); } #endif
//#endif // USE_BOOST
double rint( double x);
long int lrint (double flt); float rintf( float x);
long int lrintf (float flt); long double rintl( long double x);
float cbrtf( float );
float int2float (int32_t ix); double cbrt( double );
int32_t float2int (float fx);
int ilogb( double x);
/** Returns the number of leading 0-bits in x, int ilogbf (float x);
starting at the most significant bit position. int ilogbl(long double x);
If x is 0, the result is undefined.
*/ double fmax(double x, double y);
int __builtin_clz(unsigned int pattern); double fmin(double x, double y);
float fmaxf( float x, float y );
float fminf(float x, float y);
static const double zero= 0.00000000000000000000e+00;
#define NAN (INFINITY - INFINITY) double log2(double x);
#define HUGE_VALF (float)HUGE_VAL long double log2l(long double x);
int usleep(int usec); double exp2(double x);
long double exp2l(long double x);
// reimplement fenv.h because windows doesn't have it
#define FE_INEXACT 0x0020 double fdim(double x, double y);
#define FE_UNDERFLOW 0x0010 float fdimf(float x, float y);
#define FE_OVERFLOW 0x0008 long double fdiml(long double x, long double y);
#define FE_DIVBYZERO 0x0004
#define FE_INVALID 0x0001 double remquo( double x, double y, int *quo);
#define FE_ALL_EXCEPT 0x003D float remquof( float x, float y, int *quo);
long double remquol( long double x, long double y, int *quo);
int fetestexcept(int excepts);
int feclearexcept(int excepts); long double scalblnl(long double x, long n);
#ifdef __cplusplus float hypotf(float x, float y);
} long double hypotl(long double x, long double y) ;
#endif double lgamma(double x);
float lgammaf(float x);
#else // !((defined(_WIN32) && defined(_MSC_VER)
#if defined(__MINGW32__) double trunc(double x);
#include <windows.h> float truncf(float x);
#define sleep(X) Sleep(1000*X)
double log1p(double x);
#endif float log1pf(float x);
#define MAKE_HEX_FLOAT(x,y,z) x long double log1pl(long double x);
#define MAKE_HEX_DOUBLE(x,y,z) x
#define MAKE_HEX_LONG(x,y,z) x double copysign(double x, double y);
float copysignf(float x, float y);
#endif // !((defined(_WIN32) && defined(_MSC_VER) long double copysignl(long double x, long double y);
long lround(double x);
#endif // _COMPAT_H_ long lroundf(float x);
//long lroundl(long double x)
double round(double x);
float roundf(float x);
long double roundl(long double x);
int cf_signbit(double x);
int cf_signbitf(float x);
// Added in _MSC_VER == 1800 (Visual Studio 2013)
#if _MSC_VER < 1800
static int signbit(double x) { return cf_signbit(x); }
#endif
static int signbitf(float x) { return cf_signbitf(x); }
long int lrint (double flt);
long int lrintf (float flt);
float int2float (int32_t ix);
int32_t float2int (float fx);
#endif
#if ! defined( __INTEL_COMPILER ) || __INTEL_COMPILER < 1300
// These functions appeared in Intel C v13.
float nanf( const char* str);
double nan( const char* str);
long double nanl( const char* str);
#endif
#ifdef __cplusplus
}
#endif
#endif
#if defined( __ANDROID__ )
#define log2(X) (log(X)/log(2))
#endif
//
// stdio.h
//
#if defined(_MSC_VER)
// snprintf added in _MSC_VER == 1900 (Visual Studio 2015)
#if _MSC_VER < 1900
#define snprintf sprintf_s
#endif
#endif
//
// unistd.h
//
#if defined( _MSC_VER )
EXTERN_C unsigned int sleep( unsigned int sec );
EXTERN_C int usleep( int usec );
#endif
//
// syscall.h
//
#if defined( __ANDROID__ )
// Android's bionic does not provide SYS_sysctl wrappers.
#define SYS__sysctl __NR__sysctl
#elif defined( __aarch64__ )
// Enable deprecated syscalls on arm 64-bit.
#define __ARCH_WANT_SYSCALL_DEPRECATED
// And use the NR variant of syscall too.
#define SYS__sysctl __NR__sysctl
#endif
// Some tests use _malloca, which is defined in malloc.h.
#if !defined (__APPLE__)
#include <malloc.h>
#endif
//
// ???
//
#if defined( _MSC_VER )
#define MAXPATHLEN _MAX_PATH
EXTERN_C uint64_t ReadTime( void );
EXTERN_C double SubtractTime( uint64_t endTime, uint64_t startTime );
/** Returns the number of leading 0-bits in x,
starting at the most significant bit position.
If x is 0, the result is undefined.
*/
EXTERN_C int __builtin_clz(unsigned int pattern);
#endif
#ifndef MIN
#define MIN(x,y) (((x)<(y))?(x):(y))
#endif
#ifndef MAX
#define MAX(x,y) (((x)>(y))?(x):(y))
#endif
/*
------------------------------------------------------------------------------------------------
WARNING: DO NOT USE THESE MACROS: MAKE_HEX_FLOAT, MAKE_HEX_DOUBLE, MAKE_HEX_LONG.
This is a typical usage of the macros:
double yhi = MAKE_HEX_DOUBLE(0x1.5555555555555p-2,0x15555555555555LL,-2);
(taken from math_brute_force/reference_math.c). There are two problems:
1. There is an error here. On Windows it will produce the incorrect result
`0x1.5555555555555p+50'. To have a correct result it should be written as
`MAKE_HEX_DOUBLE(0x1.5555555555555p-2,0x15555555555555LL,-54)'. A proper value of the
third argument is not obvious -- sometimes it should be the same as exponent of the
first argument, but sometimes not.
2. Information is duplicated. It is easy to make a mistake.
Use HEX_FLT, HEX_DBL, HEX_LDBL macros instead (see them in the bottom of the file).
------------------------------------------------------------------------------------------------
*/
#if defined ( _MSC_VER ) && ! defined( __INTEL_COMPILER )
#define MAKE_HEX_FLOAT(x,y,z) ((float)ldexp( (float)(y), z))
#define MAKE_HEX_DOUBLE(x,y,z) ldexp( (double)(y), z)
#define MAKE_HEX_LONG(x,y,z) ((long double) ldexp( (long double)(y), z))
#else
// Do not use these macros in new code, use HEX_FLT, HEX_DBL, HEX_LDBL instead.
#define MAKE_HEX_FLOAT(x,y,z) x
#define MAKE_HEX_DOUBLE(x,y,z) x
#define MAKE_HEX_LONG(x,y,z) x
#endif
/*
------------------------------------------------------------------------------------------------
HEX_FLT, HEX_DBL, HEX_LDBL -- Create hex floating point literal of type float, double, long
double respectively. Arguments:
sm -- sign of number,
int -- integer part of mantissa (without `0x' prefix),
fract -- fractional part of mantissa (without decimal point and `L' or `LL' suffixes),
se -- sign of exponent,
exp -- absolute value of (binary) exponent.
Example:
double yhi = HEX_DBL( +, 1, 5555555555555, -, 2 ); // == 0x1.5555555555555p-2
Note:
We have to pass signs as separate arguments because gcc passes negative integer values
(e. g. `-2') into a macro as two separate tokens, so `HEX_FLT( 1, 0, -2 )' produces result
`0x1.0p- 2' (note a space between minus and two) which is not a correct floating point
literal.
------------------------------------------------------------------------------------------------
*/
#if defined ( _MSC_VER ) && ! defined( __INTEL_COMPILER )
// If compiler does not support hex floating point literals:
#define HEX_FLT( sm, int, fract, se, exp ) sm ldexpf( (float)( 0x ## int ## fract ## UL ), se exp + ilogbf( (float) 0x ## int ) - ilogbf( ( float )( 0x ## int ## fract ## UL ) ) )
#define HEX_DBL( sm, int, fract, se, exp ) sm ldexp( (double)( 0x ## int ## fract ## ULL ), se exp + ilogb( (double) 0x ## int ) - ilogb( ( double )( 0x ## int ## fract ## ULL ) ) )
#define HEX_LDBL( sm, int, fract, se, exp ) sm ldexpl( (long double)( 0x ## int ## fract ## ULL ), se exp + ilogbl( (long double) 0x ## int ) - ilogbl( ( long double )( 0x ## int ## fract ## ULL ) ) )
#else
// If compiler supports hex floating point literals: just concatenate all the parts into a literal.
#define HEX_FLT( sm, int, fract, se, exp ) sm 0x ## int ## . ## fract ## p ## se ## exp ## F
#define HEX_DBL( sm, int, fract, se, exp ) sm 0x ## int ## . ## fract ## p ## se ## exp
#define HEX_LDBL( sm, int, fract, se, exp ) sm 0x ## int ## . ## fract ## p ## se ## exp ## L
#endif
#if defined(__MINGW32__)
#include <Windows.h>
#define sleep(sec) Sleep((sec) * 1000)
#endif
#endif // _COMPAT_H_
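
Note on the hex-float macros above: the following is a minimal sketch, assuming a C99 compiler that accepts hex floating point literals. It places the token-pasting form of HEX_DBL next to the ldexp()-based fallback and reproduces the MAKE_HEX_DOUBLE exponent pitfall described in the warning comment. All SKETCH_* macros and sketch_* functions are hypothetical names used only for illustration; they are not part of compat.h.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical copy of the literal-pasting variant: yields 0x<ip>.<fract>p<se><ex>. */
#define SKETCH_HEX_DBL(sm, ip, fract, se, ex) \
    sm 0x ## ip ## . ## fract ## p ## se ## ex

/* Hypothetical copy of the fallback: scale the integer mantissa with ldexp(),
   correcting the exponent by the number of fractional mantissa bits. */
#define SKETCH_HEX_DBL_LDEXP(sm, ip, fract, se, ex)             \
    sm ldexp((double)(0x ## ip ## fract ## ULL),                \
             se ex + ilogb((double)0x ## ip)                    \
                 - ilogb((double)(0x ## ip ## fract ## ULL)))

/* Value built by pasting tokens into a hex floating point literal. */
double sketch_hex_literal(void)
{
    return SKETCH_HEX_DBL(+, 1, 5555555555555, -, 2);
}

/* Same value built without relying on hex literal support. */
double sketch_hex_fallback(void)
{
    return SKETCH_HEX_DBL_LDEXP(+, 1, 5555555555555, -, 2);
}

/* What MAKE_HEX_DOUBLE(x, y, z) evaluates on its ldexp() path: y * 2^z. */
double sketch_make_hex_double(long long y, int z)
{
    return ldexp((double)y, z);
}

/* Returns 1 when both forms agree and the legacy pitfall is reproduced. */
int sketch_hex_selfcheck(void)
{
    /* The fallback computes the exponent -2 + 0 - 52 = -54 on its own. */
    assert(sketch_hex_fallback() == sketch_hex_literal());
    /* The legacy macro needs -54 spelled out by hand ... */
    assert(sketch_make_hex_double(0x15555555555555LL, -54) == sketch_hex_literal());
    /* ... and silently yields 0x1.5555555555555p+50 when given -2. */
    assert(sketch_make_hex_double(0x15555555555555LL, -2) == 0x1.5555555555555p+50);
    return 1;
}
```

Calling sketch_hex_selfcheck() from a test driver exercises all three checks. The point of the comparison is that the fallback derives the exponent correction from the mantissa itself, so the caller writes the exponent only once.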

File diff suppressed because it is too large


@@ -1,127 +1,126 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _conversions_h #ifndef _conversions_h
#define _conversions_h #define _conversions_h
#include "errorHelpers.h" #include "compat.h"
#include "mt19937.h"
#include <stdio.h> #include "errorHelpers.h"
#include <stdlib.h> #include "mt19937.h"
#include <math.h> #include <stdio.h>
#include <float.h> #include <stdlib.h>
#include <string.h> #include <string.h>
#include <sys/types.h> #include <sys/types.h>
#include "compat.h"
#if defined(__cplusplus)
#if defined(__cplusplus) extern "C" {
extern "C" { #endif
#endif
/* Note: the next three all have to match in size and order!! */
/* Note: the next three all have to match in size and order!! */
enum ExplicitTypes
enum ExplicitTypes {
{ kBool = 0,
kBool = 0, kChar,
kChar, kUChar,
kUChar, kUnsignedChar,
kUnsignedChar, kShort,
kShort, kUShort,
kUShort, kUnsignedShort,
kUnsignedShort, kInt,
kInt, kUInt,
kUInt, kUnsignedInt,
kUnsignedInt, kLong,
kLong, kULong,
kULong, kUnsignedLong,
kUnsignedLong, kFloat,
kFloat, kHalf,
kHalf, kDouble,
kDouble, kNumExplicitTypes
kNumExplicitTypes };
};
typedef enum ExplicitTypes ExplicitType;
typedef enum ExplicitTypes ExplicitType;
enum RoundingTypes
enum RoundingTypes {
{ kRoundToEven = 0,
kRoundToEven = 0, kRoundToZero,
kRoundToZero, kRoundToPosInf,
kRoundToPosInf, kRoundToNegInf,
kRoundToNegInf, kRoundToNearest,
kRoundToNearest,
kNumRoundingTypes,
kNumRoundingTypes,
kDefaultRoundingType = kRoundToNearest
kDefaultRoundingType = kRoundToNearest };
};
typedef enum RoundingTypes RoundingType;
typedef enum RoundingTypes RoundingType;
extern void print_type_to_string(ExplicitType type, void *data, char* string);
extern void print_type_to_string(ExplicitType type, void *data, char* string); extern size_t get_explicit_type_size( ExplicitType type );
extern size_t get_explicit_type_size( ExplicitType type ); extern const char * get_explicit_type_name( ExplicitType type );
extern const char * get_explicit_type_name( ExplicitType type ); extern void convert_explicit_value( void *inRaw, void *outRaw, ExplicitType inType, bool saturate, RoundingType roundType, ExplicitType outType );
extern void convert_explicit_value( void *inRaw, void *outRaw, ExplicitType inType, bool saturate, RoundingType roundType, ExplicitType outType );
extern void generate_random_data( ExplicitType type, size_t count, MTdata d, void *outData );
extern void generate_random_data( ExplicitType type, size_t count, MTdata d, void *outData ); extern void * create_random_data( ExplicitType type, MTdata d, size_t count );
extern void * create_random_data( ExplicitType type, MTdata d, size_t count );
extern cl_long read_upscale_signed( void *inRaw, ExplicitType inType );
extern cl_long read_upscale_signed( void *inRaw, ExplicitType inType ); extern cl_ulong read_upscale_unsigned( void *inRaw, ExplicitType inType );
extern cl_ulong read_upscale_unsigned( void *inRaw, ExplicitType inType ); extern float read_as_float( void *inRaw, ExplicitType inType );
extern float read_as_float( void *inRaw, ExplicitType inType );
extern float get_random_float(float low, float high, MTdata d);
extern float get_random_float(float low, float high, MTdata d); extern double get_random_double(double low, double high, MTdata d);
extern double get_random_double(double low, double high, MTdata d); extern float any_float( MTdata d );
extern float any_float( MTdata d ); extern double any_double( MTdata d );
extern double any_double( MTdata d );
extern int random_in_range( int minV, int maxV, MTdata d );
extern int random_in_range( int minV, int maxV, MTdata d );
size_t get_random_size_t(size_t low, size_t high, MTdata d);
size_t get_random_size_t(size_t low, size_t high, MTdata d);
// Note: though this takes a double, this is for use with single precision tests
// Note: though this takes a double, this is for use with single precision tests static inline int IsFloatSubnormal( float x )
static inline int IsFloatSubnormal( float x ) {
{ #if 2 == FLT_RADIX
#if 2 == FLT_RADIX // Do this in integer to avoid problems with FTZ behavior
// Do this in integer to avoid problems with FTZ behavior union{ float d; uint32_t u;}u;
union{ float d; uint32_t u;}u; u.d = fabsf(x);
u.d = fabsf(x); return (u.u-1) < 0x007fffffU;
return (u.u-1) < 0x007fffffU; #else
#else // rely on floating point hardware for non-radix2 non-IEEE-754 hardware -- will fail if you flush subnormals to zero
// rely on floating point hardware for non-radix2 non-IEEE-754 hardware -- will fail if you flush subnormals to zero return fabs(x) < (double) FLT_MIN && x != 0.0;
return fabs(x) < (double) FLT_MIN && x != 0.0; #endif
#endif }
}
static inline int IsDoubleSubnormal( double x )
static inline int IsDoubleSubnormal( double x ) {
{ #if 2 == FLT_RADIX
#if 2 == FLT_RADIX // Do this in integer to avoid problems with FTZ behavior
// Do this in integer to avoid problems with FTZ behavior union{ double d; uint64_t u;}u;
union{ double d; uint64_t u;}u; u.d = fabs( x);
u.d = fabs( x); return (u.u-1) < 0x000fffffffffffffULL;
return (u.u-1) < 0x000fffffffffffffULL; #else
#else // rely on floating point hardware for non-radix2 non-IEEE-754 hardware -- will fail if you flush subnormals to zero
// rely on floating point hardware for non-radix2 non-IEEE-754 hardware -- will fail if you flush subnormals to zero return fabs(x) < (double) DBL_MIN && x != 0.0;
return fabs(x) < (double) DBL_MIN && x != 0.0; #endif
#endif }
}
#if defined(__cplusplus)
#if defined(__cplusplus) }
} #endif
#endif
#endif // _conversions_h
#endif // _conversions_h
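
Note on IsFloatSubnormal() above: a minimal sketch of the same integer-only test, assuming IEEE-754 binary32 floats. It uses memcpy() and a sign mask where the header uses a union and fabsf(); that substitution is a choice made for this sketch, not the header's method. The sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for IsFloatSubnormal(): inspects the bit pattern only,
   so the answer does not depend on the FPU's flush-to-zero setting. */
int sketch_is_float_subnormal(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret the float as an integer */
    bits &= 0x7fffffffU;              /* clear the sign bit */
    /* Subnormals occupy 0x00000001..0x007fffff. Subtracting 1 wraps zero
       around to 0xffffffff, so one unsigned compare also rejects zero. */
    return (bits - 1U) < 0x007fffffU;
}

/* Builds a float from raw bits so test inputs never pass through FP arithmetic. */
float sketch_float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Returns 1 after checking the boundaries of the subnormal range. */
int sketch_subnormal_selfcheck(void)
{
    assert(sketch_is_float_subnormal(sketch_float_from_bits(0x00000001U)));  /* smallest subnormal */
    assert(sketch_is_float_subnormal(sketch_float_from_bits(0x007fffffU)));  /* largest subnormal */
    assert(!sketch_is_float_subnormal(sketch_float_from_bits(0x00800000U))); /* smallest normal */
    assert(!sketch_is_float_subnormal(sketch_float_from_bits(0x00000000U))); /* +0 */
    return 1;
}
```

The unsigned wrap-around is what lets a single comparison cover both "not zero" and "below the smallest normal".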

File diff suppressed because it is too large


@@ -1,149 +1,149 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _errorHelpers_h #ifndef _errorHelpers_h
#define _errorHelpers_h #define _errorHelpers_h
#ifdef __APPLE__ #ifdef __APPLE__
#include <OpenCL/opencl.h> #include <OpenCL/opencl.h>
#else #else
#include <CL/opencl.h> #include <CL/opencl.h>
#endif #endif
#include <stdlib.h> #include <stdlib.h>
#ifdef __cplusplus #ifdef __cplusplus
extern "C" { extern "C" {
#endif #endif
#define LOWER_IS_BETTER 0 #define LOWER_IS_BETTER 0
#define HIGHER_IS_BETTER 1 #define HIGHER_IS_BETTER 1
// If USE_ATF is defined, all log_error and log_info calls can be routed to test library // If USE_ATF is defined, all log_error and log_info calls can be routed to test library
// functions as described below. This is helpful for integration into an automated testing // functions as described below. This is helpful for integration into an automated testing
// system. // system.
#if USE_ATF #if USE_ATF
// export BUILD_WITH_ATF=1 // export BUILD_WITH_ATF=1
#include <ATF/ATF.h> #include <ATF/ATF.h>
#define test_start() ATFTestStart() #define test_start() ATFTestStart()
#define log_info ATFLogInfo #define log_info ATFLogInfo
#define log_error ATFLogError #define log_error ATFLogError
#define log_perf(_number, _higherBetter, _numType, _format, ...) ATFLogPerformanceNumber(_number, _higherBetter, _numType, _format, ##__VA_ARGS__) #define log_perf(_number, _higherBetter, _numType, _format, ...) ATFLogPerformanceNumber(_number, _higherBetter, _numType, _format, ##__VA_ARGS__)
#define test_finish() ATFTestFinish() #define test_finish() ATFTestFinish()
#define vlog_perf(_number, _higherBetter, _numType, _format, ...) ATFLogPerformanceNumber(_number, _higherBetter, _numType, _format,##__VA_ARGS__) #define vlog_perf(_number, _higherBetter, _numType, _format, ...) ATFLogPerformanceNumber(_number, _higherBetter, _numType, _format,##__VA_ARGS__)
#define vlog ATFLogInfo #define vlog ATFLogInfo
#define vlog_error ATFLogError #define vlog_error ATFLogError
#else #else
#define test_start() #define test_start()
#define log_info printf #define log_info printf
#define log_error printf #define log_error printf
#define log_perf(_number, _higherBetter, _numType, _format, ...) printf("Performance Number " _format " (in %s, %s): %g\n",##__VA_ARGS__, _numType, \ #define log_perf(_number, _higherBetter, _numType, _format, ...) printf("Performance Number " _format " (in %s, %s): %g\n",##__VA_ARGS__, _numType, \
_higherBetter?"higher is better":"lower is better", _number ) _higherBetter?"higher is better":"lower is better", _number )
#define test_finish() #define test_finish()
#define vlog_perf(_number, _higherBetter, _numType, _format, ...) printf("Performance Number " _format " (in %s, %s): %g\n",##__VA_ARGS__, _numType, \ #define vlog_perf(_number, _higherBetter, _numType, _format, ...) printf("Performance Number " _format " (in %s, %s): %g\n",##__VA_ARGS__, _numType, \
_higherBetter?"higher is better":"lower is better" , _number) _higherBetter?"higher is better":"lower is better" , _number)
#ifdef _WIN32 #ifdef _WIN32
#ifdef __MINGW32__ #ifdef __MINGW32__
// Use __mingw_printf since it supports "%a" format specifier // Use __mingw_printf since it supports "%a" format specifier
#define vlog __mingw_printf #define vlog __mingw_printf
#define vlog_error __mingw_printf #define vlog_error __mingw_printf
#else #else
// Use home-baked function that treats "%a" as "%f" // Use home-baked function that treats "%a" as "%f"
static int vlog_win32(const char *format, ...); static int vlog_win32(const char *format, ...);
#define vlog vlog_win32 #define vlog vlog_win32
#define vlog_error vlog_win32 #define vlog_error vlog_win32
#endif #endif
#else #else
#define vlog_error printf #define vlog_error printf
#define vlog printf #define vlog printf
#endif #endif
#endif #endif
#define ct_assert(b) ct_assert_i(b, __LINE__) #define ct_assert(b) ct_assert_i(b, __LINE__)
#define ct_assert_i(b, line) ct_assert_ii(b, line) #define ct_assert_i(b, line) ct_assert_ii(b, line)
#define ct_assert_ii(b, line) int _compile_time_assertion_on_line_##line[b ? 1 : -1]; #define ct_assert_ii(b, line) int _compile_time_assertion_on_line_##line[b ? 1 : -1];
#define test_error(errCode,msg) test_error_ret(errCode,msg,errCode) #define test_error(errCode,msg) test_error_ret(errCode,msg,errCode)
#define test_error_ret(errCode,msg,retValue) { if( errCode != CL_SUCCESS ) { print_error( errCode, msg ); return retValue ; } } #define test_error_ret(errCode,msg,retValue) { if( errCode != CL_SUCCESS ) { print_error( errCode, msg ); return retValue ; } }
#define print_error(errCode,msg) log_error( "ERROR: %s! (%s from %s:%d)\n", msg, IGetErrorString( errCode ), __FILE__, __LINE__ ); #define print_error(errCode,msg) log_error( "ERROR: %s! (%s from %s:%d)\n", msg, IGetErrorString( errCode ), __FILE__, __LINE__ );
// expected error code vs. what we got // expected error code vs. what we got
#define test_failure_error(errCode, expectedErrCode, msg) test_failure_error_ret(errCode, expectedErrCode, msg, errCode != expectedErrCode) #define test_failure_error(errCode, expectedErrCode, msg) test_failure_error_ret(errCode, expectedErrCode, msg, errCode != expectedErrCode)
#define test_failure_error_ret(errCode, expectedErrCode, msg, retValue) { if( errCode != expectedErrCode ) { print_failure_error( errCode, expectedErrCode, msg ); return retValue ; } } #define test_failure_error_ret(errCode, expectedErrCode, msg, retValue) { if( errCode != expectedErrCode ) { print_failure_error( errCode, expectedErrCode, msg ); return retValue ; } }
#define print_failure_error(errCode, expectedErrCode, msg) log_error( "ERROR: %s! (Got %s, expected %s from %s:%d)\n", msg, IGetErrorString( errCode ), IGetErrorString( expectedErrCode ), __FILE__, __LINE__ ); #define print_failure_error(errCode, expectedErrCode, msg) log_error( "ERROR: %s! (Got %s, expected %s from %s:%d)\n", msg, IGetErrorString( errCode ), IGetErrorString( expectedErrCode ), __FILE__, __LINE__ );
#define test_failure_warning(errCode, expectedErrCode, msg) test_failure_warning_ret(errCode, expectedErrCode, msg, errCode != expectedErrCode) #define test_failure_warning(errCode, expectedErrCode, msg) test_failure_warning_ret(errCode, expectedErrCode, msg, errCode != expectedErrCode)
#define test_failure_warning_ret(errCode, expectedErrCode, msg, retValue) { if( errCode != expectedErrCode ) { print_failure_warning( errCode, expectedErrCode, msg ); warnings++ ; } } #define test_failure_warning_ret(errCode, expectedErrCode, msg, retValue) { if( errCode != expectedErrCode ) { print_failure_warning( errCode, expectedErrCode, msg ); warnings++ ; } }
#define print_failure_warning(errCode, expectedErrCode, msg) log_error( "WARNING: %s! (Got %s, expected %s from %s:%d)\n", msg, IGetErrorString( errCode ), IGetErrorString( expectedErrCode ), __FILE__, __LINE__ ); #define print_failure_warning(errCode, expectedErrCode, msg) log_error( "WARNING: %s! (Got %s, expected %s from %s:%d)\n", msg, IGetErrorString( errCode ), IGetErrorString( expectedErrCode ), __FILE__, __LINE__ );
extern const char *IGetErrorString( int clErrorCode ); extern const char *IGetErrorString( int clErrorCode );
extern float Ulp_Error_Half( cl_ushort test, float reference ); extern float Ulp_Error_Half( cl_ushort test, float reference );
extern float Ulp_Error( float test, double reference ); extern float Ulp_Error( float test, double reference );
extern float Ulp_Error_Double( double test, long double reference ); extern float Ulp_Error_Double( double test, long double reference );
extern const char *GetChannelTypeName( cl_channel_type type ); extern const char *GetChannelTypeName( cl_channel_type type );
extern int IsChannelTypeSupported( cl_channel_type type ); extern int IsChannelTypeSupported( cl_channel_type type );
extern const char *GetChannelOrderName( cl_channel_order order ); extern const char *GetChannelOrderName( cl_channel_order order );
extern int IsChannelOrderSupported( cl_channel_order order ); extern int IsChannelOrderSupported( cl_channel_order order );
extern const char *GetAddressModeName( cl_addressing_mode mode ); extern const char *GetAddressModeName( cl_addressing_mode mode );
extern const char *GetDeviceTypeName( cl_device_type type ); extern const char *GetDeviceTypeName( cl_device_type type );
// NON-REENTRANT UNLESS YOU PROVIDE A BUFFER PTR (pass null to use static storage, but it's not reentrant then!) // NON-REENTRANT UNLESS YOU PROVIDE A BUFFER PTR (pass null to use static storage, but it's not reentrant then!)
extern const char *GetDataVectorString( void *dataBuffer, size_t typeSize, size_t vecSize, char *buffer ); extern const char *GetDataVectorString( void *dataBuffer, size_t typeSize, size_t vecSize, char *buffer );
#if defined (_WIN32) && !defined(__MINGW32__) #if defined (_WIN32) && !defined(__MINGW32__)
#include <stdarg.h> #include <stdarg.h>
#include <stdio.h> #include <stdio.h>
#include <string.h> #include <string.h>
static int vlog_win32(const char *format, ...) static int vlog_win32(const char *format, ...)
{ {
const char *new_format = format; const char *new_format = format;
if (strstr(format, "%a")) { if (strstr(format, "%a")) {
char *temp; char *temp;
if ((temp = strdup(format)) == NULL) { if ((temp = strdup(format)) == NULL) {
printf("vlog_win32: Failed to allocate memory for strdup\n"); printf("vlog_win32: Failed to allocate memory for strdup\n");
return -1; return -1;
} }
new_format = temp; new_format = temp;
while (*temp) { while (*temp) {
// replace %a with %f // replace %a with %f
if ((*temp == '%') && (*(temp+1) == 'a')) { if ((*temp == '%') && (*(temp+1) == 'a')) {
*(temp+1) = 'f'; *(temp+1) = 'f';
} }
temp++; temp++;
} }
} }
va_list args; va_list args;
va_start(args, format); va_start(args, format);
vprintf(new_format, args); vprintf(new_format, args);
va_end(args); va_end(args);
if (new_format != format) { if (new_format != format) {
free((void*)new_format); free((void*)new_format);
} }
return 0; return 0;
} }
#endif #endif
#ifdef __cplusplus #ifdef __cplusplus
} }
#endif #endif
#endif // _errorHelpers_h #endif // _errorHelpers_h
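
Note on vlog_win32() above: a minimal sketch of its format rewriting step in isolation, so the "%a" to "%f" substitution can be checked without printing anything. The sketch_* helpers are hypothetical; like the loop in the header, they match only a bare "%a" and leave specifiers with flags or a width (for example "%5.2a") untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of format with every "%a" turned
   into "%f", or NULL if allocation fails. The caller frees the result. */
char *sketch_rewrite_hex_format(const char *format)
{
    size_t len = strlen(format);
    char *copy = (char *)malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, format, len + 1);
    for (char *p = copy; *p != '\0'; ++p) {
        /* p[1] is at worst the terminating NUL, so this never reads past the end. */
        if (p[0] == '%' && p[1] == 'a')
            p[1] = 'f';
    }
    return copy;
}

/* Returns 1 if rewriting format produces expected. */
int sketch_rewrite_matches(const char *format, const char *expected)
{
    char *out = sketch_rewrite_hex_format(format);
    int same = (out != NULL) && (strcmp(out, expected) == 0);
    free(out);
    return same;
}

/* Returns 1 after checking a rewritten and an untouched format string. */
int sketch_rewrite_selfcheck(void)
{
    assert(sketch_rewrite_matches("x = %a, y = %a\n", "x = %f, y = %f\n"));
    assert(sketch_rewrite_matches("%d items", "%d items"));
    return 1;
}
```

Returning the rewritten copy, instead of printing it directly, keeps the substitution testable on platforms whose printf already supports "%a".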


@@ -1,89 +1,104 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _fpcontrol_h #ifndef _fpcontrol_h
#define _fpcontrol_h #define _fpcontrol_h
// In order to get tests for correctly rounded operations (e.g. multiply) to work properly we need to be able to set the reference hardware // In order to get tests for correctly rounded operations (e.g. multiply) to work properly we need to be able to set the reference hardware
// to FTZ mode if the device hardware is running in that mode. We have explored all other options short of writing correctly rounded operations // to FTZ mode if the device hardware is running in that mode. We have explored all other options short of writing correctly rounded operations
// in integer code, and have found this is the only way to correctly verify operation. // in integer code, and have found this is the only way to correctly verify operation.
// //
// Non-Apple implementations will need to provide their own implementation for these features. If the reference hardware and device are both // Non-Apple implementations will need to provide their own implementation for these features. If the reference hardware and device are both
// running in the same state (either FTZ or IEEE compliant modes) then these functions may be empty. If the device is running in non-default // running in the same state (either FTZ or IEEE compliant modes) then these functions may be empty. If the device is running in non-default
// rounding mode (e.g. round toward zero), then these functions should also set the reference device into that rounding mode. // rounding mode (e.g. round toward zero), then these functions should also set the reference device into that rounding mode.
#if defined( __APPLE__ ) || defined( _MSC_VER ) || defined( __linux__ ) || defined (__MINGW32__) #if defined( __APPLE__ ) || defined( _MSC_VER ) || defined( __linux__ ) || defined (__MINGW32__)
typedef int FPU_mode_type; typedef int FPU_mode_type;
#if defined( __i386__ ) || defined( __x86_64__ ) #if defined( __i386__ ) || defined( __x86_64__ ) || defined( _MSC_VER ) || defined( __MINGW32__ )
#include <xmmintrin.h> #include <xmmintrin.h>
#elif defined( __PPC__ ) #elif defined( __PPC__ )
#include <fpu_control.h> #include <fpu_control.h>
extern __thread fpu_control_t fpu_control; extern __thread fpu_control_t fpu_control;
#endif #endif
// Set the reference hardware floating point unit to FTZ mode // Set the reference hardware floating point unit to FTZ mode
static inline void ForceFTZ( FPU_mode_type *mode ) static inline void ForceFTZ( FPU_mode_type *mode )
{ {
#if defined( __i386__ ) || defined( __x86_64__ ) || defined( _MSC_VER ) || defined (__MINGW32__) #if defined( __i386__ ) || defined( __x86_64__ ) || defined( _MSC_VER ) || defined (__MINGW32__)
*mode = _mm_getcsr(); *mode = _mm_getcsr();
_mm_setcsr( *mode | 0x8040); _mm_setcsr( *mode | 0x8040);
#elif defined( __PPC__ ) #elif defined( __PPC__ )
*mode = fpu_control; *mode = fpu_control;
fpu_control |= _FPU_MASK_NI; fpu_control |= _FPU_MASK_NI;
#elif defined ( __arm__ ) #elif defined ( __arm__ )
unsigned fpscr; unsigned fpscr;
__asm__ volatile ("fmrx %0, fpscr" : "=r"(fpscr)); __asm__ volatile ("fmrx %0, fpscr" : "=r"(fpscr));
*mode = fpscr; *mode = fpscr;
__asm__ volatile ("fmxr fpscr, %0" :: "r"(fpscr | (1U << 24))); __asm__ volatile ("fmxr fpscr, %0" :: "r"(fpscr | (1U << 24)));
#else // Add 64 bit support
#error ForceFTZ needs an implentation #elif defined (__aarch64__)
#endif unsigned fpcr;
} __asm__ volatile ("mrs %0, fpcr" : "=r"(fpcr));
*mode = fpcr;
// Disable the denorm flush to zero __asm__ volatile ("msr fpcr, %0" :: "r"(fpcr | (1U << 24)));
static inline void DisableFTZ( FPU_mode_type *mode ) #else
{ #error ForceFTZ needs an implentation
#if defined( __i386__ ) || defined( __x86_64__ ) || defined( _MSC_VER ) || defined (__MINGW32__) #endif
*mode = _mm_getcsr(); }
_mm_setcsr( *mode & ~0x8040);
#elif defined( __PPC__ ) // Disable the denorm flush to zero
*mode = fpu_control; static inline void DisableFTZ( FPU_mode_type *mode )
fpu_control &= ~_FPU_MASK_NI; {
#elif defined ( __arm__ ) #if defined( __i386__ ) || defined( __x86_64__ ) || defined( _MSC_VER ) || defined (__MINGW32__)
unsigned fpscr; *mode = _mm_getcsr();
__asm__ volatile ("fmrx %0, fpscr" : "=r"(fpscr)); _mm_setcsr( *mode & ~0x8040);
*mode = fpscr; #elif defined( __PPC__ )
__asm__ volatile ("fmxr fpscr, %0" :: "r"(fpscr & ~(1U << 24))); *mode = fpu_control;
#else fpu_control &= ~_FPU_MASK_NI;
#error DisableFTZ needs an implentation #elif defined ( __arm__ )
#endif unsigned fpscr;
} __asm__ volatile ("fmrx %0, fpscr" : "=r"(fpscr));
*mode = fpscr;
// Restore the reference hardware to floating point state indicated by *mode __asm__ volatile ("fmxr fpscr, %0" :: "r"(fpscr & ~(1U << 24)));
static inline void RestoreFPState( FPU_mode_type *mode ) // Add 64 bit support
{ #elif defined (__aarch64__)
#if defined( __i386__ ) || defined( __x86_64__ ) || defined( _MSC_VER ) || defined (__MINGW32__) unsigned fpcr;
_mm_setcsr( *mode ); __asm__ volatile ("mrs %0, fpcr" : "=r"(fpcr));
#elif defined( __PPC__) *mode = fpcr;
fpu_control = *mode; __asm__ volatile ("msr fpcr, %0" :: "r"(fpcr & ~(1U << 24)));
#elif defined (__arm__) #else
__asm__ volatile ("fmxr fpscr, %0" :: "r"(*mode)); #error DisableFTZ needs an implentation
#else #endif
#error RestoreFPState needs an implementation }
#endif
} // Restore the reference hardware to floating point state indicated by *mode
#else static inline void RestoreFPState( FPU_mode_type *mode )
#error ForceFTZ and RestoreFPState need implentations {
#endif #if defined( __i386__ ) || defined( __x86_64__ ) || defined( _MSC_VER ) || defined (__MINGW32__)
_mm_setcsr( *mode );
#endif #elif defined( __PPC__)
fpu_control = *mode;
#elif defined (__arm__)
__asm__ volatile ("fmxr fpscr, %0" :: "r"(*mode));
// Add 64 bit support
#elif defined (__aarch64__)
__asm__ volatile ("msr fpcr, %0" :: "r"(*mode));
#else
#error RestoreFPState needs an implementation
#endif
}
#else
#error ForceFTZ and RestoreFPState need implentations
#endif
#endif
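
Note on ForceFTZ() and RestoreFPState() above: a minimal sketch of the save, force, restore sequence on the x86 path, where the constant 0x8040 combines the MXCSR flush-to-zero bit (0x8000) with the denormals-are-zero bit (0x0040). On other targets this sketch falls back to a no-op, whereas the header uses FPSCR/FPCR bit 24 on ARM. The SKETCH_* macros and sketch_* functions are hypothetical names, not part of fpcontrol.h.

```c
#include <assert.h>

#if defined(__x86_64__) || defined(_M_X64) || defined(__SSE__)
#include <xmmintrin.h>
#define SKETCH_HAVE_MXCSR 1
#else
#define SKETCH_HAVE_MXCSR 0
#endif

/* MXCSR bit 15 is flush-to-zero (FTZ), bit 6 is denormals-are-zero (DAZ). */
#define SKETCH_MXCSR_FTZ 0x8000U
#define SKETCH_MXCSR_DAZ 0x0040U

/* Reads the control register; returns 0 where this sketch has no backend. */
unsigned sketch_read_fp_mode(void)
{
#if SKETCH_HAVE_MXCSR
    return _mm_getcsr();
#else
    return 0U;
#endif
}

/* Writes the control register; a no-op where this sketch has no backend. */
void sketch_write_fp_mode(unsigned mode)
{
#if SKETCH_HAVE_MXCSR
    _mm_setcsr(mode);
#else
    (void)mode;
#endif
}

/* Saves the current mode in *saved, then turns on FTZ and DAZ. */
void sketch_force_ftz(unsigned *saved)
{
    *saved = sketch_read_fp_mode();
    sketch_write_fp_mode(*saved | SKETCH_MXCSR_FTZ | SKETCH_MXCSR_DAZ);
}

/* Returns 1 if save / force / restore leaves the register as it was found. */
int sketch_ftz_roundtrip_ok(void)
{
    unsigned before = sketch_read_fp_mode();
    unsigned saved;
    sketch_force_ftz(&saved);
#if SKETCH_HAVE_MXCSR
    assert((sketch_read_fp_mode() & 0x8040U) == 0x8040U);
#endif
    sketch_write_fp_mode(saved);   /* the step RestoreFPState() performs */
    return saved == before && sketch_read_fp_mode() == before;
}
```

A reference computation would run between sketch_force_ftz() and the restoring write, mirroring how the tests bracket host-side math so that it flushes subnormals the same way the device does.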


@@ -1,53 +1,53 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "genericThread.h" #include "genericThread.h"
#if defined(_WIN32) #if defined(_WIN32)
#include <windows.h> #include <windows.h>
#else // !_WIN32 #else // !_WIN32
#include <pthread.h> #include <pthread.h>
#endif #endif
void * genericThread::IStaticReflector( void * data ) void * genericThread::IStaticReflector( void * data )
{ {
genericThread *t = (genericThread *)data; genericThread *t = (genericThread *)data;
return t->IRun(); return t->IRun();
} }
bool genericThread::Start( void ) bool genericThread::Start( void )
{ {
#if defined(_WIN32) #if defined(_WIN32)
mHandle = CreateThread( NULL, 0, (LPTHREAD_START_ROUTINE) IStaticReflector, this, 0, NULL ); mHandle = CreateThread( NULL, 0, (LPTHREAD_START_ROUTINE) IStaticReflector, this, 0, NULL );
return ( mHandle != NULL ); return ( mHandle != NULL );
#else // !_WIN32 #else // !_WIN32
int error = pthread_create( (pthread_t*)&mHandle, NULL, IStaticReflector, (void *)this ); int error = pthread_create( (pthread_t*)&mHandle, NULL, IStaticReflector, (void *)this );
return ( error == 0 ); return ( error == 0 );
#endif // !_WIN32 #endif // !_WIN32
} }
void * genericThread::Join( void ) void * genericThread::Join( void )
{ {
#if defined(_WIN32) #if defined(_WIN32)
WaitForSingleObject( (HANDLE)mHandle, INFINITE ); WaitForSingleObject( (HANDLE)mHandle, INFINITE );
return NULL; return NULL;
#else // !_WIN32 #else // !_WIN32
void * retVal; void * retVal;
int error = pthread_join( (pthread_t)mHandle, &retVal ); int error = pthread_join( (pthread_t)mHandle, &retVal );
if( error != 0 ) if( error != 0 )
retVal = NULL; retVal = NULL;
return retVal; return retVal;
#endif // !_WIN32 #endif // !_WIN32
} }

@@ -1,42 +1,42 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _genericThread_h #ifndef _genericThread_h
#define _genericThread_h #define _genericThread_h
#include <stdio.h> #include <stdio.h>
class genericThread class genericThread
{ {
public: public:
virtual ~genericThread() {} virtual ~genericThread() {}
bool Start( void ); bool Start( void );
void * Join( void ); void * Join( void );
protected: protected:
virtual void * IRun( void ) = 0; virtual void * IRun( void ) = 0;
private: private:
void* mHandle; void* mHandle;
static void * IStaticReflector( void * data ); static void * IStaticReflector( void * data );
}; };
#endif // _genericThread_h #endif // _genericThread_h

@@ -1,249 +1,249 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "imageHelpers.h" #include "imageHelpers.h"
size_t get_format_type_size( const cl_image_format *format ) size_t get_format_type_size( const cl_image_format *format )
{ {
return get_channel_data_type_size( format->image_channel_data_type ); return get_channel_data_type_size( format->image_channel_data_type );
} }
size_t get_channel_data_type_size( cl_channel_type channelType ) size_t get_channel_data_type_size( cl_channel_type channelType )
{ {
switch( channelType ) switch( channelType )
{ {
case CL_SNORM_INT8: case CL_SNORM_INT8:
case CL_UNORM_INT8: case CL_UNORM_INT8:
case CL_SIGNED_INT8: case CL_SIGNED_INT8:
case CL_UNSIGNED_INT8: case CL_UNSIGNED_INT8:
return 1; return 1;
case CL_SNORM_INT16: case CL_SNORM_INT16:
case CL_UNORM_INT16: case CL_UNORM_INT16:
case CL_SIGNED_INT16: case CL_SIGNED_INT16:
case CL_UNSIGNED_INT16: case CL_UNSIGNED_INT16:
case CL_HALF_FLOAT: case CL_HALF_FLOAT:
#ifdef CL_SFIXED14_APPLE #ifdef CL_SFIXED14_APPLE
case CL_SFIXED14_APPLE: case CL_SFIXED14_APPLE:
#endif #endif
return sizeof( cl_short ); return sizeof( cl_short );
case CL_SIGNED_INT32: case CL_SIGNED_INT32:
case CL_UNSIGNED_INT32: case CL_UNSIGNED_INT32:
return sizeof( cl_int ); return sizeof( cl_int );
case CL_UNORM_SHORT_565: case CL_UNORM_SHORT_565:
case CL_UNORM_SHORT_555: case CL_UNORM_SHORT_555:
#ifdef OBSOLETE_FORAMT #ifdef OBSOLETE_FORAMT
case CL_UNORM_SHORT_565_REV: case CL_UNORM_SHORT_565_REV:
case CL_UNORM_SHORT_555_REV: case CL_UNORM_SHORT_555_REV:
#endif #endif
return 2; return 2;
#ifdef OBSOLETE_FORAMT #ifdef OBSOLETE_FORAMT
case CL_UNORM_INT_8888: case CL_UNORM_INT_8888:
case CL_UNORM_INT_8888_REV: case CL_UNORM_INT_8888_REV:
return 4; return 4;
#endif #endif
case CL_UNORM_INT_101010: case CL_UNORM_INT_101010:
#ifdef OBSOLETE_FORAMT #ifdef OBSOLETE_FORAMT
case CL_UNORM_INT_101010_REV: case CL_UNORM_INT_101010_REV:
#endif #endif
return 4; return 4;
case CL_FLOAT: case CL_FLOAT:
return sizeof( cl_float ); return sizeof( cl_float );
default: default:
return 0; return 0;
} }
} }
size_t get_format_channel_count( const cl_image_format *format ) size_t get_format_channel_count( const cl_image_format *format )
{ {
return get_channel_order_channel_count( format->image_channel_order ); return get_channel_order_channel_count( format->image_channel_order );
} }
size_t get_channel_order_channel_count( cl_channel_order order ) size_t get_channel_order_channel_count( cl_channel_order order )
{ {
switch( order ) switch( order )
{ {
case CL_R: case CL_R:
case CL_A: case CL_A:
case CL_Rx: case CL_Rx:
case CL_INTENSITY: case CL_INTENSITY:
case CL_LUMINANCE: case CL_LUMINANCE:
return 1; return 1;
case CL_RG: case CL_RG:
case CL_RA: case CL_RA:
case CL_RGx: case CL_RGx:
return 2; return 2;
case CL_RGB: case CL_RGB:
case CL_RGBx: case CL_RGBx:
return 3; return 3;
case CL_RGBA: case CL_RGBA:
case CL_ARGB: case CL_ARGB:
case CL_BGRA: case CL_BGRA:
#ifdef CL_1RGB_APPLE #ifdef CL_1RGB_APPLE
case CL_1RGB_APPLE: case CL_1RGB_APPLE:
#endif #endif
#ifdef CL_BGR1_APPLE #ifdef CL_BGR1_APPLE
case CL_BGR1_APPLE: case CL_BGR1_APPLE:
#endif #endif
return 4; return 4;
default: default:
return 0; return 0;
} }
} }
int is_format_signed( const cl_image_format *format ) int is_format_signed( const cl_image_format *format )
{ {
switch( format->image_channel_data_type ) switch( format->image_channel_data_type )
{ {
case CL_SNORM_INT8: case CL_SNORM_INT8:
case CL_SIGNED_INT8: case CL_SIGNED_INT8:
case CL_SNORM_INT16: case CL_SNORM_INT16:
case CL_SIGNED_INT16: case CL_SIGNED_INT16:
case CL_SIGNED_INT32: case CL_SIGNED_INT32:
case CL_HALF_FLOAT: case CL_HALF_FLOAT:
case CL_FLOAT: case CL_FLOAT:
#ifdef CL_SFIXED14_APPLE #ifdef CL_SFIXED14_APPLE
case CL_SFIXED14_APPLE: case CL_SFIXED14_APPLE:
#endif #endif
return 1; return 1;
default: default:
return 0; return 0;
} }
} }
size_t get_pixel_size( cl_image_format *format ) size_t get_pixel_size( cl_image_format *format )
{ {
switch( format->image_channel_data_type ) switch( format->image_channel_data_type )
{ {
case CL_SNORM_INT8: case CL_SNORM_INT8:
case CL_UNORM_INT8: case CL_UNORM_INT8:
case CL_SIGNED_INT8: case CL_SIGNED_INT8:
case CL_UNSIGNED_INT8: case CL_UNSIGNED_INT8:
return get_format_channel_count( format ); return get_format_channel_count( format );
case CL_SNORM_INT16: case CL_SNORM_INT16:
case CL_UNORM_INT16: case CL_UNORM_INT16:
case CL_SIGNED_INT16: case CL_SIGNED_INT16:
case CL_UNSIGNED_INT16: case CL_UNSIGNED_INT16:
case CL_HALF_FLOAT: case CL_HALF_FLOAT:
#ifdef CL_SFIXED14_APPLE #ifdef CL_SFIXED14_APPLE
case CL_SFIXED14_APPLE: case CL_SFIXED14_APPLE:
#endif #endif
return get_format_channel_count( format ) * sizeof( cl_ushort ); return get_format_channel_count( format ) * sizeof( cl_ushort );
case CL_SIGNED_INT32: case CL_SIGNED_INT32:
case CL_UNSIGNED_INT32: case CL_UNSIGNED_INT32:
return get_format_channel_count( format ) * sizeof( cl_int ); return get_format_channel_count( format ) * sizeof( cl_int );
case CL_UNORM_SHORT_565: case CL_UNORM_SHORT_565:
case CL_UNORM_SHORT_555: case CL_UNORM_SHORT_555:
#ifdef OBSOLETE_FORAMT #ifdef OBSOLETE_FORAMT
case CL_UNORM_SHORT_565_REV: case CL_UNORM_SHORT_565_REV:
case CL_UNORM_SHORT_555_REV: case CL_UNORM_SHORT_555_REV:
#endif #endif
return 2; return 2;
#ifdef OBSOLETE_FORAMT #ifdef OBSOLETE_FORAMT
case CL_UNORM_INT_8888: case CL_UNORM_INT_8888:
case CL_UNORM_INT_8888_REV: case CL_UNORM_INT_8888_REV:
return 4; return 4;
#endif #endif
case CL_UNORM_INT_101010: case CL_UNORM_INT_101010:
#ifdef OBSOLETE_FORAMT #ifdef OBSOLETE_FORAMT
case CL_UNORM_INT_101010_REV: case CL_UNORM_INT_101010_REV:
#endif #endif
return 4; return 4;
case CL_FLOAT: case CL_FLOAT:
return get_format_channel_count( format ) * sizeof( cl_float ); return get_format_channel_count( format ) * sizeof( cl_float );
default: default:
return 0; return 0;
} }
} }
int get_8_bit_image_format( cl_context context, cl_mem_object_type objType, cl_mem_flags flags, size_t channelCount, cl_image_format *outFormat ) int get_8_bit_image_format( cl_context context, cl_mem_object_type objType, cl_mem_flags flags, size_t channelCount, cl_image_format *outFormat )
{ {
cl_image_format formatList[ 128 ]; cl_image_format formatList[ 128 ];
unsigned int outFormatCount, i; unsigned int outFormatCount, i;
int error; int error;
/* Query the image formats supported for this context */ /* Query the image formats supported for this context */
if ((error = clGetSupportedImageFormats( context, flags, objType, 128, formatList, &outFormatCount ))) if ((error = clGetSupportedImageFormats( context, flags, objType, 128, formatList, &outFormatCount )))
return error; return error;
/* Look for one that is an 8-bit format */ /* Look for one that is an 8-bit format */
for( i = 0; i < outFormatCount; i++ ) for( i = 0; i < outFormatCount; i++ )
{ {
if( formatList[ i ].image_channel_data_type == CL_SNORM_INT8 || if( formatList[ i ].image_channel_data_type == CL_SNORM_INT8 ||
formatList[ i ].image_channel_data_type == CL_UNORM_INT8 || formatList[ i ].image_channel_data_type == CL_UNORM_INT8 ||
formatList[ i ].image_channel_data_type == CL_SIGNED_INT8 || formatList[ i ].image_channel_data_type == CL_SIGNED_INT8 ||
formatList[ i ].image_channel_data_type == CL_UNSIGNED_INT8 ) formatList[ i ].image_channel_data_type == CL_UNSIGNED_INT8 )
{ {
if ( !channelCount || ( get_format_channel_count( &formatList[ i ] ) == channelCount ) ) if ( !channelCount || ( get_format_channel_count( &formatList[ i ] ) == channelCount ) )
{ {
*outFormat = formatList[ i ]; *outFormat = formatList[ i ];
return 0; return 0;
} }
} }
} }
return -1; return -1;
} }
int get_32_bit_image_format( cl_context context, cl_mem_object_type objType, cl_mem_flags flags, size_t channelCount, cl_image_format *outFormat ) int get_32_bit_image_format( cl_context context, cl_mem_object_type objType, cl_mem_flags flags, size_t channelCount, cl_image_format *outFormat )
{ {
cl_image_format formatList[ 128 ]; cl_image_format formatList[ 128 ];
unsigned int outFormatCount, i; unsigned int outFormatCount, i;
int error; int error;
/* Query the image formats supported for this context */ /* Query the image formats supported for this context */
if ((error = clGetSupportedImageFormats( context, flags, objType, 128, formatList, &outFormatCount ))) if ((error = clGetSupportedImageFormats( context, flags, objType, 128, formatList, &outFormatCount )))
return error; return error;
/* Look for one that is a 32-bit format */ /* Look for one that is a 32-bit format */
for( i = 0; i < outFormatCount; i++ ) for( i = 0; i < outFormatCount; i++ )
{ {
if( formatList[ i ].image_channel_data_type == CL_UNORM_INT_101010 || if( formatList[ i ].image_channel_data_type == CL_UNORM_INT_101010 ||
formatList[ i ].image_channel_data_type == CL_FLOAT || formatList[ i ].image_channel_data_type == CL_FLOAT ||
formatList[ i ].image_channel_data_type == CL_SIGNED_INT32 || formatList[ i ].image_channel_data_type == CL_SIGNED_INT32 ||
formatList[ i ].image_channel_data_type == CL_UNSIGNED_INT32 ) formatList[ i ].image_channel_data_type == CL_UNSIGNED_INT32 )
{ {
if ( !channelCount || ( get_format_channel_count( &formatList[ i ] ) == channelCount ) ) if ( !channelCount || ( get_format_channel_count( &formatList[ i ] ) == channelCount ) )
{ {
*outFormat = formatList[ i ]; *outFormat = formatList[ i ];
return 0; return 0;
} }
} }
} }
return -1; return -1;
} }
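
The size helpers above follow one rule: packed channel types have a fixed pixel size, and every other type costs channel count times bytes per channel. A minimal sketch of that rule follows. The `Order` and `Type` enums and both function names are hypothetical stand-ins, not the OpenCL constants or the helpers in this file.

```cpp
#include <cstddef>

// Hypothetical stand-ins for a few cl_channel_order / cl_channel_type values.
enum class Order { R, RG, RGB, RGBA };
enum class Type { UnormInt8, SignedInt16, Float32, UnormShort565 };

// Number of channels implied by a channel order.
std::size_t channel_count(Order order)
{
    switch (order) {
        case Order::R:    return 1;
        case Order::RG:   return 2;
        case Order::RGB:  return 3;
        case Order::RGBA: return 4;
    }
    return 0;
}

// Bytes per pixel. Packed types store the whole pixel in a fixed-size word;
// all other types cost (channel count) * (bytes per channel).
std::size_t pixel_size(Order order, Type type)
{
    switch (type) {
        case Type::UnormInt8:     return channel_count(order) * 1;
        case Type::SignedInt16:   return channel_count(order) * 2;
        case Type::Float32:       return channel_count(order) * 4;
        case Type::UnormShort565: return 2; // packed into one 16-bit word
    }
    return 0;
}
```

In this sketch an RGBA image of 8-bit channels costs 4 bytes per pixel, while a 565 packed pixel stays at 2 bytes regardless of the channel order.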

@@ -1,37 +1,37 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _imageHelpers_h #ifndef _imageHelpers_h
#define _imageHelpers_h #define _imageHelpers_h
#include "errorHelpers.h" #include "errorHelpers.h"
extern size_t get_format_type_size( const cl_image_format *format ); extern size_t get_format_type_size( const cl_image_format *format );
extern size_t get_channel_data_type_size( cl_channel_type channelType ); extern size_t get_channel_data_type_size( cl_channel_type channelType );
extern size_t get_format_channel_count( const cl_image_format *format ); extern size_t get_format_channel_count( const cl_image_format *format );
extern size_t get_channel_order_channel_count( cl_channel_order order ); extern size_t get_channel_order_channel_count( cl_channel_order order );
extern int is_format_signed( const cl_image_format *format ); extern int is_format_signed( const cl_image_format *format );
extern size_t get_pixel_size( cl_image_format *format ); extern size_t get_pixel_size( cl_image_format *format );
/* Helper to get any image format as long as it is 8-bits-per-channel */ /* Helper to get any image format as long as it is 8-bits-per-channel */
extern int get_8_bit_image_format( cl_context context, cl_mem_object_type objType, cl_mem_flags flags, size_t channelCount, cl_image_format *outFormat ); extern int get_8_bit_image_format( cl_context context, cl_mem_object_type objType, cl_mem_flags flags, size_t channelCount, cl_image_format *outFormat );
/* Helper to get any image format as long as it is 32-bits-per-channel */ /* Helper to get any image format as long as it is 32-bits-per-channel */
extern int get_32_bit_image_format( cl_context context, cl_mem_object_type objType, cl_mem_flags flags, size_t channelCount, cl_image_format *outFormat ); extern int get_32_bit_image_format( cl_context context, cl_mem_object_type objType, cl_mem_flags flags, size_t channelCount, cl_image_format *outFormat );
#endif // _imageHelpers_h #endif // _imageHelpers_h
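
Both format helpers declared above appear to treat a `channelCount` of 0 as "any channel count" and return the first supported format that matches. The sketch below illustrates that search. It assumes a hypothetical `Format` struct and `find_format` function rather than `cl_image_format` and the real helpers, and it returns an index instead of filling an output parameter.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cl_image_format.
struct Format {
    std::size_t channels;        // channels per pixel
    unsigned    bitsPerChannel;  // storage width of one channel
};

// Returns the index of the first format with the requested bit depth,
// or -1 if none matches. channelCount == 0 acts as a wildcard.
int find_format(const std::vector<Format> &formats,
                unsigned bitsPerChannel,
                std::size_t channelCount)
{
    for (std::size_t i = 0; i < formats.size(); i++) {
        if (formats[i].bitsPerChannel != bitsPerChannel)
            continue;
        if (channelCount == 0 || formats[i].channels == channelCount)
            return static_cast<int>(i);
    }
    return -1;
}
```

Passing 0 for `channelCount` makes the second check succeed for every format, so the condition needs only the single `channelCount == 0 ||` guard.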

File diff suppressed because it is too large

@@ -1,131 +1,131 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _kernelHelpers_h #ifndef _kernelHelpers_h
#define _kernelHelpers_h #define _kernelHelpers_h
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
#if defined (__MINGW32__) #if defined (__MINGW32__)
#include <malloc.h> #include <malloc.h>
#endif #endif
#if !defined(_WIN32) #if !defined(_WIN32)
#include <stdbool.h> #include <stdbool.h>
#endif #endif
#include <math.h> #include <math.h>
#include <string.h> #include <string.h>
#ifdef __APPLE__ #ifdef __APPLE__
#include <OpenCL/opencl.h> #include <OpenCL/opencl.h>
#else #else
#include <CL/opencl.h> #include <CL/opencl.h>
#endif #endif
#ifdef __cplusplus #ifdef __cplusplus
extern "C" { extern "C" {
#endif // __cplusplus #endif // __cplusplus
/* /*
* The below code is intended to be used at the top of kernels that appear inline in files to set line and file info for the kernel: * The below code is intended to be used at the top of kernels that appear inline in files to set line and file info for the kernel:
* *
* const char *source = { * const char *source = {
* INIT_OPENCL_DEBUG_INFO * INIT_OPENCL_DEBUG_INFO
* "__kernel void foo( int x )\n" * "__kernel void foo( int x )\n"
* "{\n" * "{\n"
* " ...\n" * " ...\n"
* "}\n" * "}\n"
* }; * };
*/ */
#define INIT_OPENCL_DEBUG_INFO SET_OPENCL_LINE_INFO( __LINE__, __FILE__ ) #define INIT_OPENCL_DEBUG_INFO SET_OPENCL_LINE_INFO( __LINE__, __FILE__ )
#define SET_OPENCL_LINE_INFO(_line, _file) "#line " STRINGIFY(_line) " " STRINGIFY(_file) "\n" #define SET_OPENCL_LINE_INFO(_line, _file) "#line " STRINGIFY(_line) " " STRINGIFY(_file) "\n"
#ifndef STRINGIFY_VALUE #ifndef STRINGIFY_VALUE
#define STRINGIFY_VALUE(_x) STRINGIFY(_x) #define STRINGIFY_VALUE(_x) STRINGIFY(_x)
#endif #endif
#ifndef STRINGIFY #ifndef STRINGIFY
#define STRINGIFY(_x) #_x #define STRINGIFY(_x) #_x
#endif #endif
/* Helper that creates a single program and kernel from a single-kernel program source */ /* Helper that creates a single program and kernel from a single-kernel program source */
extern int create_single_kernel_helper( cl_context context, cl_program *outProgram, cl_kernel *outKernel, unsigned int numKernelLines, const char **kernelProgram, const char *kernelName ); extern int create_single_kernel_helper( cl_context context, cl_program *outProgram, cl_kernel *outKernel, unsigned int numKernelLines, const char **kernelProgram, const char *kernelName );
/* Helper to obtain the biggest fit work group size for all the devices in a given group and for the given global thread size */ /* Helper to obtain the biggest fit work group size for all the devices in a given group and for the given global thread size */
extern int get_max_common_work_group_size( cl_context context, cl_kernel kernel, size_t globalThreadSize, size_t *outSize ); extern int get_max_common_work_group_size( cl_context context, cl_kernel kernel, size_t globalThreadSize, size_t *outSize );
/* Helper to obtain the biggest fit work group size for all the devices in a given group and for the given 2D global thread size */ /* Helper to obtain the biggest fit work group size for all the devices in a given group and for the given 2D global thread size */
extern int get_max_common_2D_work_group_size( cl_context context, cl_kernel kernel, size_t *globalThreadSize, size_t *outSizes ); extern int get_max_common_2D_work_group_size( cl_context context, cl_kernel kernel, size_t *globalThreadSize, size_t *outSizes );
/* Helper to obtain the biggest fit work group size for all the devices in a given group and for the given 3D global thread size */ /* Helper to obtain the biggest fit work group size for all the devices in a given group and for the given 3D global thread size */
extern int get_max_common_3D_work_group_size( cl_context context, cl_kernel kernel, size_t *globalThreadSize, size_t *outSizes ); extern int get_max_common_3D_work_group_size( cl_context context, cl_kernel kernel, size_t *globalThreadSize, size_t *outSizes );
/* Helper to get major/minor number for a device */ /* Helper to get major/minor number for a device */
extern int get_device_version( cl_device_id id, size_t* major, size_t* minor); extern int get_device_version( cl_device_id id, size_t* major, size_t* minor);
/* Helper to obtain the biggest allowed work group size for all the devices in a given group */ /* Helper to obtain the biggest allowed work group size for all the devices in a given group */
extern int get_max_allowed_work_group_size( cl_context context, cl_kernel kernel, size_t *outSize, size_t *outLimits ); extern int get_max_allowed_work_group_size( cl_context context, cl_kernel kernel, size_t *outSize, size_t *outLimits );
/* Helper to determine if an extension is supported by a device */ /* Helper to determine if an extension is supported by a device */
extern int is_extension_available( cl_device_id device, const char *extensionName ); extern int is_extension_available( cl_device_id device, const char *extensionName );
/* Helper to determine if a device supports an image format */ /* Helper to determine if a device supports an image format */
extern int is_image_format_supported( cl_context context, cl_mem_flags flags, cl_mem_object_type image_type, const cl_image_format *fmt ); extern int is_image_format_supported( cl_context context, cl_mem_flags flags, cl_mem_object_type image_type, const cl_image_format *fmt );
/* Helper to get pixel size for a pixel format */ /* Helper to get pixel size for a pixel format */
size_t get_pixel_bytes( const cl_image_format *fmt ); size_t get_pixel_bytes( const cl_image_format *fmt );
/* Verify the given device supports images. 0 means you're good to go, otherwise an error */ /* Verify the given device supports images. 0 means you're good to go, otherwise an error */
extern int verifyImageSupport( cl_device_id device ); extern int verifyImageSupport( cl_device_id device );
/* Checks that the given device supports images. Same as verify, but doesn't print an error */ /* Checks that the given device supports images. Same as verify, but doesn't print an error */
extern int checkForImageSupport( cl_device_id device ); extern int checkForImageSupport( cl_device_id device );
extern int checkFor3DImageSupport( cl_device_id device ); extern int checkFor3DImageSupport( cl_device_id device );
/* Checks that a given queue property is supported on the specified device. Returns 1 if supported, 0 if not or an error. */ /* Checks that a given queue property is supported on the specified device. Returns 1 if supported, 0 if not or an error. */
extern int checkDeviceForQueueSupport( cl_device_id device, cl_command_queue_properties prop ); extern int checkDeviceForQueueSupport( cl_device_id device, cl_command_queue_properties prop );
/* Helper for aligned memory allocation */ /* Helper for aligned memory allocation */
void * align_malloc(size_t size, size_t alignment); void * align_malloc(size_t size, size_t alignment);
void align_free(void *); void align_free(void *);
/* Helper to obtain the min alignment for a given context, i.e. the max of all min alignments for devices attached to the context */ /* Helper to obtain the min alignment for a given context, i.e. the max of all min alignments for devices attached to the context */
size_t get_min_alignment(cl_context context); size_t get_min_alignment(cl_context context);
/* Helper to obtain the default rounding mode for single precision computation. (Double is always CL_FP_ROUND_TO_NEAREST.) Returns 0 on error. */ /* Helper to obtain the default rounding mode for single precision computation. (Double is always CL_FP_ROUND_TO_NEAREST.) Returns 0 on error. */
cl_device_fp_config get_default_rounding_mode( cl_device_id device ); cl_device_fp_config get_default_rounding_mode( cl_device_id device );
#define PASSIVE_REQUIRE_IMAGE_SUPPORT( device ) \ #define PASSIVE_REQUIRE_IMAGE_SUPPORT( device ) \
if( checkForImageSupport( device ) ) \ if( checkForImageSupport( device ) ) \
{ \ { \
log_info( "\n\tNote: device does not support images. Skipping test...\n" ); \ log_info( "\n\tNote: device does not support images. Skipping test...\n" ); \
return 0; \ return 0; \
} }
#define PASSIVE_REQUIRE_3D_IMAGE_SUPPORT( device ) \ #define PASSIVE_REQUIRE_3D_IMAGE_SUPPORT( device ) \
if( checkFor3DImageSupport( device ) ) \ if( checkFor3DImageSupport( device ) ) \
{ \ { \
log_info( "\n\tNote: device does not support 3D images. Skipping test...\n" ); \ log_info( "\n\tNote: device does not support 3D images. Skipping test...\n" ); \
return 0; \ return 0; \
} }
/* Prints out the standard device header for all tests given the device to print for */ /* Prints out the standard device header for all tests given the device to print for */
extern int printDeviceHeader( cl_device_id device ); extern int printDeviceHeader( cl_device_id device );
#ifdef __cplusplus #ifdef __cplusplus
} }
#endif // __cplusplus #endif // __cplusplus
#endif // _kernelHelpers_h #endif // _kernelHelpers_h
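
The `INIT_OPENCL_DEBUG_INFO` macro in this header works because a macro argument is fully expanded before substitution unless it is itself the operand of `#`. `__LINE__` and `__FILE__` therefore become a number and a quoted string before `STRINGIFY` sees them. The sketch below reuses the two macro definitions from the header. The functions `line_directive_example` and `kernel_source_example` are hypothetical names added for illustration.

```cpp
#include <string>

#define STRINGIFY(_x) #_x
#define SET_OPENCL_LINE_INFO(_line, _file) "#line " STRINGIFY(_line) " " STRINGIFY(_file) "\n"

// Hypothetical helper: the directive produced for fixed inputs.
// STRINGIFY on a string literal keeps the quotes, escaping them as needed.
std::string line_directive_example()
{
    return SET_OPENCL_LINE_INFO(42, "kernel.cl");
}

// Hypothetical helper: a kernel source prefixed with the location of this
// C++ line; adjacent string literals are concatenated by the compiler.
std::string kernel_source_example()
{
    return SET_OPENCL_LINE_INFO(__LINE__, __FILE__)
           "__kernel void foo( int x )\n"
           "{\n"
           "}\n";
}
```

With these definitions, `line_directive_example()` yields the text `#line 42 "kernel.cl"` followed by a newline, so a kernel compiler diagnostic can point back at the host source file.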

@@ -1,59 +1,59 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#if defined(__MINGW32__) #if defined(__MINGW32__)
#include "mingw_compat.h" #include "mingw_compat.h"
#include <stdio.h> #include <stdio.h>
#include <string.h> #include <string.h>
// This function is unavailable on various mingw compilers, // This function is unavailable on various mingw compilers,
// especially 64-bit ones, so it is implemented here // especially 64-bit ones, so it is implemented here
const char *basename_dot="."; const char *basename_dot=".";
char* char*
basename(char *path) basename(char *path)
{ {
char *p, *b; char *p, *b;
size_t len; size_t len;
// Check for NULL before touching the string // Check for NULL before touching the string
if (path == NULL) { if (path == NULL) {
return (char*)basename_dot; return (char*)basename_dot;
} }
len = strlen(path); len = strlen(path);
// Not absolute path on windows // Not absolute path on windows
if (len < 2 || path[1] != ':') { if (len < 2 || path[1] != ':') {
return path; return path;
} }
// Trim a trailing path separator // Trim a trailing path separator
if (path[len - 1] == '\\' || if (path[len - 1] == '\\' ||
path[len - 1] == '/' ) { path[len - 1] == '/' ) {
len--; len--;
path[len] = '\0'; path[len] = '\0';
} }
// Return the component after the last remaining separator // Return the component after the last remaining separator
b = path; b = path;
for (p = path; *p != '\0'; p++) { for (p = path; *p != '\0'; p++) {
if (*p == '\\' || *p == '/') { if (*p == '\\' || *p == '/') {
b = p + 1; b = p + 1;
} }
} }
return b; return b;
#endif #endif
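
The core of a `basename` replacement is to drop trailing separators and then return whatever follows the last remaining separator. The sketch below shows that idea on `std::string`, accepting both `\` and `/`. The function name `last_path_component` is hypothetical, and the sketch deliberately ignores the drive-letter check used by the mingw shim above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not the mingw_compat implementation: returns the
// final component of a path that may use '\\' or '/' as separators.
std::string last_path_component(std::string path)
{
    // Drop trailing separators, but keep a lone separator such as "/".
    while (path.size() > 1 && (path.back() == '\\' || path.back() == '/'))
        path.pop_back();

    // Everything after the last separator is the final component.
    std::size_t pos = path.find_last_of("\\/");
    if (pos == std::string::npos || path.size() == 1)
        return path;
    return path.substr(pos + 1);
}
```

Working on a copy avoids modifying the caller's buffer, which the C version does when it trims a trailing separator in place.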

@@ -1,31 +1,31 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef MINGW_COMPAT_H #ifndef MINGW_COMPAT_H
#define MINGW_COMPAT_H #define MINGW_COMPAT_H
#if defined(__MINGW32__) #if defined(__MINGW32__)
char *basename(char *path); char *basename(char *path);
#include <malloc.h> #include <malloc.h>
#if defined(__MINGW64__) #if defined(__MINGW64__)
// mingw-w64 does not have __mingw_aligned_malloc; it has _aligned_malloc instead // mingw-w64 does not have __mingw_aligned_malloc; it has _aligned_malloc instead
#define __mingw_aligned_malloc _aligned_malloc #define __mingw_aligned_malloc _aligned_malloc
#define __mingw_aligned_free _aligned_free #define __mingw_aligned_free _aligned_free
#include <stddef.h> #include <stddef.h>
#endif //(__MINGW64__) #endif //(__MINGW64__)
#endif //(__MINGW32__) #endif //(__MINGW32__)
#endif // MINGW_COMPAT_H #endif // MINGW_COMPAT_H

File diff suppressed because it is too large

@@ -1,274 +1,280 @@
 /*
 A C-program for MT19937, with initialization improved 2002/1/26.
 Coded by Takuji Nishimura and Makoto Matsumoto.
 Before using, initialize the state by using init_genrand(seed)
 or init_by_array(init_key, key_length).
 Copyright (C) 1997 - 2002, Makoto Matsumoto and Takuji Nishimura,
 All rights reserved.
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
 notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.
 3. The names of its contributors may not be used to endorse or promote
 products derived from this software without specific prior written
 permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
 EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
 PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
 PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
 LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
 NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 Any feedback is very welcome.
 http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
 email: m-mat @ math.sci.hiroshima-u.ac.jp (remove space)
 Modifications for use in OpenCL by Ian Ollmann, Apple Inc.
 */
 #include <stdio.h>
 #include <stdlib.h>
 #include "mt19937.h"
 #include "mingw_compat.h"
 #ifdef __SSE2__
 #include <emmintrin.h>
 #endif
 static void * align_malloc(size_t size, size_t alignment)
 {
 #if defined(_WIN32) && defined(_MSC_VER)
 return _aligned_malloc(size, alignment);
 #elif defined(__linux__) || defined (linux) || defined(__APPLE__)
 void * ptr = NULL;
+#if defined(__ANDROID__)
+ptr = memalign(alignment, size);
+if ( ptr )
+return ptr;
+#else
 if (0 == posix_memalign(&ptr, alignment, size))
 return ptr;
+#endif
 return NULL;
 #elif defined(__MINGW32__)
 return __mingw_aligned_malloc(size, alignment);
 #else
 #error "Please add support OS for aligned malloc"
 #endif
 }

 static void align_free(void * ptr)
 {
 #if defined(_WIN32) && defined(_MSC_VER)
 _aligned_free(ptr);
 #elif defined(__linux__) || defined (linux) || defined(__APPLE__)
 return free(ptr);
 #elif defined(__MINGW32__)
 return __mingw_aligned_free(ptr);
 #else
 #error "Please add support OS for aligned free"
 #endif
 }


 /* Period parameters */
 #define N 624 /* vector code requires multiple of 4 here */
 #define M 397
 #define MATRIX_A (cl_uint) 0x9908b0dfUL /* constant vector a */
 #define UPPER_MASK (cl_uint) 0x80000000UL /* most significant w-r bits */
 #define LOWER_MASK (cl_uint) 0x7fffffffUL /* least significant r bits */

 typedef struct _MTdata
 {
 cl_uint mt[N];
 #ifdef __SSE2__
 cl_uint cache[N];
 #endif
 cl_int mti;
 }_MTdata;

 /* initializes mt[N] with a seed */
 MTdata init_genrand(cl_uint s)
 {
 MTdata r = (MTdata) align_malloc( sizeof( _MTdata ), 16 );
 if( NULL != r )
 {
 cl_uint *mt = r->mt;
 int mti = 0;
 mt[0]= s; // & 0xffffffffUL;
 for (mti=1; mti<N; mti++) {
 mt[mti] = (cl_uint)
 (1812433253UL * (mt[mti-1] ^ (mt[mti-1] >> 30)) + mti);
 /* See Knuth TAOCP Vol2. 3rd Ed. P.106 for multiplier. */
 /* In the previous versions, MSBs of the seed affect */
 /* only MSBs of the array mt[]. */
 /* 2002/01/09 modified by Makoto Matsumoto */
 // mt[mti] &= 0xffffffffUL;
 /* for >32 bit machines */
 }
 r->mti = mti;
 }

 return r;
 }

 void free_mtdata( MTdata d )
 {
 if(d)
 align_free(d);
 }

 /* generates a random number on [0,0xffffffff]-interval */
 cl_uint genrand_int32( MTdata d)
 {
 /* mag01[x] = x * MATRIX_A for x=0,1 */
 static const cl_uint mag01[2]={0x0UL, MATRIX_A};
 #ifdef __SSE2__
 static volatile int init = 0;
 static union{ __m128i v; cl_uint s[4]; } upper_mask, lower_mask, one, matrix_a, c0, c1;
 #endif


 cl_uint *mt = d->mt;
 cl_uint y;

 if (d->mti == N)
 { /* generate N words at one time */
 int kk;

 #ifdef __SSE2__
 if( 0 == init )
 {
 upper_mask.s[0] = upper_mask.s[1] = upper_mask.s[2] = upper_mask.s[3] = UPPER_MASK;
 lower_mask.s[0] = lower_mask.s[1] = lower_mask.s[2] = lower_mask.s[3] = LOWER_MASK;
 one.s[0] = one.s[1] = one.s[2] = one.s[3] = 1;
 matrix_a.s[0] = matrix_a.s[1] = matrix_a.s[2] = matrix_a.s[3] = MATRIX_A;
 c0.s[0] = c0.s[1] = c0.s[2] = c0.s[3] = (cl_uint) 0x9d2c5680UL;
 c1.s[0] = c1.s[1] = c1.s[2] = c1.s[3] = (cl_uint) 0xefc60000UL;
 init = 1;
 }
 #endif

 kk = 0;
 #ifdef __SSE2__
 // vector loop
 for( ; kk + 4 <= N-M; kk += 4 )
 {
 __m128i vy = _mm_or_si128( _mm_and_si128( _mm_load_si128( (__m128i*)(mt + kk) ), upper_mask.v ),
 _mm_and_si128( _mm_loadu_si128( (__m128i*)(mt + kk + 1) ), lower_mask.v )); // ((mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK))

 __m128i mask = _mm_cmpeq_epi32( _mm_and_si128( vy, one.v), one.v ); // y & 1 ? -1 : 0
 __m128i vmag01 = _mm_and_si128( mask, matrix_a.v ); // y & 1 ? MATRIX_A, 0 = mag01[y & (cl_uint) 0x1UL]
 __m128i vr = _mm_xor_si128( _mm_loadu_si128( (__m128i*)(mt + kk + M)), (__m128i) _mm_srli_epi32( vy, 1 ) ); // mt[kk+M] ^ (y >> 1)
 vr = _mm_xor_si128( vr, vmag01 ); // mt[kk+M] ^ (y >> 1) ^ mag01[y & (cl_uint) 0x1UL]
 _mm_store_si128( (__m128i*) (mt + kk ), vr );
 }
 #endif
 for ( ;kk<N-M;kk++) {
 y = (cl_uint) ((mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK));
 mt[kk] = mt[kk+M] ^ (y >> 1) ^ mag01[y & (cl_uint) 0x1UL];
 }

 #ifdef __SSE2__
 // advance to next aligned location
 for (;kk<N-1 && (kk & 3);kk++) {
 y = (cl_uint) ((mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK));
 mt[kk] = mt[kk+(M-N)] ^ (y >> 1) ^ mag01[y & (cl_uint) 0x1UL];
 }

 // vector loop
 for( ; kk + 4 <= N-1; kk += 4 )
 {
 __m128i vy = _mm_or_si128( _mm_and_si128( _mm_load_si128( (__m128i*)(mt + kk) ), upper_mask.v ),
 _mm_and_si128( _mm_loadu_si128( (__m128i*)(mt + kk + 1) ), lower_mask.v )); // ((mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK))

 __m128i mask = _mm_cmpeq_epi32( _mm_and_si128( vy, one.v), one.v ); // y & 1 ? -1 : 0
 __m128i vmag01 = _mm_and_si128( mask, matrix_a.v ); // y & 1 ? MATRIX_A, 0 = mag01[y & (cl_uint) 0x1UL]
 __m128i vr = _mm_xor_si128( _mm_loadu_si128( (__m128i*)(mt + kk + M - N)), _mm_srli_epi32( vy, 1 ) ); // mt[kk+M-N] ^ (y >> 1)
 vr = _mm_xor_si128( vr, vmag01 ); // mt[kk+M] ^ (y >> 1) ^ mag01[y & (cl_uint) 0x1UL]
 _mm_store_si128( (__m128i*) (mt + kk ), vr );
 }
 #endif

 for (;kk<N-1;kk++) {
 y = (cl_uint) ((mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK));
 mt[kk] = mt[kk+(M-N)] ^ (y >> 1) ^ mag01[y & (cl_uint) 0x1UL];
 }
 y = (cl_uint)((mt[N-1]&UPPER_MASK)|(mt[0]&LOWER_MASK));
 mt[N-1] = mt[M-1] ^ (y >> 1) ^ mag01[y & (cl_uint) 0x1UL];

 #ifdef __SSE2__
 // Do the tempering ahead of time in vector code
 for( kk = 0; kk + 4 <= N; kk += 4 )
 {
 __m128i vy = _mm_load_si128( (__m128i*)(mt + kk ) ); // y = mt[k];
 vy = _mm_xor_si128( vy, _mm_srli_epi32( vy, 11 ) ); // y ^= (y >> 11);
 vy = _mm_xor_si128( vy, _mm_and_si128( _mm_slli_epi32( vy, 7 ), c0.v) ); // y ^= (y << 7) & (cl_uint) 0x9d2c5680UL;
 vy = _mm_xor_si128( vy, _mm_and_si128( _mm_slli_epi32( vy, 15 ), c1.v) ); // y ^= (y << 15) & (cl_uint) 0xefc60000UL;
 vy = _mm_xor_si128( vy, _mm_srli_epi32( vy, 18 ) ); // y ^= (y >> 18);
 _mm_store_si128( (__m128i*)(d->cache+kk), vy );
 }
 #endif

 d->mti = 0;
 }
 #ifdef __SSE2__
 y = d->cache[d->mti++];
 #else
 y = mt[d->mti++];

 /* Tempering */
 y ^= (y >> 11);
 y ^= (y << 7) & (cl_uint) 0x9d2c5680UL;
 y ^= (y << 15) & (cl_uint) 0xefc60000UL;
 y ^= (y >> 18);
 #endif


 return y;
 }

 cl_ulong genrand_int64( MTdata d)
 {
 return ((cl_ulong) genrand_int32(d) << 32) | (cl_uint) genrand_int32(d);
 }

 /* generates a random number on [0,1]-real-interval */
 double genrand_real1(MTdata d)
 {
 return genrand_int32(d)*(1.0/4294967295.0);
 /* divided by 2^32-1 */
 }

 /* generates a random number on [0,1)-real-interval */
 double genrand_real2(MTdata d)
 {
 return genrand_int32(d)*(1.0/4294967296.0);
 /* divided by 2^32 */
 }

 /* generates a random number on (0,1)-real-interval */
 double genrand_real3(MTdata d)
 {
 return (((double)genrand_int32(d)) + 0.5)*(1.0/4294967296.0);
 /* divided by 2^32 */
 }

 /* generates a random number on [0,1) with 53-bit resolution*/
 double genrand_res53(MTdata d)
 {
 unsigned long a=genrand_int32(d)>>5, b=genrand_int32(d)>>6;
 return(a*67108864.0+b)*(1.0/9007199254740992.0);
 }
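
The scalar path of the generator in this file can be sketched on its own, without the OpenCL headers. This is a minimal sketch and not the file's code: the names `mt_sketch`, `mt_sketch_seed`, and `mt_sketch_next` are hypothetical, `uint32_t` stands in for `cl_uint`, and the refill uses a single loop with modular indexing in place of the three split loops and the SSE2 path.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only; not part of mt19937.c. */
enum { MT_N = 624, MT_M = 397 };

typedef struct {
    uint32_t mt[MT_N]; /* generator state */
    int      mti;      /* index of the next word to hand out */
} mt_sketch;

/* Same seeding recurrence as init_genrand (multiplier 1812433253). */
static void mt_sketch_seed(mt_sketch *s, uint32_t seed)
{
    assert(s != NULL);
    s->mt[0] = seed;
    for (int i = 1; i < MT_N; i++) {
        uint32_t prev = s->mt[i - 1];
        s->mt[i] = 1812433253u * (prev ^ (prev >> 30)) + (uint32_t)i;
    }
    s->mti = MT_N; /* state is exhausted, so the first draw refills it */
}

/* Returns the next 32-bit value: refill when exhausted, then temper. */
static uint32_t mt_sketch_next(mt_sketch *s)
{
    assert(s != NULL);
    if (s->mti == MT_N) {
        /* Regenerate all N words in place. Modular indexing replaces the
           kk < N-M, kk < N-1 and final-word cases of the original. */
        for (int k = 0; k < MT_N; k++) {
            uint32_t y = (s->mt[k] & 0x80000000u)
                       | (s->mt[(k + 1) % MT_N] & 0x7fffffffu);
            s->mt[k] = s->mt[(k + MT_M) % MT_N] ^ (y >> 1)
                     ^ ((y & 1u) ? 0x9908b0dfu : 0u);
        }
        s->mti = 0;
    }
    uint32_t y = s->mt[s->mti++];
    /* Tempering, identical to the non-SSE2 branch of genrand_int32. */
    y ^= y >> 11;
    y ^= (y << 7) & 0x9d2c5680u;
    y ^= (y << 15) & 0xefc60000u;
    y ^= y >> 18;
    return y;
}
```

A scalar version like this is mainly useful as a cross-check: seeding both it and the SSE2 build with the same value and comparing the first few thousand outputs should expose any indexing mistake in the vector loops.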


@@ -1,99 +1,99 @@
 /*
 * mt19937.h
 *
 * Mersenne Twister.
 *
 A C-program for MT19937, with initialization improved 2002/1/26.
 Coded by Takuji Nishimura and Makoto Matsumoto.
 Before using, initialize the state by using init_genrand(seed)
 or init_by_array(init_key, key_length).
 Copyright (C) 1997 - 2002, Makoto Matsumoto and Takuji Nishimura,
 All rights reserved.
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
 notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
 notice, this list of conditions and the following disclaimer in the
 documentation and/or other materials provided with the distribution.
 3. The names of its contributors may not be used to endorse or promote
 products derived from this software without specific prior written
 permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
 EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
 PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
 PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
 LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
 NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 Any feedback is very welcome.
 http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
 email: m-mat @ math.sci.hiroshima-u.ac.jp (remove space)
 */
 #ifndef MT19937_H
 #define MT19937_H 1
 #if defined( __APPLE__ )
 #include <OpenCL/cl_platform.h>
 #else
 #include <CL/cl_platform.h>
 #endif
 #ifdef __cplusplus
 extern "C" {
 #endif
 /*
 * Interfaces here have been modified from original sources so that they
 * are safe to call reentrantly, so long as a different MTdata is used
 * on each thread.
 */
 typedef struct _MTdata *MTdata;
 /* Create the random number generator with seed */
 MTdata init_genrand( cl_uint /*seed*/ );
 /* release memory used by a MTdata private data */
 void free_mtdata( MTdata /*data*/ );
 /* generates a random number on [0,0xffffffff]-interval */
 cl_uint genrand_int32( MTdata /*data*/);
 /* generates a random number on [0,0xffffffffffffffffULL]-interval */
 cl_ulong genrand_int64( MTdata /*data*/);
 /* generates a random number on [0,1]-real-interval */
 double genrand_real1( MTdata /*data*/);
 /* generates a random number on [0,1)-real-interval */
 double genrand_real2( MTdata /*data*/);
 /* generates a random number on (0,1)-real-interval */
 double genrand_real3( MTdata /*data*/);
 /* generates a random number on [0,1) with 53-bit resolution*/
 double genrand_res53( MTdata /*data*/ );
 #ifdef __cplusplus
 }
 #endif
 #endif /* MT19937_H */


@@ -0,0 +1,564 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#include "os_helpers.h"
#include "errorHelpers.h"
// =================================================================================================
// C++ interface.
// =================================================================================================
#include <cerrno> // errno, error constants
#include <climits> // PATH_MAX
#include <cstdlib> // abort, _splitpath, _makepath
#include <cstring> // strdup, strerror_r
#include <sstream>
#include <vector>
#define CHECK_PTR( ptr ) \
if ( (ptr) == NULL ) { \
abort(); \
}
typedef std::vector< char > buffer_t;
#if ! defined( PATH_MAX )
#define PATH_MAX 1000
#endif
int const _size = PATH_MAX + 1; // Initial buffer size for path.
int const _count = 8; // How many times we will try to double buffer size.
// -------------------------------------------------------------------------------------------------
// MacOS X
// -------------------------------------------------------------------------------------------------
#if defined( __APPLE__ )
#include <mach-o/dyld.h> // _NSGetExecutablePath
#include <libgen.h> // dirname
static
std::string
_err_msg(
int err, // Error number (e. g. errno).
int level // Nesting level, for avoiding infinite recursion.
) {
/*
There are 3 incompatible versions of strerror_r:
char * strerror_r( int, char *, size_t ); // GNU version
int strerror_r( int, char *, size_t ); // BSD version
int strerror_r( int, char *, size_t ); // XSI version
BSD version returns error code, while XSI version returns 0 or -1 and sets errno.
*/
// BSD version of strerror_r.
buffer_t buffer( 100 );
int count = _count;
for ( ; ; ) {
int rc = strerror_r( err, & buffer.front(), buffer.size() );
if ( rc == EINVAL ) {
// Error code is not recognized, but anyway we got the message.
return & buffer.front();
} else if ( rc == ERANGE ) {
// Buffer is not enough.
if ( count > 0 ) {
// Enlarge the buffer.
-- count;
buffer.resize( buffer.size() * 2 );
} else {
std::stringstream ostr;
ostr
<< "Error " << err << " "
<< "(Getting error message failed: "
<< "Buffer of " << buffer.size() << " bytes is still too small"
<< ")";
return ostr.str();
}; // if
} else if ( rc == 0 ) {
// We got the message.
return & buffer.front();
} else {
std::stringstream ostr;
ostr
<< "Error " << err << " "
<< "(Getting error message failed: "
<< ( level < 2 ? _err_msg( rc, level + 1 ) : "Oops" )
<< ")";
return ostr.str();
}; // if
}; // forever
} // _err_msg
std::string
dir_sep(
) {
return "/";
} // dir_sep
std::string
exe_path(
) {
buffer_t path( _size );
int count = _count;
for ( ; ; ) {
uint32_t size = path.size();
int rc = _NSGetExecutablePath( & path.front(), & size );
if ( rc == 0 ) {
break;
}; // if
if ( count > 0 ) {
-- count;
path.resize( size );
} else {
log_error(
"ERROR: Getting executable path failed: "
"_NSGetExecutablePath failed: Buffer of %lu bytes is still too small\n",
(unsigned long) path.size()
);
exit( 2 );
}; // if
}; // forever
return & path.front();
} // exe_path
std::string
exe_dir(
) {
std::string path = exe_path();
// We cannot pass path.c_str() to `dirname' because `dirname' modifies its argument.
buffer_t buffer( path.c_str(), path.c_str() + path.size() + 1 ); // Copy with trailing zero.
return dirname( & buffer.front() );
} // exe_dir
#endif // __APPLE__
// -------------------------------------------------------------------------------------------------
// Linux
// -------------------------------------------------------------------------------------------------
#if defined( __linux__ )
#include <cerrno> // errno
#include <libgen.h> // dirname
#include <unistd.h> // readlink
static
std::string
_err_msg(
int err,
int level
) {
/*
There are 3 incompatible versions of strerror_r:
char * strerror_r( int, char *, size_t ); // GNU version
int strerror_r( int, char *, size_t ); // BSD version
int strerror_r( int, char *, size_t ); // XSI version
BSD version returns error code, while XSI version returns 0 or -1 and sets errno.
*/
#if defined(__ANDROID__) || ( ( _POSIX_C_SOURCE >= 200112L || _XOPEN_SOURCE >= 600 ) && ! _GNU_SOURCE )
// XSI version of strerror_r.
#warning Not tested!
buffer_t buffer( 200 );
int count = _count;
for ( ; ; ) {
int rc = strerror_r( err, & buffer.front(), buffer.size() );
if ( rc == -1 ) {
int _err = errno;
if ( _err == ERANGE ) {
if ( count > 0 ) {
// Enlarge the buffer.
-- count;
buffer.resize( buffer.size() * 2 );
} else {
std::stringstream ostr;
ostr
<< "Error " << err << " "
<< "(Getting error message failed: "
<< "Buffer of " << buffer.size() << " bytes is still too small"
<< ")";
return ostr.str();
}; // if
} else {
std::stringstream ostr;
ostr
<< "Error " << err << " "
<< "(Getting error message failed: "
<< ( level < 2 ? _err_msg( _err, level + 1 ) : "Oops" )
<< ")";
return ostr.str();
}; // if
} else {
// We got the message.
return & buffer.front();
}; // if
}; // forever
#else
// GNU version of strerror_r.
char buffer[ 2000 ];
return strerror_r( err, buffer, sizeof( buffer ) );
#endif
} // _err_msg
std::string
dir_sep(
) {
return "/";
} // dir_sep
std::string
exe_path(
) {
static std::string const exe = "/proc/self/exe";
buffer_t path( _size );
int count = _count; // Max number of iterations.
for ( ; ; ) {
ssize_t len = readlink( exe.c_str(), & path.front(), path.size() );
if ( len < 0 ) {
// Oops.
int err = errno;
log_error(
"ERROR: Getting executable path failed: "
"Reading symlink `%s' failed: %s\n",
exe.c_str(), err_msg( err ).c_str()
);
exit( 2 );
}; // if
if ( len < path.size() ) {
// We got the path.
path.resize( len );
break;
}; // if
// Oops, buffer is too small.
if ( count > 0 ) {
-- count;
// Enlarge the buffer.
path.resize( path.size() * 2 );
} else {
log_error(
"ERROR: Getting executable path failed: "
"Reading symlink `%s' failed: Buffer of %lu bytes is still too small\n",
exe.c_str(),
(unsigned long) path.size()
);
exit( 2 );
}; // if
}; // forever
return std::string( & path.front(), path.size() );
} // exe_path
std::string
exe_dir(
) {
std::string path = exe_path();
// We cannot pass path.c_str() to `dirname' because `dirname' modifies its argument.
buffer_t buffer( path.c_str(), path.c_str() + path.size() + 1 ); // Copy with trailing zero.
return dirname( & buffer.front() );
} // exe_dir
#endif // __linux__
// -------------------------------------------------------------------------------------------------
// MS Windows
// -------------------------------------------------------------------------------------------------
#if defined( _WIN32 )
#include <windows.h>
#if defined( max )
#undef max
#endif
#include <cctype>
#include <algorithm>
static
std::string
_err_msg(
int err,
int level
) {
std::string msg;
LPSTR buffer = NULL;
DWORD flags =
FORMAT_MESSAGE_ALLOCATE_BUFFER |
FORMAT_MESSAGE_FROM_SYSTEM |
FORMAT_MESSAGE_IGNORE_INSERTS;
DWORD len =
FormatMessageA(
flags,
NULL,
err,
LANG_USER_DEFAULT,
reinterpret_cast< LPSTR >( & buffer ),
0,
NULL
);
if ( buffer == NULL || len == 0 ) {
int _err = GetLastError();
char str[1024] = { 0 };
snprintf(str, sizeof(str), "Error 0x%08x (Getting error message failed: %s )", err, ( level < 2 ? _err_msg( _err, level + 1 ).c_str() : "Oops" ));
msg = std::string(str);
} else {
// Trim trailing whitespace (including `\r' and `\n').
while ( len > 0 && isspace( buffer[ len - 1 ] ) ) {
-- len;
}; // while
// Drop trailing full stop.
if ( len > 0 && buffer[ len - 1 ] == '.' ) {
-- len;
}; // if
msg.assign( buffer, len );
}; //if
if ( buffer != NULL ) {
LocalFree( buffer );
}; // if
return msg;
} // _err_msg
std::string
dir_sep(
) {
return "\\";
} // dir_sep
std::string
exe_path(
) {
buffer_t path( _size );
int count = _count;
for ( ; ; ) {
DWORD len = GetModuleFileNameA( NULL, & path.front(), path.size() );
if ( len == 0 ) {
int err = GetLastError();
log_error( "ERROR: Getting executable path failed: %s\n", err_msg( err ).c_str() );
exit( 2 );
}; // if
if ( len < path.size() ) {
path.resize( len );
break;
}; // if
// Buffer too small.
if ( count > 0 ) {
-- count;
path.resize( path.size() * 2 );
} else {
log_error(
"ERROR: Getting executable path failed: "
"Buffer of %lu bytes is still too small\n",
(unsigned long) path.size()
);
exit( 2 );
}; // if
}; // forever
return std::string( & path.front(), path.size() );
} // exe_path
std::string
exe_dir(
) {
std::string exe = exe_path();
int count = 0;
// Splitting path into components.
buffer_t drv( _MAX_DRIVE );
buffer_t dir( _MAX_DIR );
count = _count;
#if defined(_MSC_VER)
for ( ; ; ) {
int rc =
_splitpath_s(
exe.c_str(),
& drv.front(), drv.size(),
& dir.front(), dir.size(),
NULL, 0, // We need neither name
NULL, 0 // nor extension
);
if ( rc == 0 ) {
break;
} else if ( rc == ERANGE ) {
if ( count > 0 ) {
-- count;
// Buffer is too small, but it is not clear which one.
// So we have to enlarge all.
drv.resize( drv.size() * 2 );
dir.resize( dir.size() * 2 );
} else {
log_error(
"ERROR: Getting executable path failed: "
"Splitting path `%s' to components failed: "
"Buffers of %lu and %lu bytes are still too small\n",
exe.c_str(),
(unsigned long) drv.size(),
(unsigned long) dir.size()
);
exit( 2 );
}; // if
} else {
log_error(
"ERROR: Getting executable path failed: "
"Splitting path `%s' to components failed: %s\n",
exe.c_str(),
err_msg( rc ).c_str()
);
exit( 2 );
}; // if
}; // forever
#else // __MINGW32__
// MinGW does not have the "secure" _splitpath_s, use the insecure version instead.
_splitpath(
exe.c_str(),
& drv.front(),
& dir.front(),
NULL, // We need neither name
NULL // nor extension
);
#endif // __MINGW32__
// Combining components back to path.
// I failed with "secure" `_makepath_s'. If buffer is too small, instead of returning
// ERANGE, `_makepath_s' pops up dialog box and offers to debug the program. D'oh!
// So let us try to guess the size of result and go with insecure `_makepath'.
buffer_t path( std::max( drv.size() + dir.size(), size_t( _MAX_PATH ) ) + 10 );
_makepath( & path.front(), & drv.front(), & dir.front(), NULL, NULL );
return & path.front();
} // exe_dir
#endif // _WIN32
std::string
err_msg(
int err
) {
return _err_msg( err, 0 );
} // err_msg
// =================================================================================================
// C interface.
// =================================================================================================
char *
get_err_msg(
int err
) {
char * msg = strdup( err_msg( err ).c_str() );
CHECK_PTR( msg );
return msg;
} // get_err_msg
char *
get_dir_sep(
) {
char * sep = strdup( dir_sep().c_str() );
CHECK_PTR( sep );
return sep;
} // get_dir_sep
char *
get_exe_path(
) {
char * path = strdup( exe_path().c_str() );
CHECK_PTR( path );
return path;
} // get_exe_path
char *
get_exe_dir(
) {
char * dir = strdup( exe_dir().c_str() );
CHECK_PTR( dir );
return dir;
} // get_exe_dir
// end of file //
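The C interface above returns heap-allocated copies of the C++ strings, so each result has to be released with free(). A minimal sketch of that copy-out pattern follows; `to_c_string` is a hypothetical name used only for illustration (the file itself uses strdup plus CHECK_PTR), and malloc/memcpy are used here only to keep the sketch free of non-standard calls.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper (not part of os_helpers): copies a std::string into
// malloc'ed storage so that a C caller can release it with free().
static char *to_c_string(const std::string &s)
{
    char *copy = static_cast<char *>(std::malloc(s.size() + 1));
    if (copy != NULL)
        std::memcpy(copy, s.c_str(), s.size() + 1); // size() + 1 copies the terminating NUL too
    return copy;                                    // NULL on allocation failure
}
```

A caller would use the real functions the same way: take the pointer returned by get_exe_dir(), use it, then pass it to free().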


@@ -0,0 +1,53 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#ifndef __os_helpers_h__
#define __os_helpers_h__
#include "compat.h"
// -------------------------------------------------------------------------------------------------
// C++ interface.
// -------------------------------------------------------------------------------------------------
#ifdef __cplusplus
#include <string>
std::string err_msg( int err );
std::string dir_sep();
std::string exe_path();
std::string exe_dir();
#endif // __cplusplus
// -------------------------------------------------------------------------------------------------
// C interface.
// -------------------------------------------------------------------------------------------------
#ifdef __cplusplus
extern "C" {
#endif // __cplusplus
char * get_err_msg( int err ); // Returns system error message. Caller must free() the result.
char * get_dir_sep(); // Returns dir separator. Caller must free() the result.
char * get_exe_path(); // Returns path of current executable. Caller must free() the result.
char * get_exe_dir(); // Returns dir of current executable. Caller must free() the result.
#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus
#endif // __os_helpers_h__


@@ -0,0 +1,42 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#include "parseParameters.h"
#include "errorHelpers.h"
#include <stdlib.h> // atoi
#include <string.h> // strchr
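// Returns true if number is a positive power of two.
bool is_power_of_two(int number)
{
    return number > 0 && !(number & (number - 1));
}
// Parses a wimpy reduction factor N from arg, where arg points at the character
// preceding N and N is terminated by ']'. On success arg is advanced to the ']'.
// wimpyReductionFactor is updated only if N is a power of two; if no ']' is found,
// neither arg nor wimpyReductionFactor is modified.
extern void parseWimpyReductionFactor(const char *&arg, int &wimpyReductionFactor)
{
    const char *arg_temp = strchr(&arg[1], ']');
    if (arg_temp != NULL)
    {
        int new_factor = atoi(&arg[1]);
        arg = arg_temp; // Advance until ']'
        if (is_power_of_two(new_factor))
        {
            log_info("\n Wimpy reduction factor changed from %d to %d \n", wimpyReductionFactor, new_factor);
            wimpyReductionFactor = new_factor;
        }
        else
        {
            log_info("\n WARNING: Incorrect wimpy reduction factor %d, must be power of 2. The default value will be used.\n", new_factor);
        }
    }
}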
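The parsing logic above can be exercised in isolation. The following is a sketch under stated assumptions: `is_pow2` and `parse_factor` are hypothetical stand-ins that drop the harness logging calls and report success through a return value, and the positivity guard in `is_pow2` is an addition of this sketch.

```cpp
#include <cstdlib>
#include <cstring>

// Bit trick: a power of two has exactly one bit set, so n & (n - 1) clears it to zero.
// The n > 0 guard (an addition in this sketch) rejects zero and negative input.
static bool is_pow2(int n) { return n > 0 && (n & (n - 1)) == 0; }

// Hypothetical stand-in for parseWimpyReductionFactor without logging.
// arg points at the character before the number; the number ends at ']'.
// Returns true and updates factor only when the number is a power of two.
static bool parse_factor(const char *&arg, int &factor)
{
    const char *close = std::strchr(arg + 1, ']');
    if (close == NULL)
        return false;               // no closing bracket: leave arg and factor untouched
    int candidate = std::atoi(arg + 1);
    arg = close;                    // advance to the ']' even if the value is rejected
    if (!is_pow2(candidate))
        return false;               // keep the previous factor
    factor = candidate;
    return true;
}
```

With an input such as "[8]" and a starting factor of 4, this sketch leaves arg on the ']' and sets the factor to 8; with "[6]" it also advances arg but keeps the factor at 4.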


@@ -0,0 +1,24 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#ifndef _parseParameters_h
#define _parseParameters_h
#include "compat.h"
#include <string>
extern void parseWimpyReductionFactor(const char *&arg, int &wimpyReductionFactor);
#endif // _parseParameters_h


@@ -1,49 +1,49 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#ifndef _ref_counting_h
#define _ref_counting_h
#define MARK_REF_COUNT_BASE( c, type, bigType ) \
cl_uint c##_refCount; \
error = clGet##type##Info( c, CL_##bigType##_REFERENCE_COUNT, sizeof( c##_refCount ), &c##_refCount, NULL ); \
test_error( error, "Unable to check reference count for " #type );
#define TEST_REF_COUNT_BASE( c, type, bigType ) \
cl_uint c##_refCount_new; \
error = clGet##type##Info( c, CL_##bigType##_REFERENCE_COUNT, sizeof( c##_refCount_new ), &c##_refCount_new, NULL ); \
test_error( error, "Unable to check reference count for " #type ); \
if( c##_refCount != c##_refCount_new ) \
{ \
log_error( "ERROR: Reference count for " #type " changed! (was %d, now %d)\n", c##_refCount, c##_refCount_new ); \
return -1; \
}
#define MARK_REF_COUNT_CONTEXT( c ) MARK_REF_COUNT_BASE( c, Context, CONTEXT )
#define TEST_REF_COUNT_CONTEXT( c ) TEST_REF_COUNT_BASE( c, Context, CONTEXT )
#define MARK_REF_COUNT_DEVICE( c ) MARK_REF_COUNT_BASE( c, Device, DEVICE )
#define TEST_REF_COUNT_DEVICE( c ) TEST_REF_COUNT_BASE( c, Device, DEVICE )
#define MARK_REF_COUNT_QUEUE( c ) MARK_REF_COUNT_BASE( c, CommandQueue, QUEUE )
#define TEST_REF_COUNT_QUEUE( c ) TEST_REF_COUNT_BASE( c, CommandQueue, QUEUE )
#define MARK_REF_COUNT_PROGRAM( c ) MARK_REF_COUNT_BASE( c, Program, PROGRAM )
#define TEST_REF_COUNT_PROGRAM( c ) TEST_REF_COUNT_BASE( c, Program, PROGRAM )
#define MARK_REF_COUNT_MEM( c ) MARK_REF_COUNT_BASE( c, MemObject, MEM )
#define TEST_REF_COUNT_MEM( c ) TEST_REF_COUNT_BASE( c, MemObject, MEM )
#endif // _ref_counting_h
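The MARK/TEST macro pair records a reference count, lets the test perform some operation, and then fails the test if the count has changed. The sketch below shows the same mark-then-compare pattern without OpenCL. `FakeObject`, `fakeGetRefCount`, the two `*_FAKE_*` macros and `check_unchanged` are all hypothetical names invented for illustration; the real macros call clGet*Info and the harness's test_error/log_error instead.

```cpp
#include <cstdio>

// Hypothetical stand-in for an OpenCL object with a queryable reference count.
struct FakeObject { unsigned refCount; };

// Hypothetical query function; returns 0 on success, like a CL_SUCCESS result.
static int fakeGetRefCount(const FakeObject &o, unsigned *out) { *out = o.refCount; return 0; }

// Declares <name>_refCount via token pasting and stores the current count in it.
#define MARK_FAKE_REF_COUNT( c ) \
    unsigned c##_refCount; \
    if( fakeGetRefCount( c, &c##_refCount ) != 0 ) return -1;

// Re-reads the count into <name>_refCount_new and returns -1 from the
// enclosing function if it differs from the marked value.
#define TEST_FAKE_REF_COUNT( c ) \
    unsigned c##_refCount_new; \
    if( fakeGetRefCount( c, &c##_refCount_new ) != 0 ) return -1; \
    if( c##_refCount != c##_refCount_new ) \
    { \
        std::printf( "ERROR: Reference count changed! (was %u, now %u)\n", c##_refCount, c##_refCount_new ); \
        return -1; \
    }

// Returns 0 when the count survives the simulated operation unchanged, -1 otherwise.
static int check_unchanged( FakeObject obj, bool leak )
{
    MARK_FAKE_REF_COUNT( obj )
    if( leak ) obj.refCount++;   // simulate an API call that leaks a reference
    TEST_FAKE_REF_COUNT( obj )
    return 0;
}
```

Because both macros declare variables and may return early, they can be used only once per object name per scope, and only inside a function that returns an int.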


@@ -1,175 +1,241 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "rounding_mode.h" #include "rounding_mode.h"
#if !(defined(_WIN32) && defined(_MSC_VER)) #if (defined( __arm__ ) || defined(__aarch64__))
RoundingMode set_round( RoundingMode r, Type outType ) #define FPSCR_FZ (1 << 24) // Flush-To-Zero mode
{ #define FPSCR_ROUND_MASK (3 << 22) // Rounding mode:
static const int flt_rounds[ kRoundingModeCount ] = { FE_TONEAREST, FE_TONEAREST, FE_UPWARD, FE_DOWNWARD, FE_TOWARDZERO };
static const int int_rounds[ kRoundingModeCount ] = { FE_TOWARDZERO, FE_TONEAREST, FE_UPWARD, FE_DOWNWARD, FE_TOWARDZERO }; #define _ARM_FE_FTZ 0x1000000
const int *p = int_rounds; #define _ARM_FE_NFTZ 0x0
if( outType == kfloat || outType == kdouble ) #if defined(__aarch64__)
p = flt_rounds; #define _FPU_GETCW(cw) __asm__ ("MRS %0,FPCR" : "=r" (cw))
int oldRound = fegetround(); #define _FPU_SETCW(cw) __asm__ ("MSR FPCR,%0" : :"ri" (cw))
fesetround( p[r] ); #else
#define _FPU_GETCW(cw) __asm__ ("VMRS %0,FPSCR" : "=r" (cw))
switch( oldRound ) #define _FPU_SETCW(cw) __asm__ ("VMSR FPSCR,%0" : :"ri" (cw))
{ #endif
case FE_TONEAREST: #endif
return kRoundToNearestEven;
case FE_UPWARD: #if (defined( __arm__ ) || defined(__aarch64__)) && defined( __GNUC__ )
return kRoundUp; #define _ARM_FE_TONEAREST 0x0
case FE_DOWNWARD: #define _ARM_FE_UPWARD 0x400000
return kRoundDown; #define _ARM_FE_DOWNWARD 0x800000
case FE_TOWARDZERO: #define _ARM_FE_TOWARDZERO 0xc00000
return kRoundTowardZero; RoundingMode set_round( RoundingMode r, Type outType )
default: {
abort(); // ??! static const int flt_rounds[ kRoundingModeCount ] = { _ARM_FE_TONEAREST,
} _ARM_FE_TONEAREST, _ARM_FE_UPWARD, _ARM_FE_DOWNWARD, _ARM_FE_TOWARDZERO };
return kDefaultRoundingMode; //never happens static const int int_rounds[ kRoundingModeCount ] = { _ARM_FE_TOWARDZERO,
} _ARM_FE_TONEAREST, _ARM_FE_UPWARD, _ARM_FE_DOWNWARD, _ARM_FE_TOWARDZERO };
const int *p = int_rounds;
RoundingMode get_round( void ) if( outType == kfloat || outType == kdouble )
{ p = flt_rounds;
int oldRound = fegetround();
int fpscr = 0;
switch( oldRound ) RoundingMode oldRound = get_round();
{
case FE_TONEAREST: _FPU_GETCW(fpscr);
return kRoundToNearestEven; _FPU_SETCW( p[r] | (fpscr & ~FPSCR_ROUND_MASK));
case FE_UPWARD:
return kRoundUp; return oldRound;
case FE_DOWNWARD: }
return kRoundDown;
case FE_TOWARDZERO: RoundingMode get_round( void )
return kRoundTowardZero; {
} int fpscr;
int oldRound;
return kDefaultRoundingMode;
} _FPU_GETCW(fpscr);
oldRound = (fpscr & FPSCR_ROUND_MASK);
#else
RoundingMode set_round( RoundingMode r, Type outType ) switch( oldRound )
{ {
static const int flt_rounds[ kRoundingModeCount ] = { _RC_NEAR, _RC_NEAR, _RC_UP, _RC_DOWN, _RC_CHOP }; case _ARM_FE_TONEAREST:
static const int int_rounds[ kRoundingModeCount ] = { _RC_CHOP, _RC_NEAR, _RC_UP, _RC_DOWN, _RC_CHOP }; return kRoundToNearestEven;
const int *p = ( outType == kfloat || outType == kdouble )? flt_rounds : int_rounds; case _ARM_FE_UPWARD:
unsigned int oldRound; return kRoundUp;
case _ARM_FE_DOWNWARD:
int err = _controlfp_s(&oldRound, 0, 0); //get rounding mode into oldRound return kRoundDown;
if (err) { case _ARM_FE_TOWARDZERO:
vlog_error("\t\tERROR: -- cannot get rounding mode in %s:%d\n", __FILE__, __LINE__); return kRoundTowardZero;
return kDefaultRoundingMode; //what else never happens }
}
return kDefaultRoundingMode;
oldRound &= _MCW_RC; }
RoundingMode old = #elif !(defined(_WIN32) && defined(_MSC_VER))
(oldRound == _RC_NEAR)? kRoundToNearestEven : RoundingMode set_round( RoundingMode r, Type outType )
(oldRound == _RC_UP)? kRoundUp : {
(oldRound == _RC_DOWN)? kRoundDown : static const int flt_rounds[ kRoundingModeCount ] = { FE_TONEAREST, FE_TONEAREST, FE_UPWARD, FE_DOWNWARD, FE_TOWARDZERO };
(oldRound == _RC_CHOP)? kRoundTowardZero: static const int int_rounds[ kRoundingModeCount ] = { FE_TOWARDZERO, FE_TONEAREST, FE_UPWARD, FE_DOWNWARD, FE_TOWARDZERO };
kDefaultRoundingMode; const int *p = int_rounds;
if( outType == kfloat || outType == kdouble )
_controlfp_s(&oldRound, p[r], _MCW_RC); //setting new rounding mode p = flt_rounds;
return old; //returning old rounding mode int oldRound = fegetround();
} fesetround( p[r] );
RoundingMode get_round( void ) switch( oldRound )
{ {
unsigned int oldRound; case FE_TONEAREST:
return kRoundToNearestEven;
int err = _controlfp_s(&oldRound, 0, 0); //get rounding mode into oldRound case FE_UPWARD:
oldRound &= _MCW_RC; return kRoundUp;
return case FE_DOWNWARD:
(oldRound == _RC_NEAR)? kRoundToNearestEven : return kRoundDown;
(oldRound == _RC_UP)? kRoundUp : case FE_TOWARDZERO:
(oldRound == _RC_DOWN)? kRoundDown : return kRoundTowardZero;
(oldRound == _RC_CHOP)? kRoundTowardZero: default:
kDefaultRoundingMode; abort(); // ??!
} }
return kDefaultRoundingMode; //never happens
#endif }
// RoundingMode get_round( void )
// FlushToZero() sets the host processor into ftz mode. It is intended to have a remote effect on the behavior of the code in {
// basic_test_conversions.c. Some host processors may not support this mode, which case you'll need to do some clamping in int oldRound = fegetround();
// software by testing against FLT_MIN or DBL_MIN in that file.
// switch( oldRound )
// Note: IEEE-754 says conversions are basic operations. As such they do *NOT* have the behavior in section 7.5.3 of {
// the OpenCL spec. They *ALWAYS* flush to zero for subnormal inputs or outputs when FTZ mode is on like other basic case FE_TONEAREST:
// operators do (e.g. add, subtract, multiply, divide, etc.) return kRoundToNearestEven;
// case FE_UPWARD:
// Configuring hardware to FTZ mode varies by platform. return kRoundUp;
// CAUTION: Some C implementations may also fail to behave properly in this mode. case FE_DOWNWARD:
// return kRoundDown;
// On PowerPC, it is done by setting the FPSCR into non-IEEE mode. case FE_TOWARDZERO:
// On Intel, you can do this by turning on the FZ and DAZ bits in the MXCSR -- provided that SSE/SSE2 return kRoundTowardZero;
// is used for floating point computation! If your OS uses x87, you'll need to figure out how }
// to turn that off for the conversions code in basic_test_conversions.c so that they flush to
// zero properly. Otherwise, you'll need to add appropriate software clamping to basic_test_conversions.c return kDefaultRoundingMode;
// in which case, these function are at liberty to do nothing. }
//
#if defined( __i386__ ) || defined( __x86_64__ ) || defined (_WIN32) #else
#include <xmmintrin.h> RoundingMode set_round( RoundingMode r, Type outType )
#elif defined( __PPC__ ) {
#include <fpu_control.h> static const int flt_rounds[ kRoundingModeCount ] = { _RC_NEAR, _RC_NEAR, _RC_UP, _RC_DOWN, _RC_CHOP };
#endif static const int int_rounds[ kRoundingModeCount ] = { _RC_CHOP, _RC_NEAR, _RC_UP, _RC_DOWN, _RC_CHOP };
void *FlushToZero( void ) const int *p = ( outType == kfloat || outType == kdouble )? flt_rounds : int_rounds;
{ unsigned int oldRound;
#if defined( __APPLE__ ) || defined(__linux__) || defined (_WIN32)
#if defined( __i386__ ) || defined( __x86_64__ ) || defined(_MSC_VER) int err = _controlfp_s(&oldRound, 0, 0); //get rounding mode into oldRound
union{ int i; void *p; }u = { _mm_getcsr() }; if (err) {
_mm_setcsr( u.i | 0x8040 ); vlog_error("\t\tERROR: -- cannot get rounding mode in %s:%d\n", __FILE__, __LINE__);
return u.p; return kDefaultRoundingMode; //what else never happens
#elif defined( __arm__ ) }
// processor is already in FTZ mode -- do nothing
return NULL; oldRound &= _MCW_RC;
#elif defined( __PPC__ )
fpu_control_t flags = 0; RoundingMode old =
_FPU_GETCW(flags); (oldRound == _RC_NEAR)? kRoundToNearestEven :
flags |= _FPU_MASK_NI; (oldRound == _RC_UP)? kRoundUp :
_FPU_SETCW(flags); (oldRound == _RC_DOWN)? kRoundDown :
return NULL; (oldRound == _RC_CHOP)? kRoundTowardZero:
#else kDefaultRoundingMode;
#error Unknown arch
#endif _controlfp_s(&oldRound, p[r], _MCW_RC); //setting new rounding mode
#else return old; //returning old rounding mode
#error Please configure FlushToZero and UnFlushToZero to behave properly on this operating system. }
#endif
} RoundingMode get_round( void )
{
// Undo the effects of FlushToZero above, restoring the host to default behavior, using the information passed in p. unsigned int oldRound;
void UnFlushToZero( void *p)
{ int err = _controlfp_s(&oldRound, 0, 0); //get rounding mode into oldRound
#if defined( __APPLE__ ) || defined(__linux__) || defined (_WIN32) oldRound &= _MCW_RC;
#if defined( __i386__ ) || defined( __x86_64__ ) || defined(_MSC_VER) return
union{ void *p; int i; }u = { p }; (oldRound == _RC_NEAR)? kRoundToNearestEven :
_mm_setcsr( u.i ); (oldRound == _RC_UP)? kRoundUp :
#elif defined( __arm__ ) (oldRound == _RC_DOWN)? kRoundDown :
// processor is already in FTZ mode -- do nothing (oldRound == _RC_CHOP)? kRoundTowardZero:
#elif defined( __PPC__) kDefaultRoundingMode;
fpu_control_t flags = 0; }
_FPU_GETCW(flags);
flags &= ~_FPU_MASK_NI; #endif
_FPU_SETCW(flags);
#else //
#error Unknown arch // FlushToZero() sets the host processor into ftz mode. It is intended to have a remote effect on the behavior of the code in
#endif // basic_test_conversions.c. Some host processors may not support this mode, which case you'll need to do some clamping in
#else // software by testing against FLT_MIN or DBL_MIN in that file.
#error Please configure FlushToZero and UnFlushToZero to behave properly on this operating system. //
#endif // Note: IEEE-754 says conversions are basic operations. As such they do *NOT* have the behavior in section 7.5.3 of
} // the OpenCL spec. They *ALWAYS* flush to zero for subnormal inputs or outputs when FTZ mode is on like other basic
// operators do (e.g. add, subtract, multiply, divide, etc.)
//
// Configuring hardware to FTZ mode varies by platform.
// CAUTION: Some C implementations may also fail to behave properly in this mode.
//
// On PowerPC, it is done by setting the FPSCR into non-IEEE mode.
// On Intel, you can do this by turning on the FZ and DAZ bits in the MXCSR -- provided that SSE/SSE2
// is used for floating point computation! If your OS uses x87, you'll need to figure out how
// to turn that off for the conversions code in basic_test_conversions.c so that they flush to
// zero properly. Otherwise, you'll need to add appropriate software clamping to basic_test_conversions.c
// in which case, these functions are at liberty to do nothing.
//
#if defined( __i386__ ) || defined( __x86_64__ ) || defined (_WIN32)
#include <xmmintrin.h>
#elif defined( __PPC__ )
#include <fpu_control.h>
#endif
void *FlushToZero( void )
{
#if defined( __APPLE__ ) || defined(__linux__) || defined (_WIN32)
#if defined( __i386__ ) || defined( __x86_64__ ) || defined(_MSC_VER)
union{ int i; void *p; }u = { _mm_getcsr() };
_mm_setcsr( u.i | 0x8040 );
return u.p;
#elif defined( __arm__ ) || defined(__aarch64__)
int fpscr;
_FPU_GETCW(fpscr);
_FPU_SETCW(fpscr | FPSCR_FZ);
return NULL;
#elif defined( __PPC__ )
fpu_control_t flags = 0;
_FPU_GETCW(flags);
flags |= _FPU_MASK_NI;
_FPU_SETCW(flags);
return NULL;
#else
#error Unknown arch
#endif
#else
#error Please configure FlushToZero and UnFlushToZero to behave properly on this operating system.
#endif
}
// Undo the effects of FlushToZero above, restoring the host to default behavior, using the information passed in p.
void UnFlushToZero( void *p)
{
#if defined( __APPLE__ ) || defined(__linux__) || defined (_WIN32)
#if defined( __i386__ ) || defined( __x86_64__ ) || defined(_MSC_VER)
union{ void *p; int i; }u = { p };
_mm_setcsr( u.i );
#elif defined( __arm__ ) || defined(__aarch64__)
int fpscr;
_FPU_GETCW(fpscr);
_FPU_SETCW(fpscr & ~FPSCR_FZ);
#elif defined( __PPC__)
fpu_control_t flags = 0;
_FPU_GETCW(flags);
flags &= ~_FPU_MASK_NI;
_FPU_SETCW(flags);
#else
#error Unknown arch
#endif
#else
#error Please configure FlushToZero and UnFlushToZero to behave properly on this operating system.
#endif
}
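The ARM branches above change one field of the floating-point control register while preserving the others: set_round masks out the two rounding-mode bits before OR-ing in the new mode, and FlushToZero/UnFlushToZero set or clear the FZ bit. The bit arithmetic can be checked on plain integers. This is a sketch only: `with_rounding` and `with_ftz` are hypothetical helpers, and the constants repeat the values of FPSCR_ROUND_MASK and FPSCR_FZ from the defines in this file.

```cpp
// Same values as FPSCR_ROUND_MASK (3 << 22) and FPSCR_FZ (1 << 24) in the file above.
static const unsigned int kRoundMask   = 3u << 22;
static const unsigned int kFlushToZero = 1u << 24;

// Hypothetical helper: replace only the rounding-mode field of a control word,
// leaving every other bit (including FZ) as it was.
static unsigned int with_rounding(unsigned int fpscr, unsigned int mode)
{
    return (fpscr & ~kRoundMask) | (mode & kRoundMask);
}

// Hypothetical helper: set or clear only the flush-to-zero bit.
static unsigned int with_ftz(unsigned int fpscr, bool enable)
{
    return enable ? (fpscr | kFlushToZero) : (fpscr & ~kFlushToZero);
}
```

Note that the ARM branch of UnFlushToZero clears FZ unconditionally rather than restoring a saved value, since FlushToZero returns NULL on that path; a caller that starts with FZ already set will therefore end with it cleared.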


@@ -1,73 +1,69 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef __ROUNDING_MODE_H__ #ifndef __ROUNDING_MODE_H__
#define __ROUNDING_MODE_H__ #define __ROUNDING_MODE_H__
#include <stdlib.h> #include "compat.h"
#if (defined(_WIN32) && defined (_MSC_VER)) #if (defined(_WIN32) && defined (_MSC_VER))
// need for _controlfp_s and rouinding modes in RoundingMode #include "errorHelpers.h"
#include <float.h> #include "testHarness.h"
#include "errorHelpers.h" #endif
#include "testHarness.h"
#else typedef enum
#include <fenv.h> {
#endif kDefaultRoundingMode = 0,
kRoundToNearestEven,
typedef enum kRoundUp,
{ kRoundDown,
kDefaultRoundingMode = 0, kRoundTowardZero,
kRoundToNearestEven,
kRoundUp, kRoundingModeCount
kRoundDown, }RoundingMode;
kRoundTowardZero,
typedef enum
kRoundingModeCount {
}RoundingMode; kuchar = 0,
kchar = 1,
typedef enum kushort = 2,
{ kshort = 3,
kuchar = 0, kuint = 4,
kchar = 1, kint = 5,
kushort = 2, kfloat = 6,
kshort = 3, kdouble = 7,
kuint = 4, kulong = 8,
kint = 5, klong = 9,
kfloat = 6,
kdouble = 7, //This goes last
kulong = 8, kTypeCount
klong = 9, }Type;
//This goes last #ifdef __cplusplus
kTypeCount extern "C" {
}Type; #endif
#ifdef __cplusplus extern RoundingMode set_round( RoundingMode r, Type outType );
extern "C" { extern RoundingMode get_round( void );
#endif extern void *FlushToZero( void );
extern void UnFlushToZero( void *p);
extern RoundingMode set_round( RoundingMode r, Type outType );
extern RoundingMode get_round( void ); #ifdef __cplusplus
extern void *FlushToZero( void ); }
extern void UnFlushToZero( void *p); #endif
#ifdef __cplusplus
}
#endif #endif /* __ROUNDING_MODE_H__ */
#endif /* __ROUNDING_MODE_H__ */
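set_round returns the previous mode so that callers can restore it afterwards. For C++ callers on platforms that provide the standard fenv functions, the same save-and-restore idea can be sketched as a scope guard. `ScopedRounding` is a hypothetical class, not part of this header, and it works with the raw FE_* constants rather than the RoundingMode enum.

```cpp
#include <cfenv>

// Hypothetical RAII guard: switches the rounding mode on construction and
// restores the previously active mode when the scope ends.
class ScopedRounding
{
public:
    explicit ScopedRounding(int mode) : saved_(std::fegetround()) { std::fesetround(mode); }
    ~ScopedRounding() { std::fesetround(saved_); }

private:
    int saved_;
    ScopedRounding(const ScopedRounding &);            // non-copyable
    ScopedRounding &operator=(const ScopedRounding &); // non-assignable
};
```

The guard restores the mode on every exit path, including early returns, which the manual set_round(old, type) call at the end of a test does not.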

File diff suppressed because it is too large


@@ -1,104 +1,104 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#ifndef _testHarness_h
#define _testHarness_h
#include "threadTesting.h"
#include "clImageHelper.h"
#ifdef __cplusplus
extern "C" {
#endif
extern cl_uint gReSeed;
extern cl_uint gRandomSeed;
// Supply a list of functions to test here. This will allocate a CL device, create a context, all that
// setup work, and then call each function in turn as dictated by the passed arguments.
extern int runTestHarness( int argc, const char *argv[], unsigned int num_fns,
basefn fnList[], const char *fnNames[],
int imageSupportRequired, int forceNoContextCreation, cl_command_queue_properties queueProps );
// Device checking function. See runTestHarnessWithCheck. If this function returns anything other than CL_SUCCESS (0), the harness exits.
typedef int (*DeviceCheckFn)( cl_device_id device );
// Same as runTestHarness, but also supplies a function that checks the created device for required functionality.
extern int runTestHarnessWithCheck( int argc, const char *argv[], unsigned int num_fns,
basefn fnList[], const char *fnNames[],
int imageSupportRequired, int forceNoContextCreation, cl_command_queue_properties queueProps, DeviceCheckFn deviceCheckFn );
// The command line parser used by runTestHarness to break up parameters into calls to callTestFunctions
extern int parseAndCallCommandLineTests( int argc, const char *argv[], cl_device_id device, unsigned int num_fns,
basefn *fnList, const char *fnNames[],
int forceNoContextCreation, cl_command_queue_properties queueProps, int num_elements );
// Call this function if you need to do all the setup work yourself, and just need the function list called/
// managed.
// functionIndexToCall can be a valid index into the function list, or -1 to run all of them.
// partialName can be a string to partially match function names against and only execute functions that
// match, or NULL to not restrict execution (ignored if functionIndexToCall is not -1)
// functionList is the actual array of functions
// numFunctions is the number of functions in the list (which should NOT have NULL at the end for "all")
// functionNames is an array of strings representing the name of each function, to be used in partial matching
// contextProps are used to create a testing context for each test
// deviceToUse, deviceGroupToUse and numElementsToUse are all just passed to each test function
extern int callTestFunctions( basefn functionList[], int numFunctions,
const char *functionNames[],
cl_device_id deviceToUse, int forceNoContextCreation,
int numElementsToUse,
int functionIndexToCall, const char *partialName, cl_command_queue_properties queueProps );
// This function is called by callTestFunctions, once per function, to do setup, call, logging and cleanup
extern int callSingleTestFunction( basefn functionToCall, const char *functionName,
cl_device_id deviceToUse, int forceNoContextCreation,
int numElementsToUse, cl_command_queue_properties queueProps );
///// Miscellaneous steps
// Given a pre-existing device type choice, check the environment for an override, then print what
// choice was made and how (and return the overridden choice, if there is one)
extern void checkDeviceTypeOverride( cl_device_type *inOutType );
// standard callback function for context pfn_notify
extern void CL_CALLBACK notify_callback(const char *errinfo, const void *private_info, size_t cb, void *user_data);
extern cl_device_type GetDeviceType( cl_device_id );
// Given a device (most likely passed in by the harness, but not required), will attempt to find
// a DIFFERENT device and return it. Useful for finding another device to run multi-device tests against.
// Note that returning NULL means an error was hit, but if no error was hit and the device passed in
// is the only device available, the SAME device is returned, so check!
extern cl_device_id GetOpposingDevice( cl_device_id device );
extern int gFlushDenormsToZero; // This is set to 1 if the device does not support denorms (CL_FP_DENORM) extern int gFlushDenormsToZero; // This is set to 1 if the device does not support denorms (CL_FP_DENORM)
extern int gInfNanSupport; // This is set to 1 if the device supports infinities and NaNs extern int gInfNanSupport; // This is set to 1 if the device supports infinities and NaNs
extern int gIsEmbedded; // This is set to 1 if the device is an embedded device extern int gIsEmbedded; // This is set to 1 if the device is an embedded device
extern int gHasLong; // This is set to 1 if the device supports long and ulong types in OpenCL C. extern int gHasLong; // This is set to 1 if the device supports long and ulong types in OpenCL C.
extern int gIsOpenCL_C_1_0_Device; // This is set to 1 if the device supports only OpenCL C 1.0. extern int gIsOpenCL_C_1_0_Device; // This is set to 1 if the device supports only OpenCL C 1.0.
#if ! defined( __APPLE__ ) #if ! defined( __APPLE__ )
void memset_pattern4(void *, const void *, size_t); void memset_pattern4(void *, const void *, size_t);
#endif #endif
#ifdef __cplusplus #ifdef __cplusplus
} }
#endif #endif
#endif // _testHarness_h #endif // _testHarness_h
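The comments on callTestFunctions describe two selection rules: an explicit index, and a partial name that is ignored when the index is not -1. The following is a minimal sketch of that selection logic only, assuming plain substring matching via strstr. The helper names should_run and count_selected are hypothetical and are not part of the harness; the real implementation is not shown in this diff.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: decide whether the test at position i should run.
 * index_to_call == -1 means "no explicit index"; in that case partial_name
 * (if non-NULL) restricts execution to names containing that substring. */
static int should_run(int i, const char *name,
                      int index_to_call, const char *partial_name)
{
    if (index_to_call != -1)
        return i == index_to_call;      /* partial_name is ignored here */
    if (partial_name == NULL)
        return 1;                       /* no restriction: run everything */
    return strstr(name, partial_name) != NULL;
}

/* Hypothetical helper: count how many tests the rules above would select. */
int count_selected(const char *names[], int num_names,
                   int index_to_call, const char *partial_name)
{
    int i, selected = 0;
    for (i = 0; i < num_names; i++)
        if (should_run(i, names[i], index_to_call, partial_name))
            selected++;
    return selected;
}
```

With the names "buffer_read", "buffer_write" and "image_read", passing -1 and "buffer" selects two tests, while passing index 2 selects only "image_read" regardless of the partial name.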


@@ -1,51 +1,51 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "mt19937.h" #include "mt19937.h"
#include <stdio.h> #include <stdio.h>
int main( void ) int main( void )
{ {
MTdata d = init_genrand(42); MTdata d = init_genrand(42);
int i; int i;
const cl_uint reference[16] = { 0x5fe1dc66, 0x8b255210, 0x0380b0c8, 0xc87d2ce4, const cl_uint reference[16] = { 0x5fe1dc66, 0x8b255210, 0x0380b0c8, 0xc87d2ce4,
0x55c31f24, 0x8bcd21ab, 0x14d5fef5, 0x9416d2b6, 0x55c31f24, 0x8bcd21ab, 0x14d5fef5, 0x9416d2b6,
0xdf875de9, 0x00517d76, 0xd861c944, 0xa7676404, 0xdf875de9, 0x00517d76, 0xd861c944, 0xa7676404,
0x5491aff4, 0x67616209, 0xc368b3fb, 0x929dfc92 }; 0x5491aff4, 0x67616209, 0xc368b3fb, 0x929dfc92 };
int errcount = 0; int errcount = 0;
for( i = 0; i < 65536; i++ ) for( i = 0; i < 65536; i++ )
{ {
cl_uint u = genrand_int32( d ); cl_uint u = genrand_int32( d );
if( 0 == (i & 4095) ) if( 0 == (i & 4095) )
{ {
if( u != reference[i>>12] ) if( u != reference[i>>12] )
{ {
printf("ERROR: expected *0x%8.8x at %d. Got 0x%8.8x\n", reference[i>>12], i, u ); printf("ERROR: expected *0x%8.8x at %d. Got 0x%8.8x\n", reference[i>>12], i, u );
errcount++; errcount++;
} }
} }
} }
free_mtdata(d); free_mtdata(d);
if( errcount ) if( errcount )
printf("mt19937 test failed.\n"); printf("mt19937 test failed.\n");
else else
printf("mt19937 test passed.\n"); printf("mt19937 test passed.\n");
return 0; return 0;
} }
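The test above draws 65536 values and compares only every 4096th one against the 16-entry reference table. As a small illustration of that indexing (not of the generator itself), the sketch below isolates the two expressions used in the loop. The function names are hypothetical.

```c
/* Hypothetical helpers mirroring the index arithmetic in the loop above. */

/* Non-zero when draw number i is one of the sampled checkpoints
 * (i is a multiple of 4096, because the low 12 bits are all zero). */
int is_checkpoint(int i)
{
    return 0 == (i & 4095);
}

/* Slot in the 16-entry reference table for a checkpoint draw. */
int checkpoint_slot(int i)
{
    return i >> 12;
}

/* Number of checkpoints hit when iterating i over [0, total_draws). */
int count_checkpoints(int total_draws)
{
    int i, count = 0;
    for (i = 0; i < total_draws; i++)
        if (is_checkpoint(i))
            count++;
    return count;
}
```

For 65536 draws this gives 16 checkpoints, at i = 0, 4096, ..., 61440, which matches the size of the reference array.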


@@ -1,106 +1,100 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "threadTesting.h" #include "compat.h"
#include "errorHelpers.h" #include "threadTesting.h"
#include <stdio.h> #include "errorHelpers.h"
#include <stdlib.h> #include <stdio.h>
#include <string.h>
#if !defined(_WIN32)
#include <stdbool.h> #if !defined(_WIN32)
#endif #include <pthread.h>
#endif
#include <math.h>
#include <string.h> #if 0 // Disabled for now
#if !defined(_WIN32) typedef struct
#include <pthread.h> {
#endif basefn mFunction;
cl_device_id mDevice;
#if 0 // Disabled for now
int mNumElements;
typedef struct } TestFnArgs;
{
basefn mFunction; ////////////////////////////////////////////////////////////////////////////////
cl_device_id mDevice; // Thread-based testing. Spawns a new thread to run the given test function,
cl_context mContext; // then waits for it to complete. The entire idea is that, if the thread crashes,
int mNumElements; // we can catch it and report it as a failure instead of crashing the entire suite
} TestFnArgs; ////////////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////////// void *test_thread_wrapper( void *data )
// Thread-based testing. Spawns a new thread to run the given test function, {
// then waits for it to complete. The entire idea is that, if the thread crashes, TestFnArgs *args;
// we can catch it and report it as a failure instead of crashing the entire suite int retVal;
//////////////////////////////////////////////////////////////////////////////// cl_context context;
void *test_thread_wrapper( void *data ) args = (TestFnArgs *)data;
{
TestFnArgs *args; /* Create a new context to use (contexts can't cross threads) */
int retVal; context = clCreateContext(NULL, args->mDeviceGroup);
cl_context context; if( context == NULL )
{
args = (TestFnArgs *)data; log_error("clCreateContext failed for new thread\n");
return (void *)(-1);
/* Create a new context to use (contexts can't cross threads) */ }
context = clCreateContext(NULL, args->mDeviceGroup);
if( context == NULL ) /* Call function */
{ retVal = args->mFunction( args->mDeviceGroup, args->mDevice, context, args->mNumElements );
log_error("clCreateContext failed for new thread\n");
return (void *)(-1); clReleaseContext( context );
}
return (void *)retVal;
/* Call function */ }
retVal = args->mFunction( args->mDeviceGroup, args->mDevice, context, args->mNumElements );
int test_threaded_function( basefn fnToTest, cl_device_id device, cl_context context, cl_command_queue queue, int numElements )
clReleaseContext( context ); {
int error;
return (void *)retVal; pthread_t threadHdl;
} void *retVal;
TestFnArgs args;
int test_threaded_function( basefn fnToTest, cl_device_id device, cl_context context, cl_command_queue queue, int numElements )
{
int error; args.mFunction = fnToTest;
pthread_t threadHdl; args.mDeviceGroup = deviceGroup;
void *retVal; args.mDevice = device;
TestFnArgs args; args.mContext = context;
args.mNumElements = numElements;
args.mFunction = fnToTest;
args.mDeviceGroup = deviceGroup; error = pthread_create( &threadHdl, NULL, test_thread_wrapper, (void *)&args );
args.mDevice = device; if( error != 0 )
args.mContext = context; {
args.mNumElements = numElements; log_error( "ERROR: Unable to create thread for testing!\n" );
return -1;
}
error = pthread_create( &threadHdl, NULL, test_thread_wrapper, (void *)&args );
if( error != 0 ) /* Thread has been started, now just wait for it to complete (or crash) */
{ error = pthread_join( threadHdl, &retVal );
log_error( "ERROR: Unable to create thread for testing!\n" ); if( error != 0 )
return -1; {
} log_error( "ERROR: Unable to join testing thread!\n" );
return -1;
/* Thread has been started, now just wait for it to complete (or crash) */ }
error = pthread_join( threadHdl, &retVal );
if( error != 0 ) return (int)((intptr_t)retVal);
{ }
log_error( "ERROR: Unable to join testing thread!\n" ); #endif
return -1;
}
return (int)((intptr_t)retVal);
}
#endif
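The disabled code above hands an int result back from the worker thread through the void * return value of pthread_join. The sketch below shows that hand-off in isolation, without any OpenCL objects. All names (test_fn, thread_args, run_in_thread, sample_test) are hypothetical. Note that this pattern only transports a return value: a fatal signal such as a segmentation fault in the worker normally terminates the whole process, so on its own it does not isolate crashes.

```c
#include <pthread.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, simplified test signature (no OpenCL arguments). */
typedef int (*test_fn)(int num_elements);

typedef struct {
    test_fn fn;
    int num_elements;
} thread_args;

/* Thread entry point: run the test and smuggle the int result
 * through the void * return value via intptr_t. */
static void *thread_wrapper(void *data)
{
    thread_args *args = (thread_args *)data;
    int result = args->fn(args->num_elements);
    return (void *)(intptr_t)result;
}

/* Run fn on a new thread and wait for it; returns the test's result,
 * or -1 if the thread could not be created or joined. */
int run_in_thread(test_fn fn, int num_elements)
{
    pthread_t handle;
    void *ret = NULL;
    thread_args args;

    args.fn = fn;
    args.num_elements = num_elements;

    if (pthread_create(&handle, NULL, thread_wrapper, &args) != 0)
        return -1;
    if (pthread_join(handle, &ret) != 0)
        return -1;
    return (int)(intptr_t)ret;
}

/* Hypothetical sample test: passes (0) for a positive element count. */
static int sample_test(int num_elements)
{
    return num_elements > 0 ? 0 : -1;
}
```

Because args lives on the caller's stack, run_in_thread must join before returning, which it does. Depending on the toolchain, linking may require the -pthread flag.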


@@ -1,32 +1,32 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _threadTesting_h #ifndef _threadTesting_h
#define _threadTesting_h #define _threadTesting_h
#ifdef __APPLE__ #ifdef __APPLE__
#include <OpenCL/opencl.h> #include <OpenCL/opencl.h>
#else #else
#include <CL/opencl.h> #include <CL/opencl.h>
#endif #endif
#define TEST_NOT_IMPLEMENTED -99 #define TEST_NOT_IMPLEMENTED -99
typedef int (*basefn)(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); typedef int (*basefn)(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_threaded_function( basefn fnToTest, cl_device_id device, cl_context context, cl_command_queue queue, int numElements ); extern int test_threaded_function( basefn fnToTest, cl_device_id device, cl_context context, cl_command_queue queue, int numElements );
#endif // _threadTesting_h #endif // _threadTesting_h


@@ -1,481 +1,481 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "typeWrappers.h" #include "typeWrappers.h"
#include "kernelHelpers.h" #include "kernelHelpers.h"
#include "errorHelpers.h" #include "errorHelpers.h"
#include <stdlib.h> #include <stdlib.h>
#include "clImageHelper.h" #include "clImageHelper.h"
#define ROUND_SIZE_UP( _size, _align ) (((size_t)(_size) + (size_t)(_align) - 1) & -((size_t)(_align))) #define ROUND_SIZE_UP( _size, _align ) (((size_t)(_size) + (size_t)(_align) - 1) & -((size_t)(_align)))
#if defined( __APPLE__ ) #if defined( __APPLE__ )
#define kPageSize 4096 #define kPageSize 4096
#include <sys/mman.h> #include <sys/mman.h>
#include <stdlib.h> #include <stdlib.h>
#elif defined(__linux__) #elif defined(__linux__)
#include <unistd.h> #include <unistd.h>
#define kPageSize (getpagesize()) #define kPageSize (getpagesize())
#endif #endif
clProtectedImage::clProtectedImage( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, cl_int *errcode_ret ) clProtectedImage::clProtectedImage( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, cl_int *errcode_ret )
{ {
cl_int err = Create( context, mem_flags, fmt, width ); cl_int err = Create( context, mem_flags, fmt, width );
if( errcode_ret != NULL ) if( errcode_ret != NULL )
*errcode_ret = err; *errcode_ret = err;
} }
cl_int clProtectedImage::Create( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width ) cl_int clProtectedImage::Create( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width )
{ {
cl_int error; cl_int error;
#if defined( __APPLE__ ) #if defined( __APPLE__ )
int protect_pages = 1; int protect_pages = 1;
cl_device_id devices[16]; cl_device_id devices[16];
size_t number_of_devices; size_t number_of_devices;
error = clGetContextInfo(context, CL_CONTEXT_DEVICES, sizeof(devices), devices, &number_of_devices); error = clGetContextInfo(context, CL_CONTEXT_DEVICES, sizeof(devices), devices, &number_of_devices);
test_error(error, "clGetContextInfo for CL_CONTEXT_DEVICES failed"); test_error(error, "clGetContextInfo for CL_CONTEXT_DEVICES failed");
number_of_devices /= sizeof(cl_device_id); number_of_devices /= sizeof(cl_device_id);
for (int i=0; i<(int)number_of_devices; i++) { for (int i=0; i<(int)number_of_devices; i++) {
cl_device_type type; cl_device_type type;
error = clGetDeviceInfo(devices[i], CL_DEVICE_TYPE, sizeof(type), &type, NULL); error = clGetDeviceInfo(devices[i], CL_DEVICE_TYPE, sizeof(type), &type, NULL);
test_error(error, "clGetDeviceInfo for CL_DEVICE_TYPE failed"); test_error(error, "clGetDeviceInfo for CL_DEVICE_TYPE failed");
if (type == CL_DEVICE_TYPE_GPU) { if (type == CL_DEVICE_TYPE_GPU) {
protect_pages = 0; protect_pages = 0;
break; break;
} }
} }
if (protect_pages) { if (protect_pages) {
size_t pixelBytes = get_pixel_bytes(fmt); size_t pixelBytes = get_pixel_bytes(fmt);
size_t rowBytes = ROUND_SIZE_UP( width * pixelBytes, kPageSize ); size_t rowBytes = ROUND_SIZE_UP( width * pixelBytes, kPageSize );
size_t rowStride = rowBytes + kPageSize; size_t rowStride = rowBytes + kPageSize;
// create backing store // create backing store
backingStoreSize = rowStride + 8 * rowStride; backingStoreSize = rowStride + 8 * rowStride;
backingStore = mmap(0, backingStoreSize, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, 0, 0); backingStore = mmap(0, backingStoreSize, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, 0, 0);
// add guard pages // add guard pages
size_t row; size_t row;
char *p = (char*) backingStore; char *p = (char*) backingStore;
char *imagePtr = (char*) backingStore + 4 * rowStride; char *imagePtr = (char*) backingStore + 4 * rowStride;
for( row = 0; row < 4; row++ ) for( row = 0; row < 4; row++ )
{ {
mprotect( p, rowStride, PROT_NONE ); p += rowStride; mprotect( p, rowStride, PROT_NONE ); p += rowStride;
} }
p += rowBytes; p += rowBytes;
mprotect( p, kPageSize, PROT_NONE ); p += rowStride; mprotect( p, kPageSize, PROT_NONE ); p += rowStride;
p -= rowBytes; p -= rowBytes;
for( row = 0; row < 4; row++ ) for( row = 0; row < 4; row++ )
{ {
mprotect( p, rowStride, PROT_NONE ); p += rowStride; mprotect( p, rowStride, PROT_NONE ); p += rowStride;
} }
if( getenv( "CL_ALIGN_RIGHT" ) ) if( getenv( "CL_ALIGN_RIGHT" ) )
{ {
static int spewEnv = 1; static int spewEnv = 1;
if(spewEnv) if(spewEnv)
{ {
log_info( "***CL_ALIGN_RIGHT is set. Aligning images at right edge of page\n" ); log_info( "***CL_ALIGN_RIGHT is set. Aligning images at right edge of page\n" );
spewEnv = 0; spewEnv = 0;
} }
imagePtr += rowBytes - pixelBytes * width; imagePtr += rowBytes - pixelBytes * width;
} }
image = create_image_1d( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, rowStride, imagePtr, NULL, &error ); image = create_image_1d( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, rowStride, imagePtr, NULL, &error );
} else { } else {
backingStore = NULL; backingStore = NULL;
image = create_image_1d( context, mem_flags, fmt, width, 0, NULL, NULL, &error ); image = create_image_1d( context, mem_flags, fmt, width, 0, NULL, NULL, &error );
} }
#else #else
backingStore = NULL; backingStore = NULL;
image = create_image_1d( context, mem_flags, fmt, width, 0, NULL, NULL, &error ); image = create_image_1d( context, mem_flags, fmt, width, 0, NULL, NULL, &error );
#endif #endif
return error; return error;
} }
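The 1D Create path above derives three sizes from the image width: the row rounded up to a page, the row plus one guard page, and a backing store holding one image row plus eight guard rows. The sketch below reproduces only that arithmetic so the numbers can be checked by hand. The struct and function names are hypothetical, the page size is passed in explicitly instead of using kPageSize, and no memory is mapped or protected.

```c
#include <stddef.h>

/* Same rounding macro as in the file above; _align must be a power of two. */
#define ROUND_SIZE_UP( _size, _align ) \
    (((size_t)(_size) + (size_t)(_align) - 1) & -((size_t)(_align)))

/* Hypothetical container for the derived sizes. */
typedef struct {
    size_t row_bytes;          /* row rounded up to a whole number of pages */
    size_t row_stride;         /* row plus one trailing guard page */
    size_t backing_store_size; /* one image row plus 8 guard rows */
    size_t right_align_offset; /* shift applied when CL_ALIGN_RIGHT is set */
} guard_layout;

/* Hypothetical helper mirroring the size computations of the 1D case. */
guard_layout compute_layout_1d(size_t width, size_t pixel_bytes, size_t page_size)
{
    guard_layout l;
    l.row_bytes = ROUND_SIZE_UP(width * pixel_bytes, page_size);
    l.row_stride = l.row_bytes + page_size;
    l.backing_store_size = l.row_stride + 8 * l.row_stride;
    l.right_align_offset = l.row_bytes - pixel_bytes * width;
    return l;
}
```

For a width of 100 pixels at 4 bytes each and a 4096-byte page, the row rounds up from 400 to 4096 bytes, the stride is 8192 bytes, the backing store is 73728 bytes, and the right-alignment offset is 3696 bytes, which places the last pixel directly against the guard page.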
clProtectedImage::clProtectedImage( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height, cl_int *errcode_ret ) clProtectedImage::clProtectedImage( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height, cl_int *errcode_ret )
{ {
cl_int err = Create( context, mem_flags, fmt, width, height ); cl_int err = Create( context, mem_flags, fmt, width, height );
if( errcode_ret != NULL ) if( errcode_ret != NULL )
*errcode_ret = err; *errcode_ret = err;
} }
cl_int clProtectedImage::Create( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height ) cl_int clProtectedImage::Create( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height )
{ {
cl_int error; cl_int error;
#if defined( __APPLE__ ) #if defined( __APPLE__ )
int protect_pages = 1; int protect_pages = 1;
cl_device_id devices[16]; cl_device_id devices[16];
size_t number_of_devices; size_t number_of_devices;
error = clGetContextInfo(context, CL_CONTEXT_DEVICES, sizeof(devices), devices, &number_of_devices); error = clGetContextInfo(context, CL_CONTEXT_DEVICES, sizeof(devices), devices, &number_of_devices);
test_error(error, "clGetContextInfo for CL_CONTEXT_DEVICES failed"); test_error(error, "clGetContextInfo for CL_CONTEXT_DEVICES failed");
number_of_devices /= sizeof(cl_device_id); number_of_devices /= sizeof(cl_device_id);
for (int i=0; i<(int)number_of_devices; i++) { for (int i=0; i<(int)number_of_devices; i++) {
cl_device_type type; cl_device_type type;
error = clGetDeviceInfo(devices[i], CL_DEVICE_TYPE, sizeof(type), &type, NULL); error = clGetDeviceInfo(devices[i], CL_DEVICE_TYPE, sizeof(type), &type, NULL);
test_error(error, "clGetDeviceInfo for CL_DEVICE_TYPE failed"); test_error(error, "clGetDeviceInfo for CL_DEVICE_TYPE failed");
if (type == CL_DEVICE_TYPE_GPU) { if (type == CL_DEVICE_TYPE_GPU) {
protect_pages = 0; protect_pages = 0;
break; break;
} }
} }
if (protect_pages) { if (protect_pages) {
size_t pixelBytes = get_pixel_bytes(fmt); size_t pixelBytes = get_pixel_bytes(fmt);
size_t rowBytes = ROUND_SIZE_UP( width * pixelBytes, kPageSize ); size_t rowBytes = ROUND_SIZE_UP( width * pixelBytes, kPageSize );
size_t rowStride = rowBytes + kPageSize; size_t rowStride = rowBytes + kPageSize;
// create backing store // create backing store
backingStoreSize = height * rowStride + 8 * rowStride; backingStoreSize = height * rowStride + 8 * rowStride;
backingStore = mmap(0, backingStoreSize, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, 0, 0); backingStore = mmap(0, backingStoreSize, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, 0, 0);
// add guard pages // add guard pages
size_t row; size_t row;
char *p = (char*) backingStore; char *p = (char*) backingStore;
char *imagePtr = (char*) backingStore + 4 * rowStride; char *imagePtr = (char*) backingStore + 4 * rowStride;
for( row = 0; row < 4; row++ ) for( row = 0; row < 4; row++ )
{ {
mprotect( p, rowStride, PROT_NONE ); p += rowStride; mprotect( p, rowStride, PROT_NONE ); p += rowStride;
} }
p += rowBytes; p += rowBytes;
for( row = 0; row < height; row++ ) for( row = 0; row < height; row++ )
{ {
mprotect( p, kPageSize, PROT_NONE ); p += rowStride; mprotect( p, kPageSize, PROT_NONE ); p += rowStride;
} }
p -= rowBytes; p -= rowBytes;
for( row = 0; row < 4; row++ ) for( row = 0; row < 4; row++ )
{ {
mprotect( p, rowStride, PROT_NONE ); p += rowStride; mprotect( p, rowStride, PROT_NONE ); p += rowStride;
} }
if( getenv( "CL_ALIGN_RIGHT" ) ) if( getenv( "CL_ALIGN_RIGHT" ) )
{ {
static int spewEnv = 1; static int spewEnv = 1;
if(spewEnv) if(spewEnv)
{ {
log_info( "***CL_ALIGN_RIGHT is set. Aligning images at right edge of page\n" ); log_info( "***CL_ALIGN_RIGHT is set. Aligning images at right edge of page\n" );
spewEnv = 0; spewEnv = 0;
} }
imagePtr += rowBytes - pixelBytes * width; imagePtr += rowBytes - pixelBytes * width;
} }
image = create_image_2d( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, height, rowStride, imagePtr, &error ); image = create_image_2d( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, height, rowStride, imagePtr, &error );
} else { } else {
backingStore = NULL; backingStore = NULL;
image = create_image_2d( context, mem_flags, fmt, width, height, 0, NULL, &error ); image = create_image_2d( context, mem_flags, fmt, width, height, 0, NULL, &error );
} }
#else #else
backingStore = NULL; backingStore = NULL;
image = create_image_2d( context, mem_flags, fmt, width, height, 0, NULL, &error ); image = create_image_2d( context, mem_flags, fmt, width, height, 0, NULL, &error );
#endif #endif
return error; return error;
} }
clProtectedImage::clProtectedImage( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth, cl_int *errcode_ret ) clProtectedImage::clProtectedImage( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth, cl_int *errcode_ret )
{ {
cl_int err = Create( context, mem_flags, fmt, width, height, depth ); cl_int err = Create( context, mem_flags, fmt, width, height, depth );
if( errcode_ret != NULL ) if( errcode_ret != NULL )
*errcode_ret = err; *errcode_ret = err;
} }
cl_int clProtectedImage::Create( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth ) cl_int clProtectedImage::Create( cl_context context, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth )
{ {
cl_int error; cl_int error;
#if defined( __APPLE__ ) #if defined( __APPLE__ )
int protect_pages = 1; int protect_pages = 1;
cl_device_id devices[16]; cl_device_id devices[16];
size_t number_of_devices; size_t number_of_devices;
error = clGetContextInfo(context, CL_CONTEXT_DEVICES, sizeof(devices), devices, &number_of_devices); error = clGetContextInfo(context, CL_CONTEXT_DEVICES, sizeof(devices), devices, &number_of_devices);
test_error(error, "clGetContextInfo for CL_CONTEXT_DEVICES failed"); test_error(error, "clGetContextInfo for CL_CONTEXT_DEVICES failed");
number_of_devices /= sizeof(cl_device_id); number_of_devices /= sizeof(cl_device_id);
for (int i=0; i<(int)number_of_devices; i++) { for (int i=0; i<(int)number_of_devices; i++) {
cl_device_type type; cl_device_type type;
error = clGetDeviceInfo(devices[i], CL_DEVICE_TYPE, sizeof(type), &type, NULL); error = clGetDeviceInfo(devices[i], CL_DEVICE_TYPE, sizeof(type), &type, NULL);
test_error(error, "clGetDeviceInfo for CL_DEVICE_TYPE failed"); test_error(error, "clGetDeviceInfo for CL_DEVICE_TYPE failed");
if (type == CL_DEVICE_TYPE_GPU) { if (type == CL_DEVICE_TYPE_GPU) {
protect_pages = 0; protect_pages = 0;
break; break;
} }
} }
if (protect_pages) { if (protect_pages) {
size_t pixelBytes = get_pixel_bytes(fmt); size_t pixelBytes = get_pixel_bytes(fmt);
size_t rowBytes = ROUND_SIZE_UP( width * pixelBytes, kPageSize ); size_t rowBytes = ROUND_SIZE_UP( width * pixelBytes, kPageSize );
size_t rowStride = rowBytes + kPageSize; size_t rowStride = rowBytes + kPageSize;
// create backing store // create backing store
backingStoreSize = height * depth * rowStride + 8 * rowStride; backingStoreSize = height * depth * rowStride + 8 * rowStride;
backingStore = mmap(0, backingStoreSize, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, 0, 0); backingStore = mmap(0, backingStoreSize, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, 0, 0);
// add guard pages // add guard pages
size_t row; size_t row;
char *p = (char*) backingStore; char *p = (char*) backingStore;
char *imagePtr = (char*) backingStore + 4 * rowStride; char *imagePtr = (char*) backingStore + 4 * rowStride;
for( row = 0; row < 4; row++ ) for( row = 0; row < 4; row++ )
{ {
mprotect( p, rowStride, PROT_NONE ); p += rowStride; mprotect( p, rowStride, PROT_NONE ); p += rowStride;
} }
p += rowBytes; p += rowBytes;
for( row = 0; row < height*depth; row++ ) for( row = 0; row < height*depth; row++ )
{ {
mprotect( p, kPageSize, PROT_NONE ); p += rowStride; mprotect( p, kPageSize, PROT_NONE ); p += rowStride;
} }
p -= rowBytes; p -= rowBytes;
for( row = 0; row < 4; row++ ) for( row = 0; row < 4; row++ )
{ {
mprotect( p, rowStride, PROT_NONE ); p += rowStride; mprotect( p, rowStride, PROT_NONE ); p += rowStride;
} }
if( getenv( "CL_ALIGN_RIGHT" ) ) if( getenv( "CL_ALIGN_RIGHT" ) )
{ {
static int spewEnv = 1; static int spewEnv = 1;
if(spewEnv) if(spewEnv)
{ {
log_info( "***CL_ALIGN_RIGHT is set. Aligning images at right edge of page\n" ); log_info( "***CL_ALIGN_RIGHT is set. Aligning images at right edge of page\n" );
spewEnv = 0; spewEnv = 0;
} }
imagePtr += rowBytes - pixelBytes * width; imagePtr += rowBytes - pixelBytes * width;
} }
image = create_image_3d( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, height, depth, rowStride, height*rowStride, imagePtr, &error ); image = create_image_3d( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, height, depth, rowStride, height*rowStride, imagePtr, &error );
} else { } else {
backingStore = NULL; backingStore = NULL;
image = create_image_3d( context, mem_flags, fmt, width, height, depth, 0, 0, NULL, &error ); image = create_image_3d( context, mem_flags, fmt, width, height, depth, 0, 0, NULL, &error );
} }
#else #else
backingStore = NULL; backingStore = NULL;
image = create_image_3d( context, mem_flags, fmt, width, height, depth, 0, 0, NULL, &error ); image = create_image_3d( context, mem_flags, fmt, width, height, depth, 0, 0, NULL, &error );
#endif #endif
return error; return error;
} }
clProtectedImage::clProtectedImage( cl_context context, cl_mem_object_type imageType, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth, size_t arraySize, cl_int *errcode_ret ) clProtectedImage::clProtectedImage( cl_context context, cl_mem_object_type imageType, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth, size_t arraySize, cl_int *errcode_ret )
{ {
cl_int err = Create( context, imageType, mem_flags, fmt, width, height, depth, arraySize ); cl_int err = Create( context, imageType, mem_flags, fmt, width, height, depth, arraySize );
if( errcode_ret != NULL ) if( errcode_ret != NULL )
*errcode_ret = err; *errcode_ret = err;
} }
cl_int clProtectedImage::Create( cl_context context, cl_mem_object_type imageType, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth, size_t arraySize ) cl_int clProtectedImage::Create( cl_context context, cl_mem_object_type imageType, cl_mem_flags mem_flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth, size_t arraySize )
{ {
cl_int error; cl_int error;
#if defined( __APPLE__ ) #if defined( __APPLE__ )
int protect_pages = 1; int protect_pages = 1;
cl_device_id devices[16]; cl_device_id devices[16];
size_t number_of_devices; size_t number_of_devices;
error = clGetContextInfo(context, CL_CONTEXT_DEVICES, sizeof(devices), devices, &number_of_devices); error = clGetContextInfo(context, CL_CONTEXT_DEVICES, sizeof(devices), devices, &number_of_devices);
test_error(error, "clGetContextInfo for CL_CONTEXT_DEVICES failed"); test_error(error, "clGetContextInfo for CL_CONTEXT_DEVICES failed");
number_of_devices /= sizeof(cl_device_id); number_of_devices /= sizeof(cl_device_id);
for (int i=0; i<(int)number_of_devices; i++) { for (int i=0; i<(int)number_of_devices; i++) {
cl_device_type type; cl_device_type type;
error = clGetDeviceInfo(devices[i], CL_DEVICE_TYPE, sizeof(type), &type, NULL); error = clGetDeviceInfo(devices[i], CL_DEVICE_TYPE, sizeof(type), &type, NULL);
test_error(error, "clGetDeviceInfo for CL_DEVICE_TYPE failed"); test_error(error, "clGetDeviceInfo for CL_DEVICE_TYPE failed");
if (type == CL_DEVICE_TYPE_GPU) { if (type == CL_DEVICE_TYPE_GPU) {
protect_pages = 0; protect_pages = 0;
break; break;
} }
} }
    if (protect_pages) {
        size_t pixelBytes = get_pixel_bytes(fmt);
        size_t rowBytes = ROUND_SIZE_UP( width * pixelBytes, kPageSize );
        size_t rowStride = rowBytes + kPageSize;

        // create backing store
        switch (imageType)
        {
            case CL_MEM_OBJECT_IMAGE1D:
                backingStoreSize = rowStride + 8 * rowStride;
                break;
            case CL_MEM_OBJECT_IMAGE2D:
                backingStoreSize = height * rowStride + 8 * rowStride;
                break;
            case CL_MEM_OBJECT_IMAGE3D:
                backingStoreSize = height * depth * rowStride + 8 * rowStride;
                break;
            case CL_MEM_OBJECT_IMAGE1D_ARRAY:
                backingStoreSize = arraySize * rowStride + 8 * rowStride;
                break;
            case CL_MEM_OBJECT_IMAGE2D_ARRAY:
                backingStoreSize = height * arraySize * rowStride + 8 * rowStride;
                break;
        }
        backingStore = mmap(0, backingStoreSize, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, 0, 0);

        // add guard pages
        size_t row;
        char *p = (char*) backingStore;
        char *imagePtr = (char*) backingStore + 4 * rowStride;
        for( row = 0; row < 4; row++ )
        {
            mprotect( p, rowStride, PROT_NONE ); p += rowStride;
        }
        p += rowBytes;
        size_t sz = (height > 0 ? height : 1) * (depth > 0 ? depth : 1) * (arraySize > 0 ? arraySize : 1);
        for( row = 0; row < sz; row++ )
        {
            mprotect( p, kPageSize, PROT_NONE ); p += rowStride;
        }
        p -= rowBytes;
        for( row = 0; row < 4; row++ )
        {
            mprotect( p, rowStride, PROT_NONE ); p += rowStride;
        }

        if( getenv( "CL_ALIGN_RIGHT" ) )
        {
            static int spewEnv = 1;
            if(spewEnv)
            {
                log_info( "***CL_ALIGN_RIGHT is set. Aligning images at right edge of page\n" );
                spewEnv = 0;
            }
            imagePtr += rowBytes - pixelBytes * width;
        }

        switch (imageType)
        {
            case CL_MEM_OBJECT_IMAGE1D:
                image = create_image_1d( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, rowStride, imagePtr, NULL, &error );
                break;
            case CL_MEM_OBJECT_IMAGE2D:
                image = create_image_2d( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, height, rowStride, imagePtr, &error );
                break;
            case CL_MEM_OBJECT_IMAGE3D:
                image = create_image_3d( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, height, depth, rowStride, height*rowStride, imagePtr, &error );
                break;
            case CL_MEM_OBJECT_IMAGE1D_ARRAY:
                image = create_image_1d_array( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, arraySize, rowStride, rowStride, imagePtr, &error );
                break;
            case CL_MEM_OBJECT_IMAGE2D_ARRAY:
                image = create_image_2d_array( context, mem_flags | CL_MEM_USE_HOST_PTR, fmt, width, height, arraySize, rowStride, height*rowStride, imagePtr, &error );
                break;
        }
    } else {
        backingStore = NULL;
        switch (imageType)
        {
            case CL_MEM_OBJECT_IMAGE1D:
                image = create_image_1d( context, mem_flags, fmt, width, 0, NULL, NULL, &error );
                break;
            case CL_MEM_OBJECT_IMAGE2D:
                image = create_image_2d( context, mem_flags, fmt, width, height, 0, NULL, &error );
                break;
            case CL_MEM_OBJECT_IMAGE3D:
                image = create_image_3d( context, mem_flags, fmt, width, height, depth, 0, 0, NULL, &error );;
                break;
            case CL_MEM_OBJECT_IMAGE1D_ARRAY:
                image = create_image_1d_array( context, mem_flags, fmt, width, arraySize, 0, 0, NULL, &error );
                break;
            case CL_MEM_OBJECT_IMAGE2D_ARRAY:
                image = create_image_2d_array( context, mem_flags, fmt, width, height, arraySize, 0, 0, NULL, &error );
                break;
        }
    }
#else
    backingStore = NULL;
    switch (imageType)
    {
        case CL_MEM_OBJECT_IMAGE1D:
            image = create_image_1d( context, mem_flags, fmt, width, 0, NULL, NULL, &error );
            break;
        case CL_MEM_OBJECT_IMAGE2D:
            image = create_image_2d( context, mem_flags, fmt, width, height, 0, NULL, &error );
            break;
        case CL_MEM_OBJECT_IMAGE3D:
            image = create_image_3d( context, mem_flags, fmt, width, height, depth, 0, 0, NULL, &error );;
            break;
        case CL_MEM_OBJECT_IMAGE1D_ARRAY:
            image = create_image_1d_array( context, mem_flags, fmt, width, arraySize, 0, 0, NULL, &error );
            break;
        case CL_MEM_OBJECT_IMAGE2D_ARRAY:
            image = create_image_2d_array( context, mem_flags, fmt, width, height, arraySize, 0, 0, NULL, &error );
            break;
    }
#endif
    return error;
}
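
The guard-page layout that clProtectedImage::Create builds on Apple CPU devices can be sketched as plain arithmetic. This is only an illustration: `ProtectedLayout`, `compute_layout`, `round_up` and the 4096-byte `kExamplePageSize` are hypothetical names and values chosen for the example (the harness uses its own `kPageSize` and `ROUND_SIZE_UP`), and `rows` stands for the product of height, depth and array size that applies to the image type.

```cpp
#include <cassert>
#include <cstddef>

// Assumed page size for the example only; the harness defines its own kPageSize.
static const size_t kExamplePageSize = 4096;

// Hypothetical summary of the layout computed in clProtectedImage::Create.
struct ProtectedLayout
{
    size_t rowBytes;    // pixel bytes per row, rounded up to whole pages
    size_t rowStride;   // rowBytes plus one guard page after every row
    size_t totalBytes;  // image rows plus 8 guard rows (4 before, 4 after)
    size_t imageOffset; // offset of the first pixel from the start of the mapping
};

// Round size up to the next multiple of page.
static size_t round_up( size_t size, size_t page )
{
    return ( size + page - 1 ) / page * page;
}

static ProtectedLayout compute_layout( size_t width, size_t rows, size_t pixelBytes, bool alignRight )
{
    ProtectedLayout l;
    l.rowBytes = round_up( width * pixelBytes, kExamplePageSize );
    l.rowStride = l.rowBytes + kExamplePageSize;
    l.totalBytes = rows * l.rowStride + 8 * l.rowStride;
    // The image starts after the four leading guard rows.
    l.imageOffset = 4 * l.rowStride;
    // With CL_ALIGN_RIGHT set, each row ends exactly at its guard page.
    if( alignRight )
        l.imageOffset += l.rowBytes - pixelBytes * width;
    return l;
}
```

For a 100-pixel-wide image with 4 bytes per pixel and 10 rows, `compute_layout( 100, 10, 4, false )` gives a row stride of 8192 bytes and a mapping of 18 strides. Right alignment moves the first pixel so that an overrun past the end of a row lands on the protected page immediately.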

/*******
 * clProtectedArray implementation
 *******/
clProtectedArray::clProtectedArray()
{
    mBuffer = mValidBuffer = NULL;
}

clProtectedArray::clProtectedArray( size_t sizeInBytes )
{
    mBuffer = mValidBuffer = NULL;
    Allocate( sizeInBytes );
}

clProtectedArray::~clProtectedArray()
{
    if( mBuffer != NULL ) {
#if defined( __APPLE__ )
        int error = munmap( mBuffer, mRealSize );
        if (error) log_error("WARNING: munmap failed in clProtectedArray.\n");
#else
        free( mBuffer );
#endif
    }
}

void clProtectedArray::Allocate( size_t sizeInBytes )
{
#if defined( __APPLE__ )
    // Allocate enough space to round our actual allocation up to a whole number of pages
    // and add one guard page on either side (two extra pages in total)
    mRoundedSize = ROUND_SIZE_UP( sizeInBytes, kPageSize );
    mRealSize = mRoundedSize + kPageSize * 2;

    // Use mmap here to ensure we start on a page boundary, so the mprotect calls will work OK
    mBuffer = (char *)mmap(0, mRealSize, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, 0, 0);

    mValidBuffer = mBuffer + kPageSize;

    // Protect guard area from access
    mprotect( mValidBuffer - kPageSize, kPageSize, PROT_NONE );
    mprotect( mValidBuffer + mRoundedSize, kPageSize, PROT_NONE );
#else
    mRoundedSize = mRealSize = sizeInBytes;
    mBuffer = mValidBuffer = (char *)calloc(1, mRealSize);
#endif
}
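
The allocation scheme in clProtectedArray::Allocate (round up to whole pages, then fence the usable region with inaccessible pages) can be sketched on its own for any POSIX system. The names `GuardedBuffer`, `guarded_alloc` and `guarded_free` are hypothetical and not part of the harness. Unlike the code above, this sketch queries the page size with `sysconf`, passes -1 as the file descriptor, and checks the `mmap` and `mprotect` results. Those are choices made for the example rather than the harness's behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: a buffer with one PROT_NONE guard page on either side.
struct GuardedBuffer
{
    char  *base;     // start of the whole mapping (the front guard page)
    char  *valid;    // first usable byte
    size_t realSize; // total mapped bytes, guard pages included
    size_t usable;   // usable bytes, rounded up to whole pages
};

// Round size up to the next multiple of page.
static size_t round_up( size_t size, size_t page )
{
    return ( size + page - 1 ) / page * page;
}

// Returns true on success; on failure *out is left untouched.
static bool guarded_alloc( size_t sizeInBytes, GuardedBuffer *out )
{
    const size_t page = (size_t)sysconf( _SC_PAGESIZE );
    const size_t usable = round_up( sizeInBytes, page );
    const size_t realSize = usable + 2 * page;

    // mmap returns page-aligned memory, which mprotect requires.
    void *p = mmap( NULL, realSize, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0 );
    if( p == MAP_FAILED )
        return false;

    char *base = (char *)p;
    // Any access to the page before or after the usable region now faults.
    if( mprotect( base, page, PROT_NONE ) != 0 ||
        mprotect( base + page + usable, page, PROT_NONE ) != 0 )
    {
        munmap( base, realSize );
        return false;
    }

    out->base = base;
    out->valid = base + page;
    out->realSize = realSize;
    out->usable = usable;
    return true;
}

static void guarded_free( GuardedBuffer *buf )
{
    if( buf->base != NULL )
        munmap( buf->base, buf->realSize );
    buf->base = buf->valid = NULL;
    buf->realSize = buf->usable = 0;
}
```

A caller would hand `valid` to the code under test; a read or write one byte before `valid`, or at `valid + usable`, then raises a fault instead of silently corrupting memory.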

@@ -1,333 +1,333 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//    http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#ifndef _typeWrappers_h
#define _typeWrappers_h

#include <stdio.h>
#include <stdlib.h>

#if !defined(_WIN32)
#include <sys/mman.h>
#endif

#include "compat.h"
#include <stdio.h>
#include "mt19937.h"
#include "errorHelpers.h"
#include "kernelHelpers.h"

extern "C" cl_uint gReSeed;
extern "C" cl_uint gRandomSeed;

/* cl_context wrapper */
class clContextWrapper
{
    public:
        clContextWrapper() { mContext = NULL; }
        clContextWrapper( cl_context program ) { mContext = program; }
        ~clContextWrapper() { if( mContext != NULL ) clReleaseContext( mContext ); }

        clContextWrapper & operator=( const cl_context &rhs ) { mContext = rhs; return *this; }
        operator cl_context() { return mContext; }

        cl_context * operator&() { return &mContext; }

        bool operator==( const cl_context &rhs ) { return mContext == rhs; }

    protected:
        cl_context mContext;
};

/* cl_program wrapper */
class clProgramWrapper
{
    public:
        clProgramWrapper() { mProgram = NULL; }
        clProgramWrapper( cl_program program ) { mProgram = program; }
        ~clProgramWrapper() { if( mProgram != NULL ) clReleaseProgram( mProgram ); }

        clProgramWrapper & operator=( const cl_program &rhs ) { mProgram = rhs; return *this; }
        operator cl_program() { return mProgram; }

        cl_program * operator&() { return &mProgram; }

        bool operator==( const cl_program &rhs ) { return mProgram == rhs; }

    protected:
        cl_program mProgram;
};

/* cl_kernel wrapper */
class clKernelWrapper
{
    public:
        clKernelWrapper() { mKernel = NULL; }
        clKernelWrapper( cl_kernel kernel ) { mKernel = kernel; }
        ~clKernelWrapper() { if( mKernel != NULL ) clReleaseKernel( mKernel ); }

        clKernelWrapper & operator=( const cl_kernel &rhs ) { mKernel = rhs; return *this; }
        operator cl_kernel() { return mKernel; }

        cl_kernel * operator&() { return &mKernel; }

        bool operator==( const cl_kernel &rhs ) { return mKernel == rhs; }

    protected:
        cl_kernel mKernel;
};

/* cl_mem (stream) wrapper */
class clMemWrapper
{
    public:
        clMemWrapper() { mMem = NULL; }
        clMemWrapper( cl_mem mem ) { mMem = mem; }
        ~clMemWrapper() { if( mMem != NULL ) clReleaseMemObject( mMem ); }

        clMemWrapper & operator=( const cl_mem &rhs ) { mMem = rhs; return *this; }
        operator cl_mem() { return mMem; }

        cl_mem * operator&() { return &mMem; }

        bool operator==( const cl_mem &rhs ) { return mMem == rhs; }

    protected:
        cl_mem mMem;
};

class clProtectedImage
{
    public:
        clProtectedImage() { image = NULL; backingStore = NULL; }
        clProtectedImage( cl_context context, cl_mem_flags flags, const cl_image_format *fmt, size_t width, cl_int *errcode_ret );
        clProtectedImage( cl_context context, cl_mem_flags flags, const cl_image_format *fmt, size_t width, size_t height, cl_int *errcode_ret );
        clProtectedImage( cl_context context, cl_mem_flags flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth, cl_int *errcode_ret );
        clProtectedImage( cl_context context, cl_mem_object_type imageType, cl_mem_flags flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth, size_t arraySize, cl_int *errcode_ret );
        ~clProtectedImage()
        {
            if( image != NULL )
                clReleaseMemObject( image );

#if defined( __APPLE__ )
            if(backingStore)
                munmap(backingStore, backingStoreSize);
#endif
        }

        cl_int Create( cl_context context, cl_mem_flags flags, const cl_image_format *fmt, size_t width );
        cl_int Create( cl_context context, cl_mem_flags flags, const cl_image_format *fmt, size_t width, size_t height );
        cl_int Create( cl_context context, cl_mem_flags flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth );
        cl_int Create( cl_context context, cl_mem_object_type imageType, cl_mem_flags flags, const cl_image_format *fmt, size_t width, size_t height, size_t depth, size_t arraySize );

        clProtectedImage & operator=( const cl_mem &rhs ) { image = rhs; backingStore = NULL; return *this; }
        operator cl_mem() { return image; }

        cl_mem * operator&() { return &image; }

        bool operator==( const cl_mem &rhs ) { return image == rhs; }

    protected:
        void *backingStore;
        size_t backingStoreSize;
        cl_mem image;
};

/* cl_command_queue wrapper */
class clCommandQueueWrapper
{
    public:
        clCommandQueueWrapper() { mMem = NULL; }
        clCommandQueueWrapper( cl_command_queue mem ) { mMem = mem; }
        ~clCommandQueueWrapper() { if( mMem != NULL ) {int error = clFinish(mMem); if (error) print_error(error, "clFinish failed"); clReleaseCommandQueue( mMem );} }

        clCommandQueueWrapper & operator=( const cl_command_queue &rhs ) { mMem = rhs; return *this; }
        operator cl_command_queue() { return mMem; }

        cl_command_queue * operator&() { return &mMem; }

        bool operator==( const cl_command_queue &rhs ) { return mMem == rhs; }

    protected:
        cl_command_queue mMem;
};

/* cl_sampler wrapper */
class clSamplerWrapper
{
    public:
        clSamplerWrapper() { mMem = NULL; }
        clSamplerWrapper( cl_sampler mem ) { mMem = mem; }
        ~clSamplerWrapper() { if( mMem != NULL ) clReleaseSampler( mMem ); }

        clSamplerWrapper & operator=( const cl_sampler &rhs ) { mMem = rhs; return *this; }
        operator cl_sampler() { return mMem; }

        cl_sampler * operator&() { return &mMem; }

        bool operator==( const cl_sampler &rhs ) { return mMem == rhs; }

    protected:
        cl_sampler mMem;
};

/* cl_event wrapper */
class clEventWrapper
{
    public:
        clEventWrapper() { mMem = NULL; }
        clEventWrapper( cl_event mem ) { mMem = mem; }
        ~clEventWrapper() { if( mMem != NULL ) clReleaseEvent( mMem ); }

        clEventWrapper & operator=( const cl_event &rhs ) { mMem = rhs; return *this; }
        operator cl_event() { return mMem; }

        cl_event * operator&() { return &mMem; }

        bool operator==( const cl_event &rhs ) { return mMem == rhs; }

    protected:
        cl_event mMem;
};

/* Generic protected memory buffer, for verifying access within bounds */
class clProtectedArray
{
    public:
        clProtectedArray();
        clProtectedArray( size_t sizeInBytes );
        virtual ~clProtectedArray();

        void Allocate( size_t sizeInBytes );

        operator void *() { return (void *)mValidBuffer; }
        operator const void *() const { return (const void *)mValidBuffer; }

    protected:
        char * mBuffer;
        char * mValidBuffer;
        size_t mRealSize, mRoundedSize;
};

class RandomSeed
{
    public:
        RandomSeed( cl_uint seed ){ if(seed) log_info( "(seed = %10.10u) ", seed ); mtData = init_genrand(seed); }
        ~RandomSeed()
        {
            if( gReSeed )
                gRandomSeed = genrand_int32( mtData );
            free_mtdata(mtData);
        }

        operator MTdata () {return mtData;}

    protected:
        MTdata mtData;
};

template <typename T> class BufferOwningPtr
{
    BufferOwningPtr(BufferOwningPtr const &); // do not implement
    void operator=(BufferOwningPtr const &);  // do not implement

    void *ptr;
    void *map;
    size_t mapsize;   // Bytes allocated total, pointed to by map.
    size_t allocsize; // Bytes allocated in unprotected pages, pointed to by ptr.
    bool aligned;

  public:
    explicit BufferOwningPtr(void *p = 0) : ptr(p), map(0), mapsize(0), allocsize(0), aligned(false) {}
    explicit BufferOwningPtr(void *p, void *m, size_t s)
        : ptr(p), map(m), mapsize(s), allocsize(0), aligned(false)
    {
#if ! defined( __APPLE__ )
        if(m)
        {
            log_error( "ERROR: unhandled code path. BufferOwningPtr allocated with mapped buffer!" );
            abort();
        }
#endif
    }

    ~BufferOwningPtr() {
        if (map) {
#if defined( __APPLE__ )
            int error = munmap(map, mapsize);
            if (error) log_error("WARNING: munmap failed in BufferOwningPtr.\n");
#endif
        } else {
            if ( aligned )
            {
                align_free(ptr);
            }
            else
            {
                free(ptr);
            }
        }
    }

    void reset(void *p, void *m = 0, size_t mapsize_ = 0, size_t allocsize_ = 0, bool aligned_ = false) {
        if (map){
#if defined( __APPLE__ )
            int error = munmap(map, mapsize);
            if (error) log_error("WARNING: munmap failed in BufferOwningPtr.\n");
#else
            log_error( "ERROR: unhandled code path. BufferOwningPtr reset with mapped buffer!" );
            abort();
#endif
        } else {
            if ( aligned )
            {
                align_free(ptr);
            }
            else
            {
                free(ptr);
            }
        }
        ptr = p;
        map = m;
        mapsize = mapsize_;
        allocsize = allocsize_;
        aligned = aligned_;
#if ! defined( __APPLE__ )
        if(m)
        {
            log_error( "ERROR: unhandled code path. BufferOwningPtr allocated with mapped buffer!" );
            abort();
        }
#endif
    }

    operator T*() { return (T*)ptr; }

    size_t getSize() const { return allocsize; };
};

#endif // _typeWrappers_h
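
The wrapper classes in this header all follow one pattern: hold a handle, release it in the destructor, and convert implicitly to the raw handle type. A generic form of that pattern is sketched here without any OpenCL dependency. `FakeObject`, `fake_handle`, `fake_release`, `HandleWrapper` and `FakeWrapper` are hypothetical stand-ins, not names from the harness. The sketch also departs from the header in two ways: it releases a previously held handle on assignment (the wrappers above simply overwrite it), and it forbids copying.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an OpenCL-style handle and its release call.
typedef struct FakeObject { int refCount; } *fake_handle;

int fake_release( fake_handle h ) { h->refCount--; return 0; }

// Owns one handle and calls Release on it when the wrapper goes out of scope.
template <typename Handle, int ( *Release )( Handle )>
class HandleWrapper
{
public:
    HandleWrapper() : mHandle( NULL ) {}
    HandleWrapper( Handle h ) : mHandle( h ) {}
    ~HandleWrapper() { if( mHandle != NULL ) Release( mHandle ); }

    // Assumption for this sketch: drop the old handle before taking the new one.
    HandleWrapper &operator=( const Handle &rhs )
    {
        if( mHandle != NULL && mHandle != rhs )
            Release( mHandle );
        mHandle = rhs;
        return *this;
    }

    operator Handle() const { return mHandle; }
    Handle *operator&() { return &mHandle; }
    bool operator==( const Handle &rhs ) const { return mHandle == rhs; }

private:
    HandleWrapper( const HandleWrapper & );            // not copyable
    HandleWrapper &operator=( const HandleWrapper & ); // not copyable

    Handle mHandle;
};

typedef HandleWrapper<fake_handle, fake_release> FakeWrapper;
```

Declaring `FakeWrapper w( handle );` in a test body means every early return releases the handle without explicit cleanup code, which is the reason the harness wraps `cl_mem`, `cl_kernel` and the other object types this way.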

@@ -0,0 +1,8 @@
set(TARGET_NAME miniz)
add_library(
${TARGET_NAME}
STATIC
miniz.c
miniz.h
)

test_common/miniz/miniz.c (normal file, 4153 lines changed)

File diff suppressed because it is too large.

test_common/miniz/miniz.h (normal file, 749 lines changed)

@@ -0,0 +1,749 @@
#ifndef MINIZ_HEADER_INCLUDED
#define MINIZ_HEADER_INCLUDED
#include <stdlib.h>
#if defined(__TINYC__) && (defined(__linux) || defined(__linux__))
// TODO: Work around "error: include file 'sys\utime.h'" when compiling with tcc on Linux
#define MINIZ_NO_TIME
#endif
#if !defined(MINIZ_NO_TIME) && !defined(MINIZ_NO_ARCHIVE_APIS)
#include <time.h>
#endif
#if defined(_M_IX86) || defined(_M_X64) || defined(__i386__) || defined(__i386) || defined(__i486__) || defined(__i486) || defined(i386) || defined(__ia64__) || defined(__x86_64__)
// MINIZ_X86_OR_X64_CPU is only used to help set the below macros.
#define MINIZ_X86_OR_X64_CPU 1
#endif
#if (__BYTE_ORDER__==__ORDER_LITTLE_ENDIAN__) || MINIZ_X86_OR_X64_CPU
// Set MINIZ_LITTLE_ENDIAN to 1 if the processor is little endian.
#define MINIZ_LITTLE_ENDIAN 1
#endif
#if MINIZ_X86_OR_X64_CPU
// Set MINIZ_USE_UNALIGNED_LOADS_AND_STORES to 1 on CPUs that permit efficient integer loads and stores from unaligned addresses.
#define MINIZ_USE_UNALIGNED_LOADS_AND_STORES 1
#endif
#if defined(_M_X64) || defined(_WIN64) || defined(__MINGW64__) || defined(_LP64) || defined(__LP64__) || defined(__ia64__) || defined(__x86_64__)
// Set MINIZ_HAS_64BIT_REGISTERS to 1 if operations on 64-bit integers are reasonably fast (and don't involve compiler generated calls to helper functions).
#define MINIZ_HAS_64BIT_REGISTERS 1
#endif
// Return status codes. MZ_PARAM_ERROR is non-standard.
enum {
    MZ_OK = 0,
    MZ_STREAM_END = 1,
    MZ_NEED_DICT = 2,
    MZ_ERRNO = -1,
    MZ_STREAM_ERROR = -2,
    MZ_DATA_ERROR = -3,
    MZ_MEM_ERROR = -4,
    MZ_BUF_ERROR = -5,
    MZ_VERSION_ERROR = -6,
    MZ_PARAM_ERROR = -10000
};
typedef unsigned long mz_ulong;
#ifdef __cplusplus
extern "C" {
#endif
// ------------------- zlib-style API Definitions.
// mz_free() internally uses the MZ_FREE() macro (which by default calls free() unless you've modified the MZ_MALLOC macro) to release a block allocated from the heap.
void mz_free(void *p);
#define MZ_ADLER32_INIT (1)
// mz_adler32() returns the initial adler-32 value to use when called with ptr==NULL.
mz_ulong mz_adler32(mz_ulong adler, const unsigned char *ptr, size_t buf_len);
#define MZ_CRC32_INIT (0)
// mz_crc32() returns the initial CRC-32 value to use when called with ptr==NULL.
mz_ulong mz_crc32(mz_ulong crc, const unsigned char *ptr, size_t buf_len);
// Compression strategies.
enum { MZ_DEFAULT_STRATEGY = 0, MZ_FILTERED = 1, MZ_HUFFMAN_ONLY = 2, MZ_RLE = 3, MZ_FIXED = 4 };
// Method
#define MZ_DEFLATED 8
#ifndef MINIZ_NO_ZLIB_APIS
// Heap allocation callbacks.
// Note that mz_alloc_func parameter types purposely differ from zlib's: items/size is size_t, not unsigned long.
typedef void *(*mz_alloc_func)(void *opaque, size_t items, size_t size);
typedef void (*mz_free_func)(void *opaque, void *address);
typedef void *(*mz_realloc_func)(void *opaque, void *address, size_t items, size_t size);
#define MZ_VERSION "9.1.15"
#define MZ_VERNUM 0x91F0
#define MZ_VER_MAJOR 9
#define MZ_VER_MINOR 1
#define MZ_VER_REVISION 15
#define MZ_VER_SUBREVISION 0
// Flush values. For typical usage you only need MZ_NO_FLUSH and MZ_FINISH. The other values are for advanced use (refer to the zlib docs).
enum { MZ_NO_FLUSH = 0, MZ_PARTIAL_FLUSH = 1, MZ_SYNC_FLUSH = 2, MZ_FULL_FLUSH = 3, MZ_FINISH = 4, MZ_BLOCK = 5 };
// Compression levels: 0-9 are the standard zlib-style levels, 10 is best possible compression (not zlib compatible, and may be very slow), MZ_DEFAULT_COMPRESSION=MZ_DEFAULT_LEVEL.
enum { MZ_NO_COMPRESSION = 0, MZ_BEST_SPEED = 1, MZ_BEST_COMPRESSION = 9, MZ_UBER_COMPRESSION = 10, MZ_DEFAULT_LEVEL = 6, MZ_DEFAULT_COMPRESSION = -1 };
// Window bits
#define MZ_DEFAULT_WINDOW_BITS 15
struct mz_internal_state;
// Compression/decompression stream struct.
typedef struct mz_stream_s
{
    const unsigned char *next_in;     // pointer to next byte to read
    unsigned int avail_in;            // number of bytes available at next_in
    mz_ulong total_in;                // total number of bytes consumed so far

    unsigned char *next_out;          // pointer to next byte to write
    unsigned int avail_out;           // number of bytes that can be written to next_out
    mz_ulong total_out;               // total number of bytes produced so far

    char *msg;                        // error msg (unused)
    struct mz_internal_state *state;  // internal state, allocated by zalloc/zfree

    mz_alloc_func zalloc;             // optional heap allocation function (defaults to malloc)
    mz_free_func zfree;               // optional heap free function (defaults to free)
    void *opaque;                     // heap alloc function user pointer

    int data_type;                    // data_type (unused)
    mz_ulong adler;                   // adler32 of the source or uncompressed data
    mz_ulong reserved;                // not used
} mz_stream;
typedef mz_stream *mz_streamp;
// Returns the version string of miniz.c.
const char *mz_version(void);
// mz_deflateInit() initializes a compressor with default options:
// Parameters:
// pStream must point to an initialized mz_stream struct.
// level must be between [MZ_NO_COMPRESSION, MZ_BEST_COMPRESSION].
// level 1 enables a specially optimized compression function that's been optimized purely for performance, not ratio.
// (This special func. is currently only enabled when MINIZ_USE_UNALIGNED_LOADS_AND_STORES and MINIZ_LITTLE_ENDIAN are defined.)
// Return values:
// MZ_OK on success.
// MZ_STREAM_ERROR if the stream is bogus.
// MZ_PARAM_ERROR if the input parameters are bogus.
// MZ_MEM_ERROR on out of memory.
int mz_deflateInit(mz_streamp pStream, int level);
// mz_deflateInit2() is like mz_deflateInit(), except with more control:
// Additional parameters:
// method must be MZ_DEFLATED
// window_bits must be MZ_DEFAULT_WINDOW_BITS (to wrap the deflate stream with zlib header/adler-32 footer) or -MZ_DEFAULT_WINDOW_BITS (raw deflate/no header or footer)
// mem_level must be between [1, 9] (it's checked but ignored by miniz.c)
int mz_deflateInit2(mz_streamp pStream, int level, int method, int window_bits, int mem_level, int strategy);
// Quickly resets a compressor without having to reallocate anything. Same as calling mz_deflateEnd() followed by mz_deflateInit()/mz_deflateInit2().
int mz_deflateReset(mz_streamp pStream);
// mz_deflate() compresses the input to output, consuming as much of the input and producing as much output as possible.
// Parameters:
// pStream is the stream to read from and write to. You must initialize/update the next_in, avail_in, next_out, and avail_out members.
// flush may be MZ_NO_FLUSH, MZ_PARTIAL_FLUSH/MZ_SYNC_FLUSH, MZ_FULL_FLUSH, or MZ_FINISH.
// Return values:
// MZ_OK on success (when flushing, or if more input is needed but not available, and/or there's more output to be written but the output buffer is full).
// MZ_STREAM_END if all input has been consumed and all output bytes have been written. Don't call mz_deflate() on the stream anymore.
// MZ_STREAM_ERROR if the stream is bogus.
// MZ_PARAM_ERROR if one of the parameters is invalid.
// MZ_BUF_ERROR if no forward progress is possible because the input and/or output buffers are empty. (Fill up the input buffer or free up some output space and try again.)
int mz_deflate(mz_streamp pStream, int flush);
// mz_deflateEnd() deinitializes a compressor:
// Return values:
// MZ_OK on success.
// MZ_STREAM_ERROR if the stream is bogus.
int mz_deflateEnd(mz_streamp pStream);
// mz_deflateBound() returns a (very) conservative upper bound on the amount of data that could be generated by deflate(), assuming flush is set to only MZ_NO_FLUSH or MZ_FINISH.
mz_ulong mz_deflateBound(mz_streamp pStream, mz_ulong source_len);
// Single-call compression functions mz_compress() and mz_compress2():
// Returns MZ_OK on success, or one of the error codes from mz_deflate() on failure.
int mz_compress(unsigned char *pDest, mz_ulong *pDest_len, const unsigned char *pSource, mz_ulong source_len);
int mz_compress2(unsigned char *pDest, mz_ulong *pDest_len, const unsigned char *pSource, mz_ulong source_len, int level);
// mz_compressBound() returns a (very) conservative upper bound on the amount of data that could be generated by calling mz_compress().
mz_ulong mz_compressBound(mz_ulong source_len);
// Initializes a decompressor.
int mz_inflateInit(mz_streamp pStream);
// mz_inflateInit2() is like mz_inflateInit() with an additional option that controls the window size and whether or not the stream has been wrapped with a zlib header/footer:
// window_bits must be MZ_DEFAULT_WINDOW_BITS (to parse zlib header/footer) or -MZ_DEFAULT_WINDOW_BITS (raw deflate).
int mz_inflateInit2(mz_streamp pStream, int window_bits);
// Decompresses the input stream to the output, consuming only as much of the input as needed, and writing as much to the output as possible.
// Parameters:
// pStream is the stream to read from and write to. You must initialize/update the next_in, avail_in, next_out, and avail_out members.
// flush may be MZ_NO_FLUSH, MZ_SYNC_FLUSH, or MZ_FINISH.
// On the first call, if flush is MZ_FINISH it's assumed the input and output buffers are both sized large enough to decompress the entire stream in a single call (this is slightly faster).
// MZ_FINISH implies that there are no more source bytes available beside what's already in the input buffer, and that the output buffer is large enough to hold the rest of the decompressed data.
// Return values:
// MZ_OK on success. Either more input is needed but not available, and/or there's more output to be written but the output buffer is full.
// MZ_STREAM_END if all needed input has been consumed and all output bytes have been written. For zlib streams, the adler-32 of the decompressed data has also been verified.
// MZ_STREAM_ERROR if the stream is bogus.
// MZ_DATA_ERROR if the deflate stream is invalid.
// MZ_PARAM_ERROR if one of the parameters is invalid.
// MZ_BUF_ERROR if no forward progress is possible because the input buffer is empty but the inflater needs more input to continue, or if the output buffer is not large enough. Call mz_inflate() again
// with more input data, or with more room in the output buffer (except when using single call decompression, described above).
int mz_inflate(mz_streamp pStream, int flush);
// Deinitializes a decompressor.
int mz_inflateEnd(mz_streamp pStream);
// Single-call decompression.
// Returns MZ_OK on success, or one of the error codes from mz_inflate() on failure.
int mz_uncompress(unsigned char *pDest, mz_ulong *pDest_len, const unsigned char *pSource, mz_ulong source_len);
// Returns a string description of the specified error code, or NULL if the error code is invalid.
const char *mz_error(int err);
// Redefine zlib-compatible names to miniz equivalents, so miniz.c can be used as a drop-in replacement for the subset of zlib that miniz.c supports.
// Define MINIZ_NO_ZLIB_COMPATIBLE_NAMES to disable zlib-compatibility if you use zlib in the same project.
#ifndef MINIZ_NO_ZLIB_COMPATIBLE_NAMES
typedef unsigned char Byte;
typedef unsigned int uInt;
typedef mz_ulong uLong;
typedef Byte Bytef;
typedef uInt uIntf;
typedef char charf;
typedef int intf;
typedef void *voidpf;
typedef uLong uLongf;
typedef void *voidp;
typedef void *const voidpc;
#define Z_NULL 0
#define Z_NO_FLUSH MZ_NO_FLUSH
#define Z_PARTIAL_FLUSH MZ_PARTIAL_FLUSH
#define Z_SYNC_FLUSH MZ_SYNC_FLUSH
#define Z_FULL_FLUSH MZ_FULL_FLUSH
#define Z_FINISH MZ_FINISH
#define Z_BLOCK MZ_BLOCK
#define Z_OK MZ_OK
#define Z_STREAM_END MZ_STREAM_END
#define Z_NEED_DICT MZ_NEED_DICT
#define Z_ERRNO MZ_ERRNO
#define Z_STREAM_ERROR MZ_STREAM_ERROR
#define Z_DATA_ERROR MZ_DATA_ERROR
#define Z_MEM_ERROR MZ_MEM_ERROR
#define Z_BUF_ERROR MZ_BUF_ERROR
#define Z_VERSION_ERROR MZ_VERSION_ERROR
#define Z_PARAM_ERROR MZ_PARAM_ERROR
#define Z_NO_COMPRESSION MZ_NO_COMPRESSION
#define Z_BEST_SPEED MZ_BEST_SPEED
#define Z_BEST_COMPRESSION MZ_BEST_COMPRESSION
#define Z_DEFAULT_COMPRESSION MZ_DEFAULT_COMPRESSION
#define Z_DEFAULT_STRATEGY MZ_DEFAULT_STRATEGY
#define Z_FILTERED MZ_FILTERED
#define Z_HUFFMAN_ONLY MZ_HUFFMAN_ONLY
#define Z_RLE MZ_RLE
#define Z_FIXED MZ_FIXED
#define Z_DEFLATED MZ_DEFLATED
#define Z_DEFAULT_WINDOW_BITS MZ_DEFAULT_WINDOW_BITS
#define alloc_func mz_alloc_func
#define free_func mz_free_func
#define internal_state mz_internal_state
#define z_stream mz_stream
#define deflateInit mz_deflateInit
#define deflateInit2 mz_deflateInit2
#define deflateReset mz_deflateReset
#define deflate mz_deflate
#define deflateEnd mz_deflateEnd
#define deflateBound mz_deflateBound
#define compress mz_compress
#define compress2 mz_compress2
#define compressBound mz_compressBound
#define inflateInit mz_inflateInit
#define inflateInit2 mz_inflateInit2
#define inflate mz_inflate
#define inflateEnd mz_inflateEnd
#define uncompress mz_uncompress
#define crc32 mz_crc32
#define adler32 mz_adler32
#define MAX_WBITS 15
#define MAX_MEM_LEVEL 9
#define zError mz_error
#define ZLIB_VERSION MZ_VERSION
#define ZLIB_VERNUM MZ_VERNUM
#define ZLIB_VER_MAJOR MZ_VER_MAJOR
#define ZLIB_VER_MINOR MZ_VER_MINOR
#define ZLIB_VER_REVISION MZ_VER_REVISION
#define ZLIB_VER_SUBREVISION MZ_VER_SUBREVISION
#define zlibVersion mz_version
#define zlib_version mz_version()
#endif // #ifndef MINIZ_NO_ZLIB_COMPATIBLE_NAMES
#endif // MINIZ_NO_ZLIB_APIS
// ------------------- Types and macros
typedef unsigned char mz_uint8;
typedef signed short mz_int16;
typedef unsigned short mz_uint16;
typedef unsigned int mz_uint32;
typedef unsigned int mz_uint;
typedef long long mz_int64;
typedef unsigned long long mz_uint64;
typedef int mz_bool;
#define MZ_FALSE (0)
#define MZ_TRUE (1)
// An attempt to work around MSVC's spammy "warning C4127: conditional expression is constant" message.
#ifdef _MSC_VER
#define MZ_MACRO_END while (0, 0)
#else
#define MZ_MACRO_END while (0)
#endif
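// Why a macro terminator like MZ_MACRO_END is useful: wrapping a multi-statement macro body in
// do { ... } while (0) turns it into a single statement, so it behaves correctly in an unbraced
// if/else. The macro and function below are hypothetical and only illustrate the idiom.

```c
// Hypothetical macro for illustration; the terminator is spelled out as while (0) here.
#define EXAMPLE_SWAP_INT(a, b) do { int t_ = (a); (a) = (b); (b) = t_; } while (0)

// Because the macro expands to one statement, the else below still pairs with the if.
static int example_min_first(int *x, int *y)
{
    if (*x > *y)
        EXAMPLE_SWAP_INT(*x, *y);
    else
        return 0; // already ordered, nothing swapped
    return 1;     // swapped
}
```

// Without the do/while wrapper, the braces plus the trailing semicolon would end the if statement
// early and the else would fail to compile.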
// ------------------- ZIP archive reading/writing
#ifndef MINIZ_NO_ARCHIVE_APIS
enum
{
MZ_ZIP_MAX_IO_BUF_SIZE = 64*1024,
MZ_ZIP_MAX_ARCHIVE_FILENAME_SIZE = 260,
MZ_ZIP_MAX_ARCHIVE_FILE_COMMENT_SIZE = 256
};
typedef struct
{
mz_uint32 m_file_index;
mz_uint32 m_central_dir_ofs;
mz_uint16 m_version_made_by;
mz_uint16 m_version_needed;
mz_uint16 m_bit_flag;
mz_uint16 m_method;
#ifndef MINIZ_NO_TIME
time_t m_time;
#endif
mz_uint32 m_crc32;
mz_uint64 m_comp_size;
mz_uint64 m_uncomp_size;
mz_uint16 m_internal_attr;
mz_uint32 m_external_attr;
mz_uint64 m_local_header_ofs;
mz_uint32 m_comment_size;
char m_filename[MZ_ZIP_MAX_ARCHIVE_FILENAME_SIZE];
char m_comment[MZ_ZIP_MAX_ARCHIVE_FILE_COMMENT_SIZE];
} mz_zip_archive_file_stat;
typedef size_t (*mz_file_read_func)(void *pOpaque, mz_uint64 file_ofs, void *pBuf, size_t n);
typedef size_t (*mz_file_write_func)(void *pOpaque, mz_uint64 file_ofs, const void *pBuf, size_t n);
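// A minimal sketch of a callback with the same shape as mz_file_write_func, writing into a fixed
// in-memory buffer. The names example_mem_sink and example_mem_write are hypothetical, and the
// convention that a short return count means failure is an assumption, not something stated here.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned long long example_uint64; // stands in for mz_uint64

// Hypothetical user state that would be passed through pOpaque.
typedef struct
{
    unsigned char data[64];
    size_t size; // one past the highest offset written so far
} example_mem_sink;

// Writes n bytes at file_ofs and returns the number of bytes written.
// Returns 0 if the write would not fit in the fixed buffer.
static size_t example_mem_write(void *pOpaque, example_uint64 file_ofs, const void *pBuf, size_t n)
{
    example_mem_sink *sink = (example_mem_sink *)pOpaque;
    if (file_ofs > sizeof(sink->data) || n > sizeof(sink->data) - (size_t)file_ofs)
        return 0;
    memcpy(sink->data + (size_t)file_ofs, pBuf, n);
    if ((size_t)file_ofs + n > sink->size)
        sink->size = (size_t)file_ofs + n;
    return n;
}
```

// The offset is passed on every call, so the callback needs no seek state of its own.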
struct mz_zip_internal_state_tag;
typedef struct mz_zip_internal_state_tag mz_zip_internal_state;
typedef enum
{
MZ_ZIP_MODE_INVALID = 0,
MZ_ZIP_MODE_READING = 1,
MZ_ZIP_MODE_WRITING = 2,
MZ_ZIP_MODE_WRITING_HAS_BEEN_FINALIZED = 3
} mz_zip_mode;
typedef struct mz_zip_archive_tag
{
mz_uint64 m_archive_size;
mz_uint64 m_central_directory_file_ofs;
mz_uint m_total_files;
mz_zip_mode m_zip_mode;
mz_uint m_file_offset_alignment;
mz_alloc_func m_pAlloc;
mz_free_func m_pFree;
mz_realloc_func m_pRealloc;
void *m_pAlloc_opaque;
mz_file_read_func m_pRead;
mz_file_write_func m_pWrite;
void *m_pIO_opaque;
mz_zip_internal_state *m_pState;
} mz_zip_archive;
typedef enum
{
MZ_ZIP_FLAG_CASE_SENSITIVE = 0x0100,
MZ_ZIP_FLAG_IGNORE_PATH = 0x0200,
MZ_ZIP_FLAG_COMPRESSED_DATA = 0x0400,
MZ_ZIP_FLAG_DO_NOT_SORT_CENTRAL_DIRECTORY = 0x0800
} mz_zip_flags;
// ZIP archive reading
// Inits a ZIP archive reader.
// These functions read and validate the archive's central directory.
mz_bool mz_zip_reader_init(mz_zip_archive *pZip, mz_uint64 size, mz_uint32 flags);
mz_bool mz_zip_reader_init_mem(mz_zip_archive *pZip, const void *pMem, size_t size, mz_uint32 flags);
#ifndef MINIZ_NO_STDIO
mz_bool mz_zip_reader_init_file(mz_zip_archive *pZip, const char *pFilename, mz_uint32 flags);
#endif
// Returns the total number of files in the archive.
mz_uint mz_zip_reader_get_num_files(mz_zip_archive *pZip);
// Returns detailed information about an archive file entry.
mz_bool mz_zip_reader_file_stat(mz_zip_archive *pZip, mz_uint file_index, mz_zip_archive_file_stat *pStat);
// Determines if an archive file entry is a directory entry.
mz_bool mz_zip_reader_is_file_a_directory(mz_zip_archive *pZip, mz_uint file_index);
mz_bool mz_zip_reader_is_file_encrypted(mz_zip_archive *pZip, mz_uint file_index);
// Retrieves the filename of an archive file entry.
// Returns the number of bytes written to pFilename, or if filename_buf_size is 0 this function returns the number of bytes needed to fully store the filename.
mz_uint mz_zip_reader_get_filename(mz_zip_archive *pZip, mz_uint file_index, char *pFilename, mz_uint filename_buf_size);
// Attempts to locate a file in the archive's central directory.
// Valid flags: MZ_ZIP_FLAG_CASE_SENSITIVE, MZ_ZIP_FLAG_IGNORE_PATH
// Returns -1 if the file cannot be found.
int mz_zip_reader_locate_file(mz_zip_archive *pZip, const char *pName, const char *pComment, mz_uint flags);
// Extracts an archive file to a memory buffer using no memory allocation.
mz_bool mz_zip_reader_extract_to_mem_no_alloc(mz_zip_archive *pZip, mz_uint file_index, void *pBuf, size_t buf_size, mz_uint flags, void *pUser_read_buf, size_t user_read_buf_size);
mz_bool mz_zip_reader_extract_file_to_mem_no_alloc(mz_zip_archive *pZip, const char *pFilename, void *pBuf, size_t buf_size, mz_uint flags, void *pUser_read_buf, size_t user_read_buf_size);
// Extracts an archive file to a memory buffer.
mz_bool mz_zip_reader_extract_to_mem(mz_zip_archive *pZip, mz_uint file_index, void *pBuf, size_t buf_size, mz_uint flags);
mz_bool mz_zip_reader_extract_file_to_mem(mz_zip_archive *pZip, const char *pFilename, void *pBuf, size_t buf_size, mz_uint flags);
// Extracts an archive file to a dynamically allocated heap buffer.
void *mz_zip_reader_extract_to_heap(mz_zip_archive *pZip, mz_uint file_index, size_t *pSize, mz_uint flags);
void *mz_zip_reader_extract_file_to_heap(mz_zip_archive *pZip, const char *pFilename, size_t *pSize, mz_uint flags);
// Extracts an archive file using a callback function to output the file's data.
mz_bool mz_zip_reader_extract_to_callback(mz_zip_archive *pZip, mz_uint file_index, mz_file_write_func pCallback, void *pOpaque, mz_uint flags);
mz_bool mz_zip_reader_extract_file_to_callback(mz_zip_archive *pZip, const char *pFilename, mz_file_write_func pCallback, void *pOpaque, mz_uint flags);
#ifndef MINIZ_NO_STDIO
// Extracts an archive file to a disk file and sets its last accessed and modified times.
// This function only extracts files, not archive directory records.
mz_bool mz_zip_reader_extract_to_file(mz_zip_archive *pZip, mz_uint file_index, const char *pDst_filename, mz_uint flags);
mz_bool mz_zip_reader_extract_file_to_file(mz_zip_archive *pZip, const char *pArchive_filename, const char *pDst_filename, mz_uint flags);
#endif
// Ends archive reading, freeing all allocations, and closing the input archive file if mz_zip_reader_init_file() was used.
mz_bool mz_zip_reader_end(mz_zip_archive *pZip);
// ZIP archive writing
#ifndef MINIZ_NO_ARCHIVE_WRITING_APIS
// Inits a ZIP archive writer.
mz_bool mz_zip_writer_init(mz_zip_archive *pZip, mz_uint64 existing_size);
mz_bool mz_zip_writer_init_heap(mz_zip_archive *pZip, size_t size_to_reserve_at_beginning, size_t initial_allocation_size);
#ifndef MINIZ_NO_STDIO
mz_bool mz_zip_writer_init_file(mz_zip_archive *pZip, const char *pFilename, mz_uint64 size_to_reserve_at_beginning);
#endif
// Converts a ZIP archive reader object into a writer object, to allow efficient in-place file appends to occur on an existing archive.
// For archives opened using mz_zip_reader_init_file, pFilename must be the archive's filename so it can be reopened for writing. If the file can't be reopened, mz_zip_reader_end() will be called.
// For archives opened using mz_zip_reader_init_mem, the memory block must be growable using the realloc callback (which defaults to realloc unless you've overridden it).
// Finally, for archives opened using mz_zip_reader_init, the mz_zip_archive's user provided m_pWrite function cannot be NULL.
// Note: In-place archive modification is not recommended unless you know what you're doing, because if execution stops or something goes wrong before
// the archive is finalized the file's central directory will be hosed.
mz_bool mz_zip_writer_init_from_reader(mz_zip_archive *pZip, const char *pFilename);
// Adds the contents of a memory buffer to an archive. These functions record the current local time into the archive.
// To add a directory entry, call this function with an archive name ending in a forward slash and an empty buffer.
// level_and_flags - compression level (0-10, see MZ_BEST_SPEED, MZ_BEST_COMPRESSION, etc.) logically OR'd with zero or more mz_zip_flags, or just set to MZ_DEFAULT_COMPRESSION.
mz_bool mz_zip_writer_add_mem(mz_zip_archive *pZip, const char *pArchive_name, const void *pBuf, size_t buf_size, mz_uint level_and_flags);
mz_bool mz_zip_writer_add_mem_ex(mz_zip_archive *pZip, const char *pArchive_name, const void *pBuf, size_t buf_size, const void *pComment, mz_uint16 comment_size, mz_uint level_and_flags, mz_uint64 uncomp_size, mz_uint32 uncomp_crc32);
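// A sketch of how a level_and_flags value can be composed and split. It assumes the compression
// level lives in the low 4 bits, which is enough for 0-10 and cannot collide with mz_zip_flags
// (the smallest flag is 0x0100). The helpers are hypothetical and the constants are copied from the enums above.

```c
// Values copied from the compression-level and mz_zip_flags enums.
enum { EXAMPLE_BEST_COMPRESSION = 9, EXAMPLE_ZIP_FLAG_COMPRESSED_DATA = 0x0400 };

// Assumption: the level occupies the low 4 bits of level_and_flags.
static unsigned int example_level_of(unsigned int level_and_flags)
{
    return level_and_flags & 0xFu;
}

// Everything above the low 4 bits is treated as flag bits.
static unsigned int example_flags_of(unsigned int level_and_flags)
{
    return level_and_flags & ~0xFu;
}
```

// For example, EXAMPLE_BEST_COMPRESSION | EXAMPLE_ZIP_FLAG_COMPRESSED_DATA splits back into
// level 9 and flag 0x0400.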
#ifndef MINIZ_NO_STDIO
// Adds the contents of a disk file to an archive. This function also records the disk file's modified time into the archive.
// level_and_flags - compression level (0-10, see MZ_BEST_SPEED, MZ_BEST_COMPRESSION, etc.) logically OR'd with zero or more mz_zip_flags, or just set to MZ_DEFAULT_COMPRESSION.
mz_bool mz_zip_writer_add_file(mz_zip_archive *pZip, const char *pArchive_name, const char *pSrc_filename, const void *pComment, mz_uint16 comment_size, mz_uint level_and_flags);
#endif
// Adds a file to an archive by fully cloning the data from another archive.
// This function fully clones the source file's compressed data (no recompression), along with its full filename, extra data, and comment fields.
mz_bool mz_zip_writer_add_from_zip_reader(mz_zip_archive *pZip, mz_zip_archive *pSource_zip, mz_uint file_index);
// Finalizes the archive by writing the central directory records followed by the end of central directory record.
// After an archive is finalized, the only valid call on the mz_zip_archive struct is mz_zip_writer_end().
// An archive must be manually finalized by calling this function for it to be valid.
mz_bool mz_zip_writer_finalize_archive(mz_zip_archive *pZip);
mz_bool mz_zip_writer_finalize_heap_archive(mz_zip_archive *pZip, void **pBuf, size_t *pSize);
// Ends archive writing, freeing all allocations, and closing the output file if mz_zip_writer_init_file() was used.
// Note for the archive to be valid, it must have been finalized before ending.
mz_bool mz_zip_writer_end(mz_zip_archive *pZip);
// Misc. high-level helper functions:
// mz_zip_add_mem_to_archive_file_in_place() efficiently (but not atomically) appends a memory blob to a ZIP archive.
// level_and_flags - compression level (0-10, see MZ_BEST_SPEED, MZ_BEST_COMPRESSION, etc.) logically OR'd with zero or more mz_zip_flags, or just set to MZ_DEFAULT_COMPRESSION.
mz_bool mz_zip_add_mem_to_archive_file_in_place(const char *pZip_filename, const char *pArchive_name, const void *pBuf, size_t buf_size, const void *pComment, mz_uint16 comment_size, mz_uint level_and_flags);
// Reads a single file from an archive into a heap block.
// Returns NULL on failure.
void *mz_zip_extract_archive_file_to_heap(const char *pZip_filename, const char *pArchive_name, size_t *pSize, mz_uint zip_flags);
#endif // #ifndef MINIZ_NO_ARCHIVE_WRITING_APIS
#endif // #ifndef MINIZ_NO_ARCHIVE_APIS
// ------------------- Low-level Decompression API Definitions
// Decompression flags used by tinfl_decompress().
// TINFL_FLAG_PARSE_ZLIB_HEADER: If set, the input has a valid zlib header and ends with an adler32 checksum (it's a valid zlib stream). Otherwise, the input is a raw deflate stream.
// TINFL_FLAG_HAS_MORE_INPUT: If set, there are more input bytes available beyond the end of the supplied input buffer. If clear, the input buffer contains all remaining input.
// TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF: If set, the output buffer is large enough to hold the entire decompressed stream. If clear, the output buffer is at least the size of the dictionary (typically 32KB).
// TINFL_FLAG_COMPUTE_ADLER32: Force adler-32 checksum computation of the decompressed bytes.
enum
{
TINFL_FLAG_PARSE_ZLIB_HEADER = 1,
TINFL_FLAG_HAS_MORE_INPUT = 2,
TINFL_FLAG_USING_NON_WRAPPING_OUTPUT_BUF = 4,
TINFL_FLAG_COMPUTE_ADLER32 = 8
};
// High level decompression functions:
// tinfl_decompress_mem_to_heap() decompresses a block in memory to a heap block allocated via malloc().
// On entry:
// pSrc_buf, src_buf_len: Pointer and size of the Deflate or zlib source data to decompress.
// On return:
// Function returns a pointer to the decompressed data, or NULL on failure.
// *pOut_len will be set to the decompressed data's size, which could be larger than src_buf_len on uncompressible data.
// The caller must call mz_free() on the returned block when it's no longer needed.
void *tinfl_decompress_mem_to_heap(const void *pSrc_buf, size_t src_buf_len, size_t *pOut_len, int flags);
// tinfl_decompress_mem_to_mem() decompresses a block in memory to another block in memory.
// Returns TINFL_DECOMPRESS_MEM_TO_MEM_FAILED on failure, or the number of bytes written on success.
#define TINFL_DECOMPRESS_MEM_TO_MEM_FAILED ((size_t)(-1))
size_t tinfl_decompress_mem_to_mem(void *pOut_buf, size_t out_buf_len, const void *pSrc_buf, size_t src_buf_len, int flags);
// tinfl_decompress_mem_to_callback() decompresses a block in memory to an internal 32KB buffer, and a user provided callback function will be called to flush the buffer.
// Returns 1 on success or 0 on failure.
typedef int (*tinfl_put_buf_func_ptr)(const void* pBuf, int len, void *pUser);
int tinfl_decompress_mem_to_callback(const void *pIn_buf, size_t *pIn_buf_size, tinfl_put_buf_func_ptr pPut_buf_func, void *pPut_buf_user, int flags);
struct tinfl_decompressor_tag; typedef struct tinfl_decompressor_tag tinfl_decompressor;
// Max size of LZ dictionary.
#define TINFL_LZ_DICT_SIZE 32768
// Return status.
typedef enum
{
TINFL_STATUS_BAD_PARAM = -3,
TINFL_STATUS_ADLER32_MISMATCH = -2,
TINFL_STATUS_FAILED = -1,
TINFL_STATUS_DONE = 0,
TINFL_STATUS_NEEDS_MORE_INPUT = 1,
TINFL_STATUS_HAS_MORE_OUTPUT = 2
} tinfl_status;
// Initializes the decompressor to its initial state.
#define tinfl_init(r) do { (r)->m_state = 0; } MZ_MACRO_END
#define tinfl_get_adler32(r) (r)->m_check_adler32
// Main low-level decompressor coroutine function. This is the only function actually needed for decompression. All the other functions are just high-level helpers for improved usability.
// This is a universal API, i.e. it can be used as a building block to build any desired higher level decompression API. In the limit case, it can be called once per every byte input or output.
tinfl_status tinfl_decompress(tinfl_decompressor *r, const mz_uint8 *pIn_buf_next, size_t *pIn_buf_size, mz_uint8 *pOut_buf_start, mz_uint8 *pOut_buf_next, size_t *pOut_buf_size, const mz_uint32 decomp_flags);
// Internal/private bits follow.
enum
{
TINFL_MAX_HUFF_TABLES = 3, TINFL_MAX_HUFF_SYMBOLS_0 = 288, TINFL_MAX_HUFF_SYMBOLS_1 = 32, TINFL_MAX_HUFF_SYMBOLS_2 = 19,
TINFL_FAST_LOOKUP_BITS = 10, TINFL_FAST_LOOKUP_SIZE = 1 << TINFL_FAST_LOOKUP_BITS
};
typedef struct
{
mz_uint8 m_code_size[TINFL_MAX_HUFF_SYMBOLS_0];
mz_int16 m_look_up[TINFL_FAST_LOOKUP_SIZE], m_tree[TINFL_MAX_HUFF_SYMBOLS_0 * 2];
} tinfl_huff_table;
#if MINIZ_HAS_64BIT_REGISTERS
#define TINFL_USE_64BIT_BITBUF 1
#endif
#if TINFL_USE_64BIT_BITBUF
typedef mz_uint64 tinfl_bit_buf_t;
#define TINFL_BITBUF_SIZE (64)
#else
typedef mz_uint32 tinfl_bit_buf_t;
#define TINFL_BITBUF_SIZE (32)
#endif
struct tinfl_decompressor_tag
{
mz_uint32 m_state, m_num_bits, m_zhdr0, m_zhdr1, m_z_adler32, m_final, m_type, m_check_adler32, m_dist, m_counter, m_num_extra, m_table_sizes[TINFL_MAX_HUFF_TABLES];
tinfl_bit_buf_t m_bit_buf;
size_t m_dist_from_out_buf_start;
tinfl_huff_table m_tables[TINFL_MAX_HUFF_TABLES];
mz_uint8 m_raw_header[4], m_len_codes[TINFL_MAX_HUFF_SYMBOLS_0 + TINFL_MAX_HUFF_SYMBOLS_1 + 137];
};
// ------------------- Low-level Compression API Definitions
// Set TDEFL_LESS_MEMORY to 1 to use less memory (compression will be slightly slower, and raw/dynamic blocks will be output more frequently).
#define TDEFL_LESS_MEMORY 0
// tdefl_init() compression flags logically OR'd together (low 12 bits contain the max. number of probes per dictionary search):
// TDEFL_DEFAULT_MAX_PROBES: The compressor defaults to 128 dictionary probes per dictionary search. 0=Huffman only, 1=Huffman+LZ (fastest/crap compression), 4095=Huffman+LZ (slowest/best compression).
enum
{
TDEFL_HUFFMAN_ONLY = 0, TDEFL_DEFAULT_MAX_PROBES = 128, TDEFL_MAX_PROBES_MASK = 0xFFF
};
// TDEFL_WRITE_ZLIB_HEADER: If set, the compressor outputs a zlib header before the deflate data, and the Adler-32 of the source data at the end. Otherwise, you'll get raw deflate data.
// TDEFL_COMPUTE_ADLER32: Always compute the adler-32 of the input data (even when not writing zlib headers).
// TDEFL_GREEDY_PARSING_FLAG: Set to use faster greedy parsing, instead of more efficient lazy parsing.
// TDEFL_NONDETERMINISTIC_PARSING_FLAG: Enable to decrease the compressor's initialization time to the minimum, but the output may vary from run to run given the same input (depending on the contents of memory).
// TDEFL_RLE_MATCHES: Only look for RLE matches (matches with a distance of 1)
// TDEFL_FILTER_MATCHES: Discards matches <= 5 chars if enabled.
// TDEFL_FORCE_ALL_STATIC_BLOCKS: Disable usage of optimized Huffman tables.
// TDEFL_FORCE_ALL_RAW_BLOCKS: Only use raw (uncompressed) deflate blocks.
// The low 12 bits are reserved to control the max # of hash probes per dictionary lookup (see TDEFL_MAX_PROBES_MASK).
enum
{
TDEFL_WRITE_ZLIB_HEADER = 0x01000,
TDEFL_COMPUTE_ADLER32 = 0x02000,
TDEFL_GREEDY_PARSING_FLAG = 0x04000,
TDEFL_NONDETERMINISTIC_PARSING_FLAG = 0x08000,
TDEFL_RLE_MATCHES = 0x10000,
TDEFL_FILTER_MATCHES = 0x20000,
TDEFL_FORCE_ALL_STATIC_BLOCKS = 0x40000,
TDEFL_FORCE_ALL_RAW_BLOCKS = 0x80000
};
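// A minimal sketch of building a tdefl flags word from the two enums above: the probe count goes in
// the low 12 bits and the option bits are OR'd on top. example_make_tdefl_flags() is a hypothetical
// helper; the constants are copied from this header so the snippet stands alone.

```c
// Values copied from the enums above.
enum
{
    EXAMPLE_MAX_PROBES_MASK = 0xFFF,
    EXAMPLE_DEFAULT_MAX_PROBES = 128,
    EXAMPLE_WRITE_ZLIB_HEADER = 0x01000,
    EXAMPLE_GREEDY_PARSING_FLAG = 0x04000
};

// Combines a probe count (clamped to the low 12 bits by masking) with option bits
// (anything in the low 12 bits of option_bits is discarded).
static int example_make_tdefl_flags(int max_probes, int option_bits)
{
    return (max_probes & EXAMPLE_MAX_PROBES_MASK) | (option_bits & ~EXAMPLE_MAX_PROBES_MASK);
}
```

// The probe count can be read back with (flags & EXAMPLE_MAX_PROBES_MASK).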
// High level compression functions:
// tdefl_compress_mem_to_heap() compresses a block in memory to a heap block allocated via malloc().
// On entry:
// pSrc_buf, src_buf_len: Pointer and size of source block to compress.
// flags: The max match finder probes (default is 128) logically OR'd against the above flags. Higher probes are slower but improve compression.
// On return:
// Function returns a pointer to the compressed data, or NULL on failure.
// *pOut_len will be set to the compressed data's size, which could be larger than src_buf_len on uncompressible data.
// The caller must free() the returned block when it's no longer needed.
void *tdefl_compress_mem_to_heap(const void *pSrc_buf, size_t src_buf_len, size_t *pOut_len, int flags);
// tdefl_compress_mem_to_mem() compresses a block in memory to another block in memory.
// Returns 0 on failure.
size_t tdefl_compress_mem_to_mem(void *pOut_buf, size_t out_buf_len, const void *pSrc_buf, size_t src_buf_len, int flags);
// Compresses an image to a compressed PNG file in memory.
// On entry:
// pImage, w, h, and num_chans describe the image to compress. num_chans may be 1, 2, 3, or 4.
// The image pitch in bytes per scanline will be w*num_chans. The leftmost pixel on the top scanline is stored first in memory.
// level may range from [0,10], use MZ_NO_COMPRESSION, MZ_BEST_SPEED, MZ_BEST_COMPRESSION, etc. or a decent default is MZ_DEFAULT_LEVEL
// If flip is true, the image will be flipped on the Y axis (useful for OpenGL apps).
// On return:
// Function returns a pointer to the compressed data, or NULL on failure.
// *pLen_out will be set to the size of the PNG image file.
// The caller must mz_free() the returned heap block (which will typically be larger than *pLen_out) when it's no longer needed.
void *tdefl_write_image_to_png_file_in_memory_ex(const void *pImage, int w, int h, int num_chans, size_t *pLen_out, mz_uint level, mz_bool flip);
void *tdefl_write_image_to_png_file_in_memory(const void *pImage, int w, int h, int num_chans, size_t *pLen_out);
// Output stream interface. The compressor uses this interface to write compressed data. It'll typically be called with TDEFL_OUT_BUF_SIZE bytes at a time.
typedef mz_bool (*tdefl_put_buf_func_ptr)(const void* pBuf, int len, void *pUser);
// tdefl_compress_mem_to_output() compresses a block to an output stream. The above helpers use this function internally.
mz_bool tdefl_compress_mem_to_output(const void *pBuf, size_t buf_len, tdefl_put_buf_func_ptr pPut_buf_func, void *pPut_buf_user, int flags);
enum { TDEFL_MAX_HUFF_TABLES = 3, TDEFL_MAX_HUFF_SYMBOLS_0 = 288, TDEFL_MAX_HUFF_SYMBOLS_1 = 32, TDEFL_MAX_HUFF_SYMBOLS_2 = 19, TDEFL_LZ_DICT_SIZE = 32768, TDEFL_LZ_DICT_SIZE_MASK = TDEFL_LZ_DICT_SIZE - 1, TDEFL_MIN_MATCH_LEN = 3, TDEFL_MAX_MATCH_LEN = 258 };
// TDEFL_OUT_BUF_SIZE MUST be large enough to hold a single entire compressed output block (using static/fixed Huffman codes).
#if TDEFL_LESS_MEMORY
enum { TDEFL_LZ_CODE_BUF_SIZE = 24 * 1024, TDEFL_OUT_BUF_SIZE = (TDEFL_LZ_CODE_BUF_SIZE * 13 ) / 10, TDEFL_MAX_HUFF_SYMBOLS = 288, TDEFL_LZ_HASH_BITS = 12, TDEFL_LEVEL1_HASH_SIZE_MASK = 4095, TDEFL_LZ_HASH_SHIFT = (TDEFL_LZ_HASH_BITS + 2) / 3, TDEFL_LZ_HASH_SIZE = 1 << TDEFL_LZ_HASH_BITS };
#else
enum { TDEFL_LZ_CODE_BUF_SIZE = 64 * 1024, TDEFL_OUT_BUF_SIZE = (TDEFL_LZ_CODE_BUF_SIZE * 13 ) / 10, TDEFL_MAX_HUFF_SYMBOLS = 288, TDEFL_LZ_HASH_BITS = 15, TDEFL_LEVEL1_HASH_SIZE_MASK = 4095, TDEFL_LZ_HASH_SHIFT = (TDEFL_LZ_HASH_BITS + 2) / 3, TDEFL_LZ_HASH_SIZE = 1 << TDEFL_LZ_HASH_BITS };
#endif
// The low-level tdefl functions below may be used directly if the above helper functions aren't flexible enough. The low-level functions don't make any heap allocations, unlike the above helper functions.
typedef enum
{
TDEFL_STATUS_BAD_PARAM = -2,
TDEFL_STATUS_PUT_BUF_FAILED = -1,
TDEFL_STATUS_OKAY = 0,
TDEFL_STATUS_DONE = 1,
} tdefl_status;
// Must map to MZ_NO_FLUSH, MZ_SYNC_FLUSH, etc. enums
typedef enum
{
TDEFL_NO_FLUSH = 0,
TDEFL_SYNC_FLUSH = 2,
TDEFL_FULL_FLUSH = 3,
TDEFL_FINISH = 4
} tdefl_flush;
// tdefl's compression state structure.
typedef struct
{
tdefl_put_buf_func_ptr m_pPut_buf_func;
void *m_pPut_buf_user;
mz_uint m_flags, m_max_probes[2];
int m_greedy_parsing;
mz_uint m_adler32, m_lookahead_pos, m_lookahead_size, m_dict_size;
mz_uint8 *m_pLZ_code_buf, *m_pLZ_flags, *m_pOutput_buf, *m_pOutput_buf_end;
mz_uint m_num_flags_left, m_total_lz_bytes, m_lz_code_buf_dict_pos, m_bits_in, m_bit_buffer;
mz_uint m_saved_match_dist, m_saved_match_len, m_saved_lit, m_output_flush_ofs, m_output_flush_remaining, m_finished, m_block_index, m_wants_to_finish;
tdefl_status m_prev_return_status;
const void *m_pIn_buf;
void *m_pOut_buf;
size_t *m_pIn_buf_size, *m_pOut_buf_size;
tdefl_flush m_flush;
const mz_uint8 *m_pSrc;
size_t m_src_buf_left, m_out_buf_ofs;
mz_uint8 m_dict[TDEFL_LZ_DICT_SIZE + TDEFL_MAX_MATCH_LEN - 1];
mz_uint16 m_huff_count[TDEFL_MAX_HUFF_TABLES][TDEFL_MAX_HUFF_SYMBOLS];
mz_uint16 m_huff_codes[TDEFL_MAX_HUFF_TABLES][TDEFL_MAX_HUFF_SYMBOLS];
mz_uint8 m_huff_code_sizes[TDEFL_MAX_HUFF_TABLES][TDEFL_MAX_HUFF_SYMBOLS];
mz_uint8 m_lz_code_buf[TDEFL_LZ_CODE_BUF_SIZE];
mz_uint16 m_next[TDEFL_LZ_DICT_SIZE];
mz_uint16 m_hash[TDEFL_LZ_HASH_SIZE];
mz_uint8 m_output_buf[TDEFL_OUT_BUF_SIZE];
} tdefl_compressor;
// Initializes the compressor.
// There is no corresponding deinit() function because the tdefl API's do not dynamically allocate memory.
// pPut_buf_func: If non-NULL, output data will be supplied to the specified callback. In this case, the user should call the tdefl_compress_buffer() API for compression.
// If pPut_buf_func is NULL the user should always call the tdefl_compress() API.
// flags: See the above enums (TDEFL_HUFFMAN_ONLY, TDEFL_WRITE_ZLIB_HEADER, etc.)
tdefl_status tdefl_init(tdefl_compressor *d, tdefl_put_buf_func_ptr pPut_buf_func, void *pPut_buf_user, int flags);
// Compresses a block of data, consuming as much of the specified input buffer as possible, and writing as much compressed data to the specified output buffer as possible.
tdefl_status tdefl_compress(tdefl_compressor *d, const void *pIn_buf, size_t *pIn_buf_size, void *pOut_buf, size_t *pOut_buf_size, tdefl_flush flush);
// tdefl_compress_buffer() is only usable when tdefl_init() is called with a non-NULL tdefl_put_buf_func_ptr.
// tdefl_compress_buffer() always consumes the entire input buffer.
tdefl_status tdefl_compress_buffer(tdefl_compressor *d, const void *pIn_buf, size_t in_buf_size, tdefl_flush flush);
tdefl_status tdefl_get_prev_return_status(tdefl_compressor *d);
mz_uint32 tdefl_get_adler32(tdefl_compressor *d);
// Can't use tdefl_create_comp_flags_from_zip_params if MINIZ_NO_ZLIB_APIS is defined, because it uses some of the zlib-style macros.
#ifndef MINIZ_NO_ZLIB_APIS
// Create tdefl_compress() flags given zlib-style compression parameters.
// level may range from [0,10] (where 10 is absolute max compression, but may be much slower on some files)
// window_bits may be -15 (raw deflate) or 15 (zlib)
// strategy may be either MZ_DEFAULT_STRATEGY, MZ_FILTERED, MZ_HUFFMAN_ONLY, MZ_RLE, or MZ_FIXED
mz_uint tdefl_create_comp_flags_from_zip_params(int level, int window_bits, int strategy);
#endif // #ifndef MINIZ_NO_ZLIB_APIS
#ifdef __cplusplus
}
#endif
#endif // MINIZ_HEADER_INCLUDED
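
The callback mode described in the tdefl_init() comments can be illustrated with a small output sink. This is only a sketch: mz_bool, MZ_TRUE and MZ_FALSE are redefined locally so that it compiles without miniz, the callback signature is assumed to match miniz's tdefl_put_buf_func_ptr, and byte_sink and sink_put_buf are hypothetical names that do not appear in the header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins so the sketch builds without miniz; real code would take
   these from the miniz header instead. */
typedef int mz_bool;
#define MZ_TRUE 1
#define MZ_FALSE 0

/* Hypothetical growable byte buffer, handed to the callback via pPut_buf_user. */
typedef struct {
    unsigned char *data;
    size_t size;
    size_t capacity;
} byte_sink;

/* Shaped like tdefl_put_buf_func_ptr (assumed signature): appends len bytes
   to the sink and returns MZ_FALSE to signal that compression should stop. */
mz_bool sink_put_buf(const void *pBuf, int len, void *pUser)
{
    byte_sink *sink = (byte_sink *)pUser;
    size_t needed;

    assert(sink != NULL);
    if (len < 0)
        return MZ_FALSE;

    needed = sink->size + (size_t)len;
    if (needed > sink->capacity) {
        /* Grow geometrically so repeated small appends stay cheap. */
        size_t new_cap = sink->capacity ? sink->capacity : 64;
        unsigned char *p;
        while (new_cap < needed)
            new_cap *= 2;
        p = (unsigned char *)realloc(sink->data, new_cap);
        if (p == NULL)
            return MZ_FALSE;
        sink->data = p;
        sink->capacity = new_cap;
    }
    if (len > 0)
        memcpy(sink->data + sink->size, pBuf, (size_t)len);
    sink->size = needed;
    return MZ_TRUE;
}
```

With miniz available, such a sink would presumably be wired up as tdefl_init(&comp, sink_put_buf, &sink, flags) followed by calls to tdefl_compress_buffer(); the caller owns sink.data and frees it afterwards.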


@@ -65,3 +65,6 @@ add_subdirectory(select)
add_subdirectory(thread_dimensions) add_subdirectory(thread_dimensions)
add_subdirectory(vec_align) add_subdirectory(vec_align)
add_subdirectory(vec_step) add_subdirectory(vec_step)
# Add any extension folders
add_subdirectory(spir)


@@ -1,24 +1,24 @@
project project
: requirements : requirements
<library>/harness//harness <library>/harness//harness
<warnings>off <warnings>off
; ;
use-project /harness : ../test_common/harness ; use-project /harness : ../test_common/harness ;
proj_lst = allocations api atomics basic buffers commonfns compiler proj_lst = allocations api atomics basic buffers commonfns compiler
computeinfo contractions conversions events geometrics gl computeinfo contractions conversions events geometrics gl
half images integer_ops math_brute_force multiple_device_context half images integer_ops math_brute_force multiple_device_context
profiling relationals select thread_dimensions ; profiling relationals select thread_dimensions ;
for proj in $(proj_lst) for proj in $(proj_lst)
{ {
build-project $(proj) ; build-project $(proj) ;
} }
install data install data
: [ glob *.csv ] [ glob *.py ] : [ glob *.csv ] [ glob *.py ]
: <variant>debug:<location>$(DIST)/debug/tests/test_conformance : <variant>debug:<location>$(DIST)/debug/tests/test_conformance
<variant>release:<location>$(DIST)/release/tests/test_conformance <variant>release:<location>$(DIST)/release/tests/test_conformance
; ;


@@ -1,53 +1,53 @@
PRODUCTS = \ PRODUCTS = \
allocations/ \ allocations/ \
api/ \ api/ \
atomics/ \ atomics/ \
basic/ \ basic/ \
buffers/ \ buffers/ \
commonfns/ \ commonfns/ \
compiler/ \ compiler/ \
computeinfo/ \ computeinfo/ \
contractions/ \ contractions/ \
conversions/ \ conversions/ \
device_partition/ \ device_partition/ \
events/ \ events/ \
geometrics/ \ geometrics/ \
gl/ \ gl/ \
half/ \ half/ \
headers/ \ headers/ \
images/ \ images/ \
integer_ops/ \ integer_ops/ \
math_brute_force/ \ math_brute_force/ \
mem_host_flags/ \ mem_host_flags/ \
multiple_device_context/ \ multiple_device_context/ \
printf/ \ printf/ \
profiling/ \ profiling/ \
relationals/ \ relationals/ \
select/ \ select/ \
thread_dimensions/ \ thread_dimensions/ \
vec_align/ \ vec_align/ \
vec_step/ vec_step/
TOP=$(shell pwd) TOP=$(shell pwd)
all: $(PRODUCTS) all: $(PRODUCTS)
clean: clean:
@for testdir in $(dir $(PRODUCTS)) ; \ @for testdir in $(dir $(PRODUCTS)) ; \
do ( \ do ( \
echo "==================================================================================" ; \ echo "==================================================================================" ; \
echo "Cleaning $$testdir" ; \ echo "Cleaning $$testdir" ; \
echo "==================================================================================" ; \ echo "==================================================================================" ; \
cd $$testdir && $(MAKE) clean \ cd $$testdir && $(MAKE) clean \
); \ ); \
done \ done \
$(PRODUCTS): $(PRODUCTS):
@echo "==================================================================================" ; @echo "==================================================================================" ;
@echo "(`date "+%H:%M:%S"`) Make $@" ; @echo "(`date "+%H:%M:%S"`) Make $@" ;
@echo "==================================================================================" ; @echo "==================================================================================" ;
cd $(dir $@) && $(MAKE) -i cd $(dir $@) && $(MAKE) -i
.PHONY: clean $(PRODUCTS) all .PHONY: clean $(PRODUCTS) all


@@ -1,19 +1,19 @@
project project
: requirements : requirements
# <toolset>gcc:<cflags>-xc++ # <toolset>gcc:<cflags>-xc++
# <toolset>msvc:<cflags>"/TP" # <toolset>msvc:<cflags>"/TP"
; ;
exe test_allocations exe test_allocations
: allocation_execute.cpp : allocation_execute.cpp
allocation_fill.cpp allocation_fill.cpp
allocation_functions.cpp allocation_functions.cpp
allocation_utils.cpp allocation_utils.cpp
main.cpp main.cpp
; ;
install dist install dist
: test_allocations : test_allocations
: <variant>debug:<location>$(DIST)/debug/tests/test_conformance/allocations : <variant>debug:<location>$(DIST)/debug/tests/test_conformance/allocations
<variant>release:<location>$(DIST)/release/tests/test_conformance/allocations <variant>release:<location>$(DIST)/release/tests/test_conformance/allocations
; ;


@@ -1,46 +1,46 @@
ifdef BUILD_WITH_ATF ifdef BUILD_WITH_ATF
ATF = -framework ATF ATF = -framework ATF
USE_ATF = -DUSE_ATF USE_ATF = -DUSE_ATF
endif endif
SRCS = main.cpp \ SRCS = main.cpp \
allocation_functions.cpp \ allocation_functions.cpp \
allocation_fill.cpp \ allocation_fill.cpp \
allocation_utils.cpp \ allocation_utils.cpp \
allocation_execute.cpp \ allocation_execute.cpp \
../../test_common/harness/errorHelpers.c \ ../../test_common/harness/errorHelpers.c \
../../test_common/harness/threadTesting.c \ ../../test_common/harness/threadTesting.c \
../../test_common/harness/kernelHelpers.c \ ../../test_common/harness/kernelHelpers.c \
../../test_common/harness/testHarness.c \ ../../test_common/harness/testHarness.c \
../../test_common/harness/mt19937.c \ ../../test_common/harness/mt19937.c \
../../test_common/harness/typeWrappers.cpp ../../test_common/harness/typeWrappers.cpp
DEFINES = DONT_TEST_GARBAGE_POINTERS DEFINES = DONT_TEST_GARBAGE_POINTERS
SOURCES = $(abspath $(SRCS)) SOURCES = $(abspath $(SRCS))
LIBPATH += -L/System/Library/Frameworks/OpenCL.framework/Libraries LIBPATH += -L/System/Library/Frameworks/OpenCL.framework/Libraries
LIBPATH += -L. LIBPATH += -L.
FRAMEWORK = $(SOURCES) FRAMEWORK = $(SOURCES)
HEADERS = HEADERS =
TARGET = test_allocations TARGET = test_allocations
INCLUDE = INCLUDE =
COMPILERFLAGS = -c -Wall -g -Wshorten-64-to-32 -Os COMPILERFLAGS = -c -Wall -g -Wshorten-64-to-32 -Os
CC = c++ CC = c++
CFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE) CFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE)
CXXFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE) CXXFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE)
LIBRARIES = -framework OpenCL -framework OpenGL -framework GLUT -framework AppKit ${ATF} LIBRARIES = -framework OpenCL -framework OpenGL -framework GLUT -framework AppKit ${ATF}
OBJECTS := ${SOURCES:.c=.o} OBJECTS := ${SOURCES:.c=.o}
OBJECTS := ${OBJECTS:.cpp=.o} OBJECTS := ${OBJECTS:.cpp=.o}
TARGETOBJECT = TARGETOBJECT =
all: $(TARGET) all: $(TARGET)
$(TARGET): $(OBJECTS) $(TARGET): $(OBJECTS)
$(CC) $(RC_CFLAGS) $(OBJECTS) -o $@ $(LIBPATH) $(LIBRARIES) $(CC) $(RC_CFLAGS) $(OBJECTS) -o $@ $(LIBPATH) $(LIBRARIES)
clean: clean:
rm -f $(TARGET) $(OBJECTS) rm -f $(TARGET) $(OBJECTS)
.DEFAULT: .DEFAULT:
@echo The target \"$@\" does not exist in Makefile. @echo The target \"$@\" does not exist in Makefile.


@@ -1,333 +1,333 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "allocation_execute.h" #include "allocation_execute.h"
#include "allocation_functions.h" #include "allocation_functions.h"
const char *buffer_kernel_pattern = { const char *buffer_kernel_pattern = {
"__kernel void sample_test(%s __global uint *result, __global uint *array_sizes, uint per_item)\n" "__kernel void sample_test(%s __global uint *result, __global uint *array_sizes, uint per_item)\n"
"{\n" "{\n"
"\tint tid = get_global_id(0);\n" "\tint tid = get_global_id(0);\n"
"\tuint r = 0;\n" "\tuint r = 0;\n"
"\tulong i;\n" "\tulong i;\n"
"\tfor(i=tid*per_item; i<(1+tid)*per_item; i++) {\n" "\tfor(i=tid*per_item; i<(1+tid)*per_item; i++) {\n"
"%s" "%s"
"\t}\n" "\t}\n"
"\tresult[tid] = r;\n" "\tresult[tid] = r;\n"
"}\n" }; "}\n" };
const char *image_kernel_pattern = { const char *image_kernel_pattern = {
"__kernel void sample_test(%s __global uint *result)\n" "__kernel void sample_test(%s __global uint *result)\n"
"{\n" "{\n"
"\tuint4 color;\n" "\tuint4 color;\n"
"\tcolor = (uint4)(0);\n" "\tcolor = (uint4)(0);\n"
"%s" "%s"
"\tint x, y;\n" "\tint x, y;\n"
"%s" "%s"
"\tresult[get_global_id(0)] += color.x + color.y + color.z + color.w;\n" "\tresult[get_global_id(0)] += color.x + color.y + color.z + color.w;\n"
"}\n" }; "}\n" };
const char *read_pattern = { const char *read_pattern = {
"\tfor(y=0; y<get_image_height(image%d); y++)\n" "\tfor(y=0; y<get_image_height(image%d); y++)\n"
"\t\tif (y %s get_global_size(0) == get_global_id(0))\n" "\t\tif (y %s get_global_size(0) == get_global_id(0))\n"
"\t\t\tfor (x=0; x<get_image_width(image%d); x++) {\n" "\t\t\tfor (x=0; x<get_image_width(image%d); x++) {\n"
"\t\t\t\tcolor += read_imageui(image%d, sampler, (int2)(x,y));\n" "\t\t\t\tcolor += read_imageui(image%d, sampler, (int2)(x,y));\n"
"\t\t\t}\n" "\t\t\t}\n"
}; };
const char *offset_pattern = const char *offset_pattern =
"\tconst uint4 offset = (uint4)(0,1,2,3);\n"; "\tconst uint4 offset = (uint4)(0,1,2,3);\n";
const char *sampler_pattern = const char *sampler_pattern =
"\tconst sampler_t sampler = CLK_ADDRESS_CLAMP | CLK_FILTER_NEAREST | CLK_NORMALIZED_COORDS_FALSE;\n"; "\tconst sampler_t sampler = CLK_ADDRESS_CLAMP | CLK_FILTER_NEAREST | CLK_NORMALIZED_COORDS_FALSE;\n";
const char *write_pattern = { const char *write_pattern = {
"\tfor(y=0; y<get_image_height(image%d); y++)\n" "\tfor(y=0; y<get_image_height(image%d); y++)\n"
"\t\tif (y %s get_global_size(0) == get_global_id(0))\n" "\t\tif (y %s get_global_size(0) == get_global_id(0))\n"
"\t\t\tfor (x=0; x<get_image_width(image%d); x++) {\n" "\t\t\tfor (x=0; x<get_image_width(image%d); x++) {\n"
"\t\t\t\tcolor = (uint4)x*(uint4)y+offset;\n" "\t\t\t\tcolor = (uint4)x*(uint4)y+offset;\n"
"\t\t\t\twrite_imageui(image%d, (int2)(x,y), color);\n" "\t\t\t\twrite_imageui(image%d, (int2)(x,y), color);\n"
"\t\t\t}\n" "\t\t\t}\n"
"\tbarrier(CLK_LOCAL_MEM_FENCE);\n" "\tbarrier(CLK_LOCAL_MEM_FENCE);\n"
}; };
int check_image(cl_command_queue queue, cl_mem mem) { int check_image(cl_command_queue queue, cl_mem mem) {
int error; int error;
cl_mem_object_type type; cl_mem_object_type type;
size_t width, height; size_t width, height;
size_t origin[3], region[3], x, j; size_t origin[3], region[3], x, j;
cl_uint *data; cl_uint *data;
error = clGetMemObjectInfo(mem, CL_MEM_TYPE, sizeof(type), &type, NULL); error = clGetMemObjectInfo(mem, CL_MEM_TYPE, sizeof(type), &type, NULL);
if (error) { if (error) {
print_error(error, "clGetMemObjectInfo failed for CL_MEM_TYPE."); print_error(error, "clGetMemObjectInfo failed for CL_MEM_TYPE.");
return -1; return -1;
} }
if (type == CL_MEM_OBJECT_BUFFER) { if (type == CL_MEM_OBJECT_BUFFER) {
log_error("Expected image object, not buffer.\n"); log_error("Expected image object, not buffer.\n");
return -1; return -1;
} else if (type == CL_MEM_OBJECT_IMAGE2D) { } else if (type == CL_MEM_OBJECT_IMAGE2D) {
error = clGetImageInfo(mem, CL_IMAGE_WIDTH, sizeof(width), &width, NULL); error = clGetImageInfo(mem, CL_IMAGE_WIDTH, sizeof(width), &width, NULL);
if (error) { if (error) {
print_error(error, "clGetImageInfo failed for CL_IMAGE_WIDTH."); print_error(error, "clGetImageInfo failed for CL_IMAGE_WIDTH.");
return -1; return -1;
} }
error = clGetImageInfo(mem, CL_IMAGE_HEIGHT, sizeof(height), &height, NULL); error = clGetImageInfo(mem, CL_IMAGE_HEIGHT, sizeof(height), &height, NULL);
if (error) { if (error) {
print_error(error, "clGetImageInfo failed for CL_IMAGE_HEIGHT."); print_error(error, "clGetImageInfo failed for CL_IMAGE_HEIGHT.");
return -1; return -1;
} }
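} else { } else {
// width and height are only queried for 2D images; bail out rather than read them uninitialized. // width and height are only queried for 2D images; bail out rather than read them uninitialized.
log_error("Unsupported memory object type; expected a 2D image.\n"); log_error("Unsupported memory object type; expected a 2D image.\n");
return -1; return -1;
} }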
data = (cl_uint*)malloc(width*4*sizeof(cl_uint)); data = (cl_uint*)malloc(width*4*sizeof(cl_uint));
if (data == NULL) { if (data == NULL) {
log_error("Failed to malloc host buffer for writing into image.\n"); log_error("Failed to malloc host buffer for writing into image.\n");
return FAILED_ABORT; return FAILED_ABORT;
} }
origin[0] = 0; origin[0] = 0;
origin[1] = 0; origin[1] = 0;
origin[2] = 0; origin[2] = 0;
region[0] = width; region[0] = width;
region[1] = 1; region[1] = 1;
region[2] = 1; region[2] = 1;
for (origin[1] = 0; origin[1] < height; origin[1]++) { for (origin[1] = 0; origin[1] < height; origin[1]++) {
error = clEnqueueReadImage(queue, mem, CL_TRUE, origin, region, 0, 0, data, 0, NULL, NULL); error = clEnqueueReadImage(queue, mem, CL_TRUE, origin, region, 0, 0, data, 0, NULL, NULL);
if (error) { if (error) {
print_error(error, "clEnqueueReadImage failed"); print_error(error, "clEnqueueReadImage failed");
free(data); free(data);
return error; return error;
} }
for (x=0; x<width; x++) { for (x=0; x<width; x++) {
for (j=0; j<4; j++) { for (j=0; j<4; j++) {
if (data[x*4+j] != (cl_uint)(x*origin[1]+j)) { if (data[x*4+j] != (cl_uint)(x*origin[1]+j)) {
log_error("Pixel %d, %d, component %d, expected %u, got %u.\n", log_error("Pixel %d, %d, component %d, expected %u, got %u.\n",
(int)x, (int)origin[1], (int)j, (cl_uint)(x*origin[1]+j), data[x*4+j]); (int)x, (int)origin[1], (int)j, (cl_uint)(x*origin[1]+j), data[x*4+j]);
free(data); free(data);
return -1; return -1;
} }
} }
} }
} }
free(data); free(data);
return 0; return 0;
} }
#define NUM_OF_WORK_ITEMS (8192*2) #define NUM_OF_WORK_ITEMS (8192*2)
int execute_kernel(cl_context context, cl_command_queue *queue, cl_device_id device_id, int test, cl_mem mems[], int number_of_mems_used, int verify_checksum) { int execute_kernel(cl_context context, cl_command_queue *queue, cl_device_id device_id, int test, cl_mem mems[], int number_of_mems_used, int verify_checksum) {
char *argument_string; char *argument_string;
char *access_string; char *access_string;
char *kernel_string; char *kernel_string;
int i, error, result; int i, error, result;
clKernelWrapper kernel; clKernelWrapper kernel;
clProgramWrapper program; clProgramWrapper program;
clMemWrapper result_mem; clMemWrapper result_mem;
char *ptr; char *ptr;
size_t global_dims[3]; size_t global_dims[3];
cl_ulong per_item; cl_ulong per_item;
cl_uint per_item_uint; cl_uint per_item_uint;
cl_uint returned_results[NUM_OF_WORK_ITEMS], final_result; cl_uint returned_results[NUM_OF_WORK_ITEMS], final_result;
clEventWrapper event; clEventWrapper event;
cl_int event_status; cl_int event_status;
// Allocate memory for the kernel source // Allocate memory for the kernel source
argument_string = (char*)malloc(sizeof(char)*MAX_NUMBER_TO_ALLOCATE*64); argument_string = (char*)malloc(sizeof(char)*MAX_NUMBER_TO_ALLOCATE*64);
access_string = (char*)malloc(sizeof(char)*MAX_NUMBER_TO_ALLOCATE*(strlen(read_pattern)+10)); access_string = (char*)malloc(sizeof(char)*MAX_NUMBER_TO_ALLOCATE*(strlen(read_pattern)+10));
kernel_string = (char*)malloc(sizeof(char)*MAX_NUMBER_TO_ALLOCATE*(strlen(read_pattern)+10+64)+1024); kernel_string = (char*)malloc(sizeof(char)*MAX_NUMBER_TO_ALLOCATE*(strlen(read_pattern)+10+64)+1024);
argument_string[0] = '\0'; argument_string[0] = '\0';
access_string[0] = '\0'; access_string[0] = '\0';
kernel_string[0] = '\0'; kernel_string[0] = '\0';
// Zero the results. // Zero the results.
for (i=0; i<NUM_OF_WORK_ITEMS; i++) for (i=0; i<NUM_OF_WORK_ITEMS; i++)
returned_results[i] = 0; returned_results[i] = 0;
// Build the kernel source // Build the kernel source
if (test == BUFFER || test == BUFFER_NON_BLOCKING) { if (test == BUFFER || test == BUFFER_NON_BLOCKING) {
for(i=0; i<number_of_mems_used; i++) { for(i=0; i<number_of_mems_used; i++) {
sprintf(argument_string + strlen(argument_string), " __global uint *buffer%d, ", i); sprintf(argument_string + strlen(argument_string), " __global uint *buffer%d, ", i);
sprintf(access_string + strlen( access_string), "\t\tif (i<array_sizes[%d]) r += buffer%d[i];\n", i, i); sprintf(access_string + strlen( access_string), "\t\tif (i<array_sizes[%d]) r += buffer%d[i];\n", i, i);
} }
sprintf(kernel_string, buffer_kernel_pattern, argument_string, access_string); sprintf(kernel_string, buffer_kernel_pattern, argument_string, access_string);
} }
else if (test == IMAGE_READ || test == IMAGE_READ_NON_BLOCKING) { else if (test == IMAGE_READ || test == IMAGE_READ_NON_BLOCKING) {
for(i=0; i<number_of_mems_used; i++) { for(i=0; i<number_of_mems_used; i++) {
sprintf(argument_string + strlen(argument_string), " read_only image2d_t image%d, ", i); sprintf(argument_string + strlen(argument_string), " read_only image2d_t image%d, ", i);
sprintf(access_string + strlen(access_string), read_pattern, i, "%", i, i); sprintf(access_string + strlen(access_string), read_pattern, i, "%", i, i);
} }
sprintf(kernel_string, image_kernel_pattern, argument_string, sampler_pattern, access_string); sprintf(kernel_string, image_kernel_pattern, argument_string, sampler_pattern, access_string);
} }
else if (test == IMAGE_WRITE || test == IMAGE_WRITE_NON_BLOCKING) { else if (test == IMAGE_WRITE || test == IMAGE_WRITE_NON_BLOCKING) {
for(i=0; i<number_of_mems_used; i++) { for(i=0; i<number_of_mems_used; i++) {
sprintf(argument_string + strlen(argument_string), " write_only image2d_t image%d, ", i); sprintf(argument_string + strlen(argument_string), " write_only image2d_t image%d, ", i);
sprintf(access_string + strlen( access_string), write_pattern, i, "%", i, i); sprintf(access_string + strlen( access_string), write_pattern, i, "%", i, i);
} }
sprintf(kernel_string, image_kernel_pattern, argument_string, offset_pattern, access_string); sprintf(kernel_string, image_kernel_pattern, argument_string, offset_pattern, access_string);
} }
ptr = kernel_string; ptr = kernel_string;
// Create the kernel // Create the kernel
error = create_single_kernel_helper( context, &program, &kernel, 1, (const char **)&ptr, "sample_test" ); error = create_single_kernel_helper( context, &program, &kernel, 1, (const char **)&ptr, "sample_test" );
free(argument_string); free(argument_string);
free(access_string); free(access_string);
free(kernel_string); free(kernel_string);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
if (result == FAILED_TOO_BIG) if (result == FAILED_TOO_BIG)
log_info("\t\tCreate kernel failed: %s.\n", IGetErrorString(error)); log_info("\t\tCreate kernel failed: %s.\n", IGetErrorString(error));
else else
print_error(error, "Create kernel and program failed"); print_error(error, "Create kernel and program failed");
return result; return result;
} }
// Set the arguments // Set the arguments
for (i=0; i<number_of_mems_used; i++) { for (i=0; i<number_of_mems_used; i++) {
error = clSetKernelArg(kernel, i, sizeof(cl_mem), &mems[i]); error = clSetKernelArg(kernel, i, sizeof(cl_mem), &mems[i]);
test_error(error, "clSetKernelArg failed"); test_error(error, "clSetKernelArg failed");
} }
// Set the result // Set the result
result_mem = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, sizeof(cl_uint)*NUM_OF_WORK_ITEMS, &returned_results, &error); result_mem = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, sizeof(cl_uint)*NUM_OF_WORK_ITEMS, &returned_results, &error);
test_error(error, "clCreateBuffer failed"); test_error(error, "clCreateBuffer failed");
error = clSetKernelArg(kernel, i, sizeof(result_mem), &result_mem); error = clSetKernelArg(kernel, i, sizeof(result_mem), &result_mem);
test_error(error, "clSetKernelArg failed"); test_error(error, "clSetKernelArg failed");
// Thread dimensions for execution // Thread dimensions for execution
global_dims[0] = NUM_OF_WORK_ITEMS; global_dims[1] = 1; global_dims[2] = 1; global_dims[0] = NUM_OF_WORK_ITEMS; global_dims[1] = 1; global_dims[2] = 1;
// We have extra arguments for the buffer kernel because we need to pass in the buffer sizes // We have extra arguments for the buffer kernel because we need to pass in the buffer sizes
cl_uint *sizes = (cl_uint*)malloc(sizeof(cl_uint)*number_of_mems_used); cl_uint *sizes = (cl_uint*)malloc(sizeof(cl_uint)*number_of_mems_used);
cl_uint max_size = 0; cl_uint max_size = 0;
clMemWrapper buffer_sizes; clMemWrapper buffer_sizes;
if (test == BUFFER || test == BUFFER_NON_BLOCKING) { if (test == BUFFER || test == BUFFER_NON_BLOCKING) {
for (i=0; i<number_of_mems_used; i++) { for (i=0; i<number_of_mems_used; i++) {
size_t size; size_t size;
error = clGetMemObjectInfo(mems[i], CL_MEM_SIZE, sizeof(size), &size, NULL); error = clGetMemObjectInfo(mems[i], CL_MEM_SIZE, sizeof(size), &size, NULL);
test_error_abort(error, "clGetMemObjectInfo failed for CL_MEM_SIZE."); test_error_abort(error, "clGetMemObjectInfo failed for CL_MEM_SIZE.");
sizes[i] = (cl_uint)(size/sizeof(cl_uint)); sizes[i] = (cl_uint)(size/sizeof(cl_uint));
if (size/sizeof(cl_uint) > max_size) if (size/sizeof(cl_uint) > max_size)
max_size = (cl_uint)(size/sizeof(cl_uint)); max_size = (cl_uint)(size/sizeof(cl_uint));
} }
buffer_sizes = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR, sizeof(cl_uint)*number_of_mems_used, sizes, &error); buffer_sizes = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR, sizeof(cl_uint)*number_of_mems_used, sizes, &error);
test_error_abort(error, "clCreateBuffer failed"); test_error_abort(error, "clCreateBuffer failed");
error = clSetKernelArg(kernel, number_of_mems_used+1, sizeof(cl_mem), &buffer_sizes); error = clSetKernelArg(kernel, number_of_mems_used+1, sizeof(cl_mem), &buffer_sizes);
test_error(error, "clSetKernelArg failed"); test_error(error, "clSetKernelArg failed");
per_item = (cl_ulong)ceil((double)max_size/global_dims[0]); per_item = (cl_ulong)ceil((double)max_size/global_dims[0]);
if (per_item > CL_UINT_MAX) if (per_item > CL_UINT_MAX)
log_error("Size is too large for a uint parameter to the kernel. Expect invalid results.\n"); log_error("Size is too large for a uint parameter to the kernel. Expect invalid results.\n");
per_item_uint = (cl_uint)per_item; per_item_uint = (cl_uint)per_item;
error = clSetKernelArg(kernel, number_of_mems_used+2, sizeof(per_item_uint), &per_item_uint); error = clSetKernelArg(kernel, number_of_mems_used+2, sizeof(per_item_uint), &per_item_uint);
test_error(error, "clSetKernelArg failed"); test_error(error, "clSetKernelArg failed");
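} }
// sizes is allocated for every test type, so release it outside the buffer-only branch. // sizes is allocated for every test type, so release it outside the buffer-only branch.
free(sizes); free(sizes);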
size_t local_dims[3] = {1,1,1}; size_t local_dims[3] = {1,1,1};
error = get_max_common_work_group_size(context, kernel, global_dims[0], &local_dims[0]); error = get_max_common_work_group_size(context, kernel, global_dims[0], &local_dims[0]);
test_error(error, "get_max_common_work_group_size failed"); test_error(error, "get_max_common_work_group_size failed");
// Execute the kernel // Execute the kernel
error = clEnqueueNDRangeKernel(*queue, kernel, 1, NULL, global_dims, local_dims, 0, NULL, &event); error = clEnqueueNDRangeKernel(*queue, kernel, 1, NULL, global_dims, local_dims, 0, NULL, &event);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
if (result == FAILED_TOO_BIG) if (result == FAILED_TOO_BIG)
log_info("\t\tExecute kernel failed: %s (global dim: %ld, local dim: %ld)\n", IGetErrorString(error), (long)global_dims[0], (long)local_dims[0]); log_info("\t\tExecute kernel failed: %s (global dim: %ld, local dim: %ld)\n", IGetErrorString(error), (long)global_dims[0], (long)local_dims[0]);
else else
print_error(error, "clEnqueueNDRangeKernel failed"); print_error(error, "clEnqueueNDRangeKernel failed");
return result; return result;
} }
// Finish the test // Finish the test
error = clFinish(*queue); error = clFinish(*queue);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
if (result == FAILED_TOO_BIG) if (result == FAILED_TOO_BIG)
log_info("\t\tclFinish failed: %s.\n", IGetErrorString(error)); log_info("\t\tclFinish failed: %s.\n", IGetErrorString(error));
else else
print_error(error, "clFinish failed"); print_error(error, "clFinish failed");
return result; return result;
} }
// Verify that the event from the execution did not have an error // Verify that the event from the execution did not have an error
error = clGetEventInfo(event, CL_EVENT_COMMAND_EXECUTION_STATUS, sizeof(event_status), &event_status, NULL); error = clGetEventInfo(event, CL_EVENT_COMMAND_EXECUTION_STATUS, sizeof(event_status), &event_status, NULL);
test_error_abort(error, "clGetEventInfo for CL_EVENT_COMMAND_EXECUTION_STATUS failed"); test_error_abort(error, "clGetEventInfo for CL_EVENT_COMMAND_EXECUTION_STATUS failed");
if (event_status < 0) { if (event_status < 0) {
result = check_allocation_error(context, device_id, event_status, queue); result = check_allocation_error(context, device_id, event_status, queue);
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
if (result == FAILED_TOO_BIG) if (result == FAILED_TOO_BIG)
log_info("\t\tEvent returned from kernel execution indicates failure: %s.\n", IGetErrorString(event_status)); log_info("\t\tEvent returned from kernel execution indicates failure: %s.\n", IGetErrorString(event_status));
else else
print_error(event_status, "clEnqueueNDRangeKernel failed"); print_error(event_status, "clEnqueueNDRangeKernel failed");
return result; return result;
} }
} }
// If we are not verifying the checksum return here // If we are not verifying the checksum return here
if (!verify_checksum) { if (!verify_checksum) {
log_info("Note: Allocations were not initialized so kernel execution can not verify correct results.\n"); log_info("Note: Allocations were not initialized so kernel execution can not verify correct results.\n");
return SUCCEEDED; return SUCCEEDED;
} }
// Verify the checksum. // Verify the checksum.
// Read back the result // Read back the result
error = clEnqueueReadBuffer(*queue, result_mem, CL_TRUE, 0, sizeof(cl_uint)*NUM_OF_WORK_ITEMS, &returned_results, 0, NULL, NULL); error = clEnqueueReadBuffer(*queue, result_mem, CL_TRUE, 0, sizeof(cl_uint)*NUM_OF_WORK_ITEMS, &returned_results, 0, NULL, NULL);
test_error_abort(error, "clEnqueueReadBuffer failed"); test_error_abort(error, "clEnqueueReadBuffer failed");
final_result = 0; final_result = 0;
if (test == BUFFER || test == IMAGE_READ || test == BUFFER_NON_BLOCKING || test == IMAGE_READ_NON_BLOCKING) { if (test == BUFFER || test == IMAGE_READ || test == BUFFER_NON_BLOCKING || test == IMAGE_READ_NON_BLOCKING) {
// For buffers or read images we are just looking at the sum of what each thread summed up // For buffers or read images we are just looking at the sum of what each thread summed up
for (i=0; i<NUM_OF_WORK_ITEMS; i++) { for (i=0; i<NUM_OF_WORK_ITEMS; i++) {
final_result += returned_results[i]; final_result += returned_results[i];
} }
if (final_result != checksum) { if (final_result != checksum) {
log_error("\t\tChecksum failed to verify. Expected %u got %u.\n", checksum, final_result); log_error("\t\tChecksum failed to verify. Expected %u got %u.\n", checksum, final_result);
return FAILED_ABORT; return FAILED_ABORT;
} }
log_info("\t\tChecksum verified (%u == %u).\n", checksum, final_result); log_info("\t\tChecksum verified (%u == %u).\n", checksum, final_result);
} else { } else {
// For write images we need to verify the values // For write images we need to verify the values
for (i=0; i<number_of_mems_used; i++) { for (i=0; i<number_of_mems_used; i++) {
if (check_image(*queue, mems[i])) { if (check_image(*queue, mems[i])) {
log_error("\t\tImage contents failed to verify for image %d.\n", (int)i); log_error("\t\tImage contents failed to verify for image %d.\n", (int)i);
return FAILED_ABORT; return FAILED_ABORT;
} }
} }
log_info("\t\tImage contents verified.\n"); log_info("\t\tImage contents verified.\n");
} }
// Finish the test // Finish the test
error = clFinish(*queue); error = clFinish(*queue);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
if (result == FAILED_TOO_BIG) if (result == FAILED_TOO_BIG)
log_info("\t\tclFinish failed: %s.\n", IGetErrorString(error)); log_info("\t\tclFinish failed: %s.\n", IGetErrorString(error));
else else
print_error(error, "clFinish failed"); print_error(error, "clFinish failed");
return result; return result;
} }
return SUCCEEDED; return SUCCEEDED;
} }
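
The buffer test above splits the largest buffer into per_item-sized slices, lets each work item sum its slice across all buffers, and adds the partial sums on the host. The following is a simplified host-side model of that arithmetic, not the test's actual code path: elements_per_item, work_item_sum and total_checksum are hypothetical helper names, and the model uses 64-bit indices throughout, which the generated kernel may not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Integer form of ceil(max_size / num_items): how many elements each work
   item must cover so that num_items items span the largest buffer. */
uint32_t elements_per_item(uint32_t max_size, uint32_t num_items)
{
    assert(num_items > 0);
    return (uint32_t)(((uint64_t)max_size + num_items - 1) / num_items);
}

/* Models one work item: sums indices [tid*per_item, (tid+1)*per_item) of
   every buffer, skipping indices past the end of a shorter buffer. */
uint32_t work_item_sum(const uint32_t *const *buffers, const uint32_t *sizes,
                       size_t num_buffers, uint32_t tid, uint32_t per_item)
{
    uint32_t r = 0;
    uint64_t begin = (uint64_t)tid * per_item;
    uint64_t end = begin + per_item;
    for (uint64_t i = begin; i < end; i++)
        for (size_t b = 0; b < num_buffers; b++)
            if (i < sizes[b])
                r += buffers[b][i];
    return r;
}

/* Models the host side: adds up the partial sums of all work items.
   Unsigned arithmetic wraps modulo 2^32, like the cl_uint checksum. */
uint32_t total_checksum(const uint32_t *const *buffers, const uint32_t *sizes,
                        size_t num_buffers, uint32_t num_items)
{
    uint32_t max_size = 0, total = 0, per_item;
    for (size_t b = 0; b < num_buffers; b++)
        if (sizes[b] > max_size)
            max_size = sizes[b];
    per_item = elements_per_item(max_size, num_items);
    for (uint32_t tid = 0; tid < num_items; tid++)
        total += work_item_sum(buffers, sizes, num_buffers, tid, per_item);
    return total;
}
```

Because every index below max_size falls into exactly one slice, the total equals the plain sum of all buffer elements regardless of how many work items are used.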


@@ -1,22 +1,22 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#include "allocation_utils.h" #include "allocation_utils.h"
int execute_kernel(cl_context context, cl_command_queue *queue, cl_device_id device_id, int test, cl_mem mems[], int number_of_mems_used, int verify_checksum); int execute_kernel(cl_context context, cl_command_queue *queue, cl_device_id device_id, int test, cl_mem mems[], int number_of_mems_used, int verify_checksum);
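
For the image-write tests, check_image() shown earlier expects every pixel (x, y) to hold x*y plus the channel index, which is what the write kernel's color = x*y + (0,1,2,3) produces. A minimal sketch of that row check follows, assuming four 32-bit unsigned channels per pixel as the host buffer size in check_image() suggests; expected_component and first_row_mismatch are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value the write kernel is expected to store in one channel of pixel (x, y). */
uint32_t expected_component(size_t x, size_t y, size_t channel)
{
    assert(channel < 4);
    return (uint32_t)(x * y + channel);
}

/* Checks one row of `width` four-channel pixels read back from row y.
   Returns the index of the first mismatching component, or width * 4 when
   the whole row matches. */
size_t first_row_mismatch(const uint32_t *row, size_t width, size_t y)
{
    for (size_t x = 0; x < width; x++)
        for (size_t c = 0; c < 4; c++)
            if (row[x * 4 + c] != expected_component(x, y, c))
                return x * 4 + c;
    return width * 4;
}
```

Returning the mismatch index instead of logging inside the loop keeps cleanup in the caller, so the row buffer can be freed on both the success and the failure path.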


@@ -1,312 +1,312 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "allocation_fill.h" #include "allocation_fill.h"
#define BUFFER_CHUNK_SIZE 8*1024*1024 #define BUFFER_CHUNK_SIZE 8*1024*1024
#define IMAGE_LINES 8 #define IMAGE_LINES 8
#include "../../test_common/harness/compat.h" #include "../../test_common/harness/compat.h"
int fill_buffer_with_data(cl_context context, cl_device_id device_id, cl_command_queue *queue, cl_mem mem, size_t size, MTdata d, cl_bool blocking_write) { int fill_buffer_with_data(cl_context context, cl_device_id device_id, cl_command_queue *queue, cl_mem mem, size_t size, MTdata d, cl_bool blocking_write) {
size_t i, j; size_t i, j;
cl_uint *data; cl_uint *data;
int error, result; int error, result;
cl_uint checksum_delta = 0; cl_uint checksum_delta = 0;
cl_event event; cl_event event;
size_t size_to_use = BUFFER_CHUNK_SIZE; size_t size_to_use = BUFFER_CHUNK_SIZE;
if (size_to_use > size) if (size_to_use > size)
size_to_use = size; size_to_use = size;
data = (cl_uint*)malloc(size_to_use); data = (cl_uint*)malloc(size_to_use);
if (data == NULL) { if (data == NULL) {
log_error("Failed to malloc host buffer for writing into buffer.\n"); log_error("Failed to malloc host buffer for writing into buffer.\n");
return FAILED_ABORT; return FAILED_ABORT;
} }
for (i=0; i<size-size_to_use; i+=size_to_use) { for (i=0; i<size-size_to_use; i+=size_to_use) {
// Put values in the data, and keep a checksum as we go along. // Put values in the data, and keep a checksum as we go along.
for (j=0; j<size_to_use/sizeof(cl_uint); j++) { for (j=0; j<size_to_use/sizeof(cl_uint); j++) {
data[j] = genrand_int32(d); data[j] = genrand_int32(d);
checksum_delta += data[j]; checksum_delta += data[j];
} }
if (blocking_write) { if (blocking_write) {
error = clEnqueueWriteBuffer(*queue, mem, CL_TRUE, i, size_to_use, data, 0, NULL, NULL); error = clEnqueueWriteBuffer(*queue, mem, CL_TRUE, i, size_to_use, data, 0, NULL, NULL);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clEnqueueWriteBuffer failed."); print_error(error, "clEnqueueWriteBuffer failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
free(data); free(data);
clReleaseMemObject(mem); clReleaseMemObject(mem);
return result; return result;
} }
} else { } else {
error = clEnqueueWriteBuffer(*queue, mem, CL_FALSE, i, size_to_use, data, 0, NULL, &event); error = clEnqueueWriteBuffer(*queue, mem, CL_FALSE, i, size_to_use, data, 0, NULL, &event);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clEnqueueWriteBuffer failed."); print_error(error, "clEnqueueWriteBuffer failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
free(data); free(data);
clReleaseMemObject(mem); clReleaseMemObject(mem);
return result; return result;
} }
error = clWaitForEvents(1, &event); error = clWaitForEvents(1, &event);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clWaitForEvents failed."); print_error(error, "clWaitForEvents failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
clReleaseEvent(event); clReleaseEvent(event);
free(data); free(data);
clReleaseMemObject(mem); clReleaseMemObject(mem);
return result; return result;
} }
clReleaseEvent(event); clReleaseEvent(event);
} }
} }
// Deal with any leftover bits // Deal with any leftover bits
if (i < size) { if (i < size) {
// Put values in the data, and keep a checksum as we go along. // Put values in the data, and keep a checksum as we go along.
for (j=0; j<(size-i)/sizeof(cl_uint); j++) { for (j=0; j<(size-i)/sizeof(cl_uint); j++) {
data[j] = (cl_uint)genrand_int32(d); data[j] = (cl_uint)genrand_int32(d);
checksum_delta += data[j]; checksum_delta += data[j];
} }
if (blocking_write) { if (blocking_write) {
error = clEnqueueWriteBuffer(*queue, mem, CL_TRUE, i, size-i, data, 0, NULL, NULL); error = clEnqueueWriteBuffer(*queue, mem, CL_TRUE, i, size-i, data, 0, NULL, NULL);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clEnqueueWriteBuffer failed."); print_error(error, "clEnqueueWriteBuffer failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
clReleaseMemObject(mem); clReleaseMemObject(mem);
free(data); free(data);
return result; return result;
} }
} else { } else {
error = clEnqueueWriteBuffer(*queue, mem, CL_FALSE, i, size-i, data, 0, NULL, &event); error = clEnqueueWriteBuffer(*queue, mem, CL_FALSE, i, size-i, data, 0, NULL, &event);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clEnqueueWriteBuffer failed."); print_error(error, "clEnqueueWriteBuffer failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
clReleaseMemObject(mem); clReleaseMemObject(mem);
free(data); free(data);
return result; return result;
} }
error = clWaitForEvents(1, &event); error = clWaitForEvents(1, &event);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clWaitForEvents failed."); print_error(error, "clWaitForEvents failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
clReleaseEvent(event); clReleaseEvent(event);
free(data); free(data);
clReleaseMemObject(mem); clReleaseMemObject(mem);
return result; return result;
} }
clReleaseEvent(event); clReleaseEvent(event);
} }
} }
free(data); free(data);
// Only update the checksum if this succeeded. // Only update the checksum if this succeeded.
checksum += checksum_delta; checksum += checksum_delta;
return SUCCEEDED; return SUCCEEDED;
} }
int fill_image_with_data(cl_context context, cl_device_id device_id, cl_command_queue *queue, cl_mem mem, size_t width, size_t height, MTdata d, cl_bool blocking_write) { int fill_image_with_data(cl_context context, cl_device_id device_id, cl_command_queue *queue, cl_mem mem, size_t width, size_t height, MTdata d, cl_bool blocking_write) {
size_t origin[3], region[3], j; size_t origin[3], region[3], j;
int error, result; int error, result;
cl_uint *data; cl_uint *data;
cl_uint checksum_delta = 0; cl_uint checksum_delta = 0;
cl_event event; cl_event event;
size_t image_lines_to_use; size_t image_lines_to_use;
image_lines_to_use = IMAGE_LINES; image_lines_to_use = IMAGE_LINES;
if (image_lines_to_use > height) if (image_lines_to_use > height)
image_lines_to_use = height; image_lines_to_use = height;
data = (cl_uint*)malloc(width*4*sizeof(cl_uint)*IMAGE_LINES); data = (cl_uint*)malloc(width*4*sizeof(cl_uint)*IMAGE_LINES);
if (data == NULL) { if (data == NULL) {
log_error("Failed to malloc host buffer for writing into image.\n"); log_error("Failed to malloc host buffer for writing into image.\n");
return FAILED_ABORT; return FAILED_ABORT;
} }
origin[0] = 0; origin[0] = 0;
origin[1] = 0; origin[1] = 0;
origin[2] = 0; origin[2] = 0;
region[0] = width; region[0] = width;
region[1] = IMAGE_LINES; region[1] = IMAGE_LINES;
region[2] = 1; region[2] = 1;
for (origin[1] = 0; origin[1] < height - IMAGE_LINES; origin[1] += IMAGE_LINES) { for (origin[1] = 0; origin[1] < height - IMAGE_LINES; origin[1] += IMAGE_LINES) {
// Put values in the data, and keep a checksum as we go along. // Put values in the data, and keep a checksum as we go along.
for (j=0; j<width*4*IMAGE_LINES; j++) { for (j=0; j<width*4*IMAGE_LINES; j++) {
data[j] = (cl_uint)genrand_int32(d); data[j] = (cl_uint)genrand_int32(d);
checksum_delta += data[j]; checksum_delta += data[j];
} }
if (blocking_write) { if (blocking_write) {
error = clEnqueueWriteImage(*queue, mem, CL_TRUE, origin, region, 0, 0, data, 0, NULL, NULL); error = clEnqueueWriteImage(*queue, mem, CL_TRUE, origin, region, 0, 0, data, 0, NULL, NULL);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clEnqueueWriteImage failed."); print_error(error, "clEnqueueWriteImage failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
clReleaseMemObject(mem); clReleaseMemObject(mem);
free(data); free(data);
return result; return result;
} }
} else { } else {
error = clEnqueueWriteImage(*queue, mem, CL_FALSE, origin, region, 0, 0, data, 0, NULL, &event); error = clEnqueueWriteImage(*queue, mem, CL_FALSE, origin, region, 0, 0, data, 0, NULL, &event);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clEnqueueWriteImage failed."); print_error(error, "clEnqueueWriteImage failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
clReleaseMemObject(mem); clReleaseMemObject(mem);
free(data); free(data);
return result; return result;
} }
error = clWaitForEvents(1, &event); error = clWaitForEvents(1, &event);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clWaitForEvents failed."); print_error(error, "clWaitForEvents failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
clReleaseEvent(event); clReleaseEvent(event);
free(data); free(data);
clReleaseMemObject(mem); clReleaseMemObject(mem);
return result; return result;
} }
clReleaseEvent(event); clReleaseEvent(event);
} }
} }
// Deal with any leftover bits // Deal with any leftover bits
if (origin[1] < height) { if (origin[1] < height) {
// Put values in the data, and keep a checksum as we go along. // Put values in the data, and keep a checksum as we go along.
for (j=0; j<width*4*(height-origin[1]); j++) { for (j=0; j<width*4*(height-origin[1]); j++) {
data[j] = (cl_uint)genrand_int32(d); data[j] = (cl_uint)genrand_int32(d);
checksum_delta += data[j]; checksum_delta += data[j];
} }
region[1] = height-origin[1]; region[1] = height-origin[1];
if(blocking_write) { if(blocking_write) {
error = clEnqueueWriteImage(*queue, mem, CL_TRUE, origin, region, 0, 0, data, 0, NULL, NULL); error = clEnqueueWriteImage(*queue, mem, CL_TRUE, origin, region, 0, 0, data, 0, NULL, NULL);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clEnqueueWriteImage failed."); print_error(error, "clEnqueueWriteImage failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
clReleaseMemObject(mem); clReleaseMemObject(mem);
free(data); free(data);
return result; return result;
} }
} else { } else {
error = clEnqueueWriteImage(*queue, mem, CL_FALSE, origin, region, 0, 0, data, 0, NULL, &event); error = clEnqueueWriteImage(*queue, mem, CL_FALSE, origin, region, 0, 0, data, 0, NULL, &event);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clEnqueueWriteImage failed."); print_error(error, "clEnqueueWriteImage failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
clReleaseMemObject(mem); clReleaseMemObject(mem);
free(data); free(data);
return result; return result;
} }
error = clWaitForEvents(1, &event); error = clWaitForEvents(1, &event);
result = check_allocation_error(context, device_id, error, queue); result = check_allocation_error(context, device_id, error, queue);
if (result == FAILED_ABORT) { if (result == FAILED_ABORT) {
print_error(error, "clWaitForEvents failed."); print_error(error, "clWaitForEvents failed.");
} }
if (result != SUCCEEDED) { if (result != SUCCEEDED) {
clReleaseEvent(event); clReleaseEvent(event);
free(data); free(data);
clReleaseMemObject(mem); clReleaseMemObject(mem);
return result; return result;
} }
clReleaseEvent(event); clReleaseEvent(event);
} }
} }
free(data); free(data);
// Only update the checksum if this succeeded. // Only update the checksum if this succeeded.
checksum += checksum_delta; checksum += checksum_delta;
return SUCCEEDED; return SUCCEEDED;
} }
int fill_mem_with_data(cl_context context, cl_device_id device_id, cl_command_queue *queue, cl_mem mem, MTdata d, cl_bool blocking_write) { int fill_mem_with_data(cl_context context, cl_device_id device_id, cl_command_queue *queue, cl_mem mem, MTdata d, cl_bool blocking_write) {
int error; int error;
cl_mem_object_type type; cl_mem_object_type type;
size_t size, width, height; size_t size, width, height;
error = clGetMemObjectInfo(mem, CL_MEM_TYPE, sizeof(type), &type, NULL); error = clGetMemObjectInfo(mem, CL_MEM_TYPE, sizeof(type), &type, NULL);
test_error_abort(error, "clGetMemObjectInfo failed for CL_MEM_TYPE."); test_error_abort(error, "clGetMemObjectInfo failed for CL_MEM_TYPE.");
if (type == CL_MEM_OBJECT_BUFFER) { if (type == CL_MEM_OBJECT_BUFFER) {
error = clGetMemObjectInfo(mem, CL_MEM_SIZE, sizeof(size), &size, NULL); error = clGetMemObjectInfo(mem, CL_MEM_SIZE, sizeof(size), &size, NULL);
test_error_abort(error, "clGetMemObjectInfo failed for CL_MEM_SIZE."); test_error_abort(error, "clGetMemObjectInfo failed for CL_MEM_SIZE.");
return fill_buffer_with_data(context, device_id, queue, mem, size, d, blocking_write); return fill_buffer_with_data(context, device_id, queue, mem, size, d, blocking_write);
} else if (type == CL_MEM_OBJECT_IMAGE2D) { } else if (type == CL_MEM_OBJECT_IMAGE2D) {
error = clGetImageInfo(mem, CL_IMAGE_WIDTH, sizeof(width), &width, NULL); error = clGetImageInfo(mem, CL_IMAGE_WIDTH, sizeof(width), &width, NULL);
test_error_abort(error, "clGetImageInfo failed for CL_IMAGE_WIDTH."); test_error_abort(error, "clGetImageInfo failed for CL_IMAGE_WIDTH.");
error = clGetImageInfo(mem, CL_IMAGE_HEIGHT, sizeof(height), &height, NULL); error = clGetImageInfo(mem, CL_IMAGE_HEIGHT, sizeof(height), &height, NULL);
test_error_abort(error, "clGetImageInfo failed for CL_IMAGE_HEIGHT."); test_error_abort(error, "clGetImageInfo failed for CL_IMAGE_HEIGHT.");
return fill_image_with_data(context, device_id, queue, mem, width, height, d, blocking_write); return fill_image_with_data(context, device_id, queue, mem, width, height, d, blocking_write);
} }
log_error("Invalid CL_MEM_TYPE: %d\n", type); log_error("Invalid CL_MEM_TYPE: %d\n", type);
return FAILED_ABORT; return FAILED_ABORT;
} }
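
Note on the fill routines above: both walk the target in fixed-size pieces, accumulate a local checksum delta, and add it to the global checksum only after every write has succeeded. fill_image_with_data bounds its main loop with `height - IMAGE_LINES` on size_t values, which wraps around when height is smaller than IMAGE_LINES (image_lines_to_use is computed but does not appear to be used). The sketch below shows the same chunk-and-checksum pattern with a loop bound that cannot underflow. It is a host-only illustration under stated assumptions, not the harness code: `fill_in_chunks`, `write_chunk_fn`, `memcpy_write`, and `next_random` are hypothetical names, the callback stands in for clEnqueueWriteBuffer, and the small LCG stands in for genrand_int32.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for clEnqueueWriteBuffer: returns 0 on success. */
typedef int (*write_chunk_fn)(void *dst, size_t offset, size_t len,
                              const uint32_t *src);

/* Simplest possible "device write": copy the staged bytes into host memory. */
static int memcpy_write(void *dst, size_t offset, size_t len,
                        const uint32_t *src) {
    memcpy((unsigned char *)dst + offset, src, len);
    return 0;
}

/* Small LCG used only for illustration; NOT the harness's genrand_int32. */
static uint32_t next_random(uint32_t *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Fill `size` bytes of dst in pieces of at most `chunk` bytes.
 * Returns 0 on success, -1 on failure. *checksum is updated only on success,
 * mirroring the checksum_delta handling in the diff. */
static int fill_in_chunks(void *dst, size_t size, size_t chunk, uint32_t seed,
                          write_chunk_fn write_chunk, uint32_t *checksum) {
    uint32_t delta = 0;
    uint32_t *staging;
    size_t offset = 0;

    if (chunk > size)
        chunk = size;
    if (chunk == 0)
        return 0; /* nothing to write */

    /* Zero-initialised so a trailing partial word is never read uninitialised. */
    staging = (uint32_t *)calloc(1, chunk);
    if (staging == NULL)
        return -1;

    /* `offset < size` guarantees that `size - offset` cannot wrap around. */
    while (offset < size) {
        size_t len = size - offset;
        size_t words, j;
        if (len > chunk)
            len = chunk;
        words = len / sizeof(uint32_t);
        for (j = 0; j < words; j++) {
            staging[j] = next_random(&seed);
            delta += staging[j];
        }
        if (write_chunk(dst, offset, len, staging) != 0) {
            free(staging);
            return -1;
        }
        offset += len;
    }

    free(staging);
    *checksum += delta;
    return 0;
}
```

The last, shorter piece goes through the same loop body, so no separate "leftover" branch is needed. Applying the same bound to image rows would mean clamping the row count per iteration to `height - origin[1]` instead of subtracting IMAGE_LINES from height up front.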


@@ -1,19 +1,19 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#include "allocation_utils.h" #include "allocation_utils.h"
int fill_mem_with_data(cl_context context, cl_device_id device_id, cl_command_queue *queue, cl_mem mem, MTdata d, cl_bool blocking_write); int fill_mem_with_data(cl_context context, cl_device_id device_id, cl_command_queue *queue, cl_mem mem, MTdata d, cl_bool blocking_write);
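
Note on the allocation code that follows in this diff: find_good_image_size gains a `max_size` out-parameter and clamps both the width and the height to at least 1. The sketch below reproduces that sizing logic without OpenCL so it can be tried on its own. It is an approximation under stated assumptions: `pick_image_size` and `isqrt_size` are hypothetical names, the device limits are passed in as plain arguments instead of being queried with clGetDeviceInfo, an integer square root is swapped in for the `sqrt` call used in the source, and a pixel is taken to be 16 bytes (CL_RGBA with CL_UNSIGNED_INT32).

```c
#include <stddef.h>
#include <stdint.h>

/* 4 channels of 32-bit unsigned data per pixel. */
#define BYTES_PER_PIXEL (4 * sizeof(uint32_t))

/* Integer square root (Newton iteration); replaces sqrt() from the source so
 * that the sketch needs no math library. Assumes n < SIZE_MAX. */
static size_t isqrt_size(size_t n) {
    size_t x = n;
    size_t y = (n + 1) / 2;
    while (y < x) {
        x = y;
        y = (x + n / x) / 2;
    }
    return x;
}

/* Returns 0 on success, 1 if the request exceeds max_width * max_height pixels.
 * In both cases *usable_bytes (when non-NULL) receives the size to plan for:
 * the chosen image size on success, the largest possible image otherwise. */
static int pick_image_size(size_t bytes, size_t max_width, size_t max_height,
                           size_t *width, size_t *height, size_t *usable_bytes) {
    size_t pixels = bytes / BYTES_PER_PIXEL;
    size_t w, h;

    if (pixels > max_width * max_height) {
        if (usable_bytes != NULL)
            *usable_bytes = max_width * max_height * BYTES_PER_PIXEL;
        return 1;
    }

    /* Aim for a close-to-square image, then clamp to the device limits. */
    w = isqrt_size(pixels);
    if (w > max_width)
        w = max_width;
    if (w == 0)
        w = 1;
    h = pixels / w;
    if (h > max_height)
        h = max_height;
    if (h == 0)
        h = 1;

    *width = w;
    *height = h;
    if (usable_bytes != NULL)
        *usable_bytes = w * h * BYTES_PER_PIXEL;
    return 0;
}
```

Reporting the usable size even in the too-big case is what lets the caller lower its per-allocation ceiling to the largest image the device can hold, instead of relying on CL_DEVICE_MAX_MEM_ALLOC_SIZE alone.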


@@ -1,246 +1,286 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "allocation_functions.h" #include "allocation_functions.h"
#include "allocation_fill.h" #include "allocation_fill.h"
static cl_image_format image_format = { CL_RGBA, CL_UNSIGNED_INT32 }; static cl_image_format image_format = { CL_RGBA, CL_UNSIGNED_INT32 };
int allocate_buffer(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate, cl_bool blocking_write) { int allocate_buffer(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate, cl_bool blocking_write) {
int error; int error;
log_info("\t\tAttempting to allocate a %gMB array and fill with %s writes.\n", (size_to_allocate/(1024.0*1024.0)), (blocking_write ? "blocking" : "non-blocking")); log_info("\t\tAttempting to allocate a %gMB array and fill with %s writes.\n", (size_to_allocate/(1024.0*1024.0)), (blocking_write ? "blocking" : "non-blocking"));
*mem = clCreateBuffer(context, CL_MEM_READ_WRITE, size_to_allocate, NULL, &error); *mem = clCreateBuffer(context, CL_MEM_READ_WRITE, size_to_allocate, NULL, &error);
return check_allocation_error(context, device_id, error, queue); return check_allocation_error(context, device_id, error, queue);
} }
int find_good_image_size(cl_device_id device_id, size_t size_to_allocate, size_t *width, size_t *height) { int find_good_image_size(cl_device_id device_id, size_t size_to_allocate, size_t *width, size_t *height, size_t* max_size) {
size_t max_width, max_height, num_pixels, found_width, found_height; size_t max_width, max_height, num_pixels, found_width, found_height;
int error; int error;
if (checkForImageSupport(device_id)) { if (checkForImageSupport(device_id)) {
log_info("Can not allocate an image on this device because it does not support images."); log_info("Can not allocate an image on this device because it does not support images.");
return FAILED_ABORT; return FAILED_ABORT;
} }
if (size_to_allocate == 0) { if (size_to_allocate == 0) {
log_error("Trying to allocate a zero sized image.\n"); log_error("Trying to allocate a zero sized image.\n");
return FAILED_ABORT; return FAILED_ABORT;
} }
error = clGetDeviceInfo( device_id, CL_DEVICE_IMAGE2D_MAX_WIDTH, sizeof( max_width ), &max_width, NULL ); error = clGetDeviceInfo( device_id, CL_DEVICE_IMAGE2D_MAX_WIDTH, sizeof( max_width ), &max_width, NULL );
test_error_abort(error, "clGetDeviceInfo failed."); test_error_abort(error, "clGetDeviceInfo failed.");
error = clGetDeviceInfo( device_id, CL_DEVICE_IMAGE2D_MAX_HEIGHT, sizeof( max_height ), &max_height, NULL ); error = clGetDeviceInfo( device_id, CL_DEVICE_IMAGE2D_MAX_HEIGHT, sizeof( max_height ), &max_height, NULL );
test_error_abort(error, "clGetDeviceInfo failed."); test_error_abort(error, "clGetDeviceInfo failed.");
num_pixels = size_to_allocate / (sizeof(cl_uint)*4); num_pixels = size_to_allocate / (sizeof(cl_uint)*4);
if (num_pixels > (max_width*max_height)) if (num_pixels > (max_width*max_height)) {
return FAILED_TOO_BIG; if(NULL != max_size) {
*max_size = max_width * max_height * sizeof(cl_uint) * 4;
// We want a close-to-square aspect ratio. }
// Note that this implicitly assumes that max width >= max height return FAILED_TOO_BIG;
found_width = (int)sqrt( (double) num_pixels ); }
if (found_width == 0)
found_width = 1; // We want a close-to-square aspect ratio.
if( found_width > max_width ) { // Note that this implicitly assumes that max width >= max height
found_width = max_width; found_width = (int)sqrt( (double) num_pixels );
} if( found_width > max_width ) {
found_height = (size_t)num_pixels/found_width; found_width = max_width;
if (found_height > max_height) { }
found_height = max_height; if (found_width == 0)
} found_width = 1;
*width = found_width; found_height = (size_t)num_pixels/found_width;
*height = found_height; if (found_height > max_height) {
found_height = max_height;
return SUCCEEDED; }
} if (found_height == 0)
found_height = 1;
int allocate_image2d_read(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate, cl_bool blocking_write) { *width = found_width;
size_t width, height; *height = found_height;
int error;
if(NULL != max_size) {
error = find_good_image_size(device_id, size_to_allocate, &width, &height); *max_size = found_width * found_height * sizeof(cl_uint) * 4;
if (error != SUCCEEDED) }
return error;
return SUCCEEDED;
log_info("\t\tAttempting to allocate a %gMB read-only image (%d x %d) and fill with %s writes.\n", }
(size_to_allocate/(1024.0*1024.0)), (int)width, (int)height, (blocking_write ? "blocking" : "non-blocking"));
*mem = create_image_2d(context, CL_MEM_READ_ONLY, &image_format, width, height, 0, NULL, &error); int allocate_image2d_read(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate, cl_bool blocking_write) {
size_t width, height;
return check_allocation_error(context, device_id, error, queue); int error;
}
error = find_good_image_size(device_id, size_to_allocate, &width, &height, NULL);
if (error != SUCCEEDED)
int allocate_image2d_write(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate, cl_bool blocking_write) { return error;
size_t width, height;
int error; log_info("\t\tAttempting to allocate a %gMB read-only image (%d x %d) and fill with %s writes.\n",
(size_to_allocate/(1024.0*1024.0)), (int)width, (int)height, (blocking_write ? "blocking" : "non-blocking"));
error = find_good_image_size(device_id, size_to_allocate, &width, &height); *mem = create_image_2d(context, CL_MEM_READ_ONLY, &image_format, width, height, 0, NULL, &error);
if (error != SUCCEEDED)
return error; return check_allocation_error(context, device_id, error, queue);
}
log_info("\t\tAttempting to allocate a %gMB write-only image (%d x %d) and fill with %s writes.\n",
(size_to_allocate/(1024.0*1024.0)), (int)width, (int)height, (blocking_write ? "blocking" : "non-blocking"));
*mem = create_image_2d(context, CL_MEM_WRITE_ONLY, &image_format, width, height, 0, NULL, &error); int allocate_image2d_write(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate, cl_bool blocking_write) {
size_t width, height;
return check_allocation_error(context, device_id, error, queue); int error;
}
error = find_good_image_size(device_id, size_to_allocate, &width, &height, NULL);
int do_allocation(cl_context context, cl_command_queue *queue, cl_device_id device_id, size_t size_to_allocate, int type, cl_mem *mem) { if (error != SUCCEEDED)
if (type == BUFFER) return allocate_buffer(context, queue, device_id, mem, size_to_allocate, true); return error;
if (type == IMAGE_READ) return allocate_image2d_read(context, queue, device_id, mem, size_to_allocate, true);
if (type == IMAGE_WRITE) return allocate_image2d_write(context, queue, device_id, mem, size_to_allocate, true); log_info("\t\tAttempting to allocate a %gMB write-only image (%d x %d) and fill with %s writes.\n",
if (type == BUFFER_NON_BLOCKING) return allocate_buffer(context, queue, device_id, mem, size_to_allocate, false); (size_to_allocate/(1024.0*1024.0)), (int)width, (int)height, (blocking_write ? "blocking" : "non-blocking"));
if (type == IMAGE_READ_NON_BLOCKING) return allocate_image2d_read(context, queue, device_id, mem, size_to_allocate, false); *mem = create_image_2d(context, CL_MEM_WRITE_ONLY, &image_format, width, height, 0, NULL, &error);
if (type == IMAGE_WRITE_NON_BLOCKING) return allocate_image2d_write(context, queue, device_id, mem, size_to_allocate, false);
log_error("Invalid allocation type: %d\n", type); return check_allocation_error(context, device_id, error, queue);
return FAILED_ABORT; }
}
int do_allocation(cl_context context, cl_command_queue *queue, cl_device_id device_id, size_t size_to_allocate, int type, cl_mem *mem) {
if (type == BUFFER) return allocate_buffer(context, queue, device_id, mem, size_to_allocate, true);
int allocate_size(cl_context context, cl_command_queue *queue, cl_device_id device_id, int multiple_allocations, size_t size_to_allocate, if (type == IMAGE_READ) return allocate_image2d_read(context, queue, device_id, mem, size_to_allocate, true);
int type, cl_mem mems[], int *number_of_mems, size_t *final_size, int force_fill, MTdata d) { if (type == IMAGE_WRITE) return allocate_image2d_write(context, queue, device_id, mem, size_to_allocate, true);
if (type == BUFFER_NON_BLOCKING) return allocate_buffer(context, queue, device_id, mem, size_to_allocate, false);
cl_ulong max_individual_allocation_size, global_mem_size; if (type == IMAGE_READ_NON_BLOCKING) return allocate_image2d_read(context, queue, device_id, mem, size_to_allocate, false);
int error, result; if (type == IMAGE_WRITE_NON_BLOCKING) return allocate_image2d_write(context, queue, device_id, mem, size_to_allocate, false);
size_t amount_allocated; log_error("Invalid allocation type: %d\n", type);
size_t reduction_amount; return FAILED_ABORT;
size_t min_allocation_allowed; }
int current_allocation;
size_t allocation_this_time, actual_allocation;
int allocate_size(cl_context context, cl_command_queue *queue, cl_device_id device_id, int multiple_allocations, size_t size_to_allocate,
// Set the number of mems used to 0 so if we fail to create even a single one we don't end up returning a garbage value int type, cl_mem mems[], int *number_of_mems, size_t *final_size, int force_fill, MTdata d) {
*number_of_mems = 0;
cl_ulong max_individual_allocation_size, global_mem_size;
error = clGetDeviceInfo(device_id, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(max_individual_allocation_size), &max_individual_allocation_size, NULL); int error, result;
test_error_abort( error, "clGetDeviceInfo failed for CL_DEVICE_MAX_MEM_ALLOC_SIZE"); size_t amount_allocated;
error = clGetDeviceInfo(device_id, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(global_mem_size), &global_mem_size, NULL); size_t reduction_amount;
test_error_abort( error, "clGetDeviceInfo failed for CL_DEVICE_GLOBAL_MEM_SIZE"); int current_allocation;
size_t allocation_this_time, actual_allocation;
// log_info("Device reports CL_DEVICE_MAX_MEM_ALLOC_SIZE=%llu bytes (%gMB), CL_DEVICE_GLOBAL_MEM_SIZE=%llu bytes (%gMB).\n",
// max_individual_allocation_size, toMB(max_individual_allocation_size), // Set the number of mems used to 0 so if we fail to create even a single one we don't end up returning a garbage value
// global_mem_size, toMB(global_mem_size)); *number_of_mems = 0;
if (size_to_allocate > global_mem_size) { error = clGetDeviceInfo(device_id, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(max_individual_allocation_size), &max_individual_allocation_size, NULL);
log_error("Can not allocate more than the global memory size.\n"); test_error_abort(error, "clGetDeviceInfo failed for CL_DEVICE_MAX_MEM_ALLOC_SIZE");
return FAILED_ABORT; error = clGetDeviceInfo(device_id, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(global_mem_size), &global_mem_size, NULL);
} test_error_abort(error, "clGetDeviceInfo failed for CL_DEVICE_GLOBAL_MEM_SIZE");
amount_allocated = 0; if (global_mem_size > (cl_ulong)SIZE_MAX) {
current_allocation = 0; global_mem_size = (cl_ulong)SIZE_MAX;
reduction_amount = (size_t)max_individual_allocation_size/16; }
min_allocation_allowed = (size_t)max_individual_allocation_size/4;
if (min_allocation_allowed > size_to_allocate) // log_info("Device reports CL_DEVICE_MAX_MEM_ALLOC_SIZE=%llu bytes (%gMB), CL_DEVICE_GLOBAL_MEM_SIZE=%llu bytes (%gMB).\n",
min_allocation_allowed = size_to_allocate/4; // max_individual_allocation_size, toMB(max_individual_allocation_size),
// global_mem_size, toMB(global_mem_size));
if (type == BUFFER || type == BUFFER_NON_BLOCKING) log_info("\tAttempting to allocate a buffer of size %gMB.\n", toMB(size_to_allocate));
else if (type == IMAGE_READ || type == IMAGE_READ_NON_BLOCKING) log_info("\tAttempting to allocate a read-only image of size %gMB.\n", toMB(size_to_allocate)); if (size_to_allocate > global_mem_size) {
else if (type == IMAGE_WRITE || type == IMAGE_WRITE_NON_BLOCKING) log_info("\tAttempting to allocate a write-only image of size %gMB.\n", toMB(size_to_allocate)); log_error("Can not allocate more than the global memory size.\n");
return FAILED_ABORT;
// log_info("\t\t(Reduction size is %gMB per iteration, minimum allowable individual allocation size is %gMB.)\n", }
// toMB(reduction_amount), toMB(min_allocation_allowed));
// if (force_fill && type != IMAGE_WRITE && type != IMAGE_WRITE_NON_BLOCKING) log_info("\t\t(Allocations will be filled with random data for checksum calculation.)\n"); amount_allocated = 0;
current_allocation = 0;
// If we are only doing a single allocation, only allow 1
int max_to_allocate = multiple_allocations ? MAX_NUMBER_TO_ALLOCATE : 1; // If allocating for images, reduce the maximum allocation size to the maximum image size.
// If we don't do this, then the value of CL_DEVICE_MAX_MEM_ALLOC_SIZE / 4 can be higher
// Make sure that the maximum number of images allocated is constrained by the // than the maximum image size on systems with 16GB of RAM or more. In this case, we
// maximum that may be passed to a kernel // succeed in allocating an image but its size is less than CL_DEVICE_MAX_MEM_ALLOC_SIZE / 4
if (type != BUFFER && type != BUFFER_NON_BLOCKING) { // (min_allocation_allowed) and thus we fail the allocation below.
cl_device_info param_name = (type == IMAGE_READ || type == IMAGE_READ_NON_BLOCKING) ? if (type == IMAGE_READ || type == IMAGE_READ_NON_BLOCKING || type == IMAGE_WRITE || type == IMAGE_WRITE_NON_BLOCKING) {
CL_DEVICE_MAX_READ_IMAGE_ARGS : CL_DEVICE_MAX_WRITE_IMAGE_ARGS; size_t width;
size_t height;
cl_uint max_image_args; size_t max_size;
error = clGetDeviceInfo(device_id, param_name, sizeof(max_image_args), &max_image_args, NULL); error = find_good_image_size(device_id, size_to_allocate, &width, &height, &max_size);
test_error( error, "clGetDeviceInfo failed for CL_DEVICE_MAX IMAGE_ARGS"); if (!(error == SUCCEEDED || error == FAILED_TOO_BIG))
return error;
if ((int)max_image_args < max_to_allocate) { if (max_size < max_individual_allocation_size)
log_info("\t\tMaximum number of images per kernel limited to %d\n",(int)max_image_args); max_individual_allocation_size = max_size;
max_to_allocate = max_image_args; }
}
} reduction_amount = (size_t)max_individual_allocation_size / 16;
if (type == BUFFER || type == BUFFER_NON_BLOCKING) log_info("\tAttempting to allocate a buffer of size %gMB.\n", toMB(size_to_allocate));
// Try to allocate the requested amount. else if (type == IMAGE_READ || type == IMAGE_READ_NON_BLOCKING) log_info("\tAttempting to allocate a read-only image of size %gMB.\n", toMB(size_to_allocate));
while (amount_allocated != size_to_allocate && current_allocation < max_to_allocate) { else if (type == IMAGE_WRITE || type == IMAGE_WRITE_NON_BLOCKING) log_info("\tAttempting to allocate a write-only image of size %gMB.\n", toMB(size_to_allocate));
allocation_this_time = size_to_allocate - amount_allocated;
if (allocation_this_time > max_individual_allocation_size) // log_info("\t\t(Reduction size is %gMB per iteration, minimum allowable individual allocation size is %gMB.)\n",
allocation_this_time = (size_t)max_individual_allocation_size; // toMB(reduction_amount), toMB(min_allocation_allowed));
// if (force_fill && type != IMAGE_WRITE && type != IMAGE_WRITE_NON_BLOCKING) log_info("\t\t(Allocations will be filled with random data for checksum calculation.)\n");
// Try to allocate a chunk of memory
result = FAILED_TOO_BIG; // If we are only doing a single allocation, only allow 1
//log_info("\t\tTrying sub-allocation %d at size %gMB.\n", current_allocation, toMB(allocation_this_time)); int max_to_allocate = multiple_allocations ? MAX_NUMBER_TO_ALLOCATE : 1;
while (result == FAILED_TOO_BIG && allocation_this_time != 0) {
result = do_allocation(context, queue, device_id, allocation_this_time, type, &mems[current_allocation]); // Make sure that the maximum number of images allocated is constrained by the
if (result == SUCCEEDED) { // maximum that may be passed to a kernel
// Allocation succeeded, another memory object was added to the array if (type != BUFFER && type != BUFFER_NON_BLOCKING) {
*number_of_mems = (current_allocation+1); cl_device_info param_name = (type == IMAGE_READ || type == IMAGE_READ_NON_BLOCKING) ?
// Verify the size is correct to within 1MB. CL_DEVICE_MAX_READ_IMAGE_ARGS : CL_DEVICE_MAX_WRITE_IMAGE_ARGS;
actual_allocation = get_actual_allocation_size(mems[current_allocation]);
if (fabs((double)(allocation_this_time - actual_allocation)) > 1024.0*1024.0) { cl_uint max_image_args;
log_error("Allocation not of expected size. Expected %gMB, got %gMB.\n", toMB(allocation_this_time), toMB( actual_allocation)); error = clGetDeviceInfo(device_id, param_name, sizeof(max_image_args), &max_image_args, NULL);
return FAILED_ABORT; test_error(error, "clGetDeviceInfo failed for CL_DEVICE_MAX IMAGE_ARGS");
}
// If we are filling the allocation for verification do so if ((int)max_image_args < max_to_allocate) {
if (force_fill) { log_info("\t\tMaximum number of images per kernel limited to %d\n", (int)max_image_args);
//log_info("\t\t\tWriting random values to object and calculating checksum.\n"); max_to_allocate = max_image_args;
cl_bool blocking_write = true; }
if (type == BUFFER_NON_BLOCKING || type == IMAGE_READ_NON_BLOCKING || type == IMAGE_WRITE_NON_BLOCKING) { }
blocking_write = false;
}
result = fill_mem_with_data(context, device_id, queue, mems[current_allocation], d, blocking_write); // Try to allocate the requested amount.
} while (amount_allocated != size_to_allocate && current_allocation < max_to_allocate) {
}
if (result == FAILED_TOO_BIG) { // Determine how much more is needed
//log_info("\t\t\tAllocation %d failed at size %gMB. Trying smaller.\n", current_allocation, toMB(allocation_this_time)); allocation_this_time = size_to_allocate - amount_allocated;
if (allocation_this_time > reduction_amount)
allocation_this_time -= reduction_amount; // Bound by the individual allocation size
else { if (allocation_this_time > max_individual_allocation_size)
allocation_this_time = 0; allocation_this_time = (size_t)max_individual_allocation_size;
}
// Allocate the largest object possible
} result = FAILED_TOO_BIG;
} //log_info("\t\tTrying sub-allocation %d at size %gMB.\n", current_allocation, toMB(allocation_this_time));
while (result == FAILED_TOO_BIG && allocation_this_time != 0) {
if (result == FAILED_ABORT) {
log_error("\t\tAllocation failed.\n"); // Create the object
return FAILED_ABORT; result = do_allocation(context, queue, device_id, allocation_this_time, type, &mems[current_allocation]);
} if (result == SUCCEEDED) {
// Allocation succeeded, another memory object was added to the array
if (allocation_this_time < min_allocation_allowed && allocation_this_time < (size_to_allocate-amount_allocated)) { *number_of_mems = (current_allocation + 1);
log_info("\t\tFailed to allocate an individual allocation of more than %gMB.\n", toMB(min_allocation_allowed));
return FAILED_TOO_BIG; // Verify the size is correct to within 1MB.
} actual_allocation = get_actual_allocation_size(mems[current_allocation]);
if (fabs((double)allocation_this_time - (double)actual_allocation) > 1024.0*1024.0) {
// Otherwise we succeeded log_error("Allocation not of expected size. Expected %gMB, got %gMB.\n", toMB(allocation_this_time), toMB(actual_allocation));
if (result != SUCCEEDED) { return FAILED_ABORT;
log_error("Test logic error."); }
test_finish();
exit(-1); // If we are filling the allocation for verification do so
} if (force_fill) {
amount_allocated += allocation_this_time; //log_info("\t\t\tWriting random values to object and calculating checksum.\n");
cl_bool blocking_write = true;
*final_size = amount_allocated; if (type == BUFFER_NON_BLOCKING || type == IMAGE_READ_NON_BLOCKING || type == IMAGE_WRITE_NON_BLOCKING) {
blocking_write = false;
current_allocation++; }
} result = fill_mem_with_data(context, device_id, queue, mems[current_allocation], d, blocking_write);
}
log_info("\t\tSucceeded in allocating %gMB using %d memory objects.\n", toMB(amount_allocated), current_allocation); }
return SUCCEEDED;
} // If creation failed, try to create a smaller object
if (result == FAILED_TOO_BIG) {
//log_info("\t\t\tAllocation %d failed at size %gMB. Trying smaller.\n", current_allocation, toMB(allocation_this_time));
if (allocation_this_time > reduction_amount)
allocation_this_time -= reduction_amount;
else if (reduction_amount > 1) {
reduction_amount /= 2;
}
else {
allocation_this_time = 0;
}
}
}
if (result == FAILED_ABORT) {
log_error("\t\tAllocation failed.\n");
return FAILED_ABORT;
}
if (!allocation_this_time) {
log_info("\t\tFailed to allocate %gMB across several objects.\n", toMB(size_to_allocate));
return FAILED_TOO_BIG;
}
// Otherwise we succeeded
if (result != SUCCEEDED) {
log_error("Test logic error.");
test_finish();
exit(-1);
}
amount_allocated += allocation_this_time;
*final_size = amount_allocated;
current_allocation++;
}
log_info("\t\tSucceeded in allocating %gMB using %d memory objects.\n", toMB(amount_allocated), current_allocation);
return SUCCEEDED;
}


@@ -1,24 +1,24 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#include "allocation_utils.h" #include "allocation_utils.h"
int do_allocation(cl_context context, cl_command_queue *queue, cl_device_id device_id, size_t size_to_allocate, int type, cl_mem *mem); int do_allocation(cl_context context, cl_command_queue *queue, cl_device_id device_id, size_t size_to_allocate, int type, cl_mem *mem);
int allocate_buffer(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate); int allocate_buffer(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate);
int allocate_image2d_read(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate); int allocate_image2d_read(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate);
int allocate_image2d_write(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate); int allocate_image2d_write(cl_context context, cl_command_queue *queue, cl_device_id device_id, cl_mem *mem, size_t size_to_allocate);
int allocate_size(cl_context context, cl_command_queue *queue, cl_device_id device_id, int multiple_allocations, size_t size_to_allocate, int allocate_size(cl_context context, cl_command_queue *queue, cl_device_id device_id, int multiple_allocations, size_t size_to_allocate,
int type, cl_mem mems[], int *number_of_mems, size_t *final_size, int force_fill, MTdata d); int type, cl_mem mems[], int *number_of_mems, size_t *final_size, int force_fill, MTdata d);


@@ -1,87 +1,87 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "allocation_utils.h" #include "allocation_utils.h"
cl_command_queue reset_queue(cl_context context, cl_device_id device_id, cl_command_queue *queue, int *error) cl_command_queue reset_queue(cl_context context, cl_device_id device_id, cl_command_queue *queue, int *error)
{ {
log_info("Invalid command queue. Releasing and recreating the command queue.\n"); log_info("Invalid command queue. Releasing and recreating the command queue.\n");
clReleaseCommandQueue(*queue); clReleaseCommandQueue(*queue);
*queue = clCreateCommandQueue(context, device_id, 0, error); *queue = clCreateCommandQueue(context, device_id, 0, error);
return *queue; return *queue;
} }
int check_allocation_error(cl_context context, cl_device_id device_id, int error, cl_command_queue *queue) { int check_allocation_error(cl_context context, cl_device_id device_id, int error, cl_command_queue *queue) {
//log_info("check_allocation_error context=%p device_id=%p error=%d *queue=%p\n", context, device_id, error, *queue); //log_info("check_allocation_error context=%p device_id=%p error=%d *queue=%p\n", context, device_id, error, *queue);
if ((error == CL_MEM_OBJECT_ALLOCATION_FAILURE ) || (error == CL_OUT_OF_RESOURCES ) || (error == CL_OUT_OF_HOST_MEMORY) || (error == CL_INVALID_IMAGE_SIZE)) { if ((error == CL_MEM_OBJECT_ALLOCATION_FAILURE ) || (error == CL_OUT_OF_RESOURCES ) || (error == CL_OUT_OF_HOST_MEMORY) || (error == CL_INVALID_IMAGE_SIZE)) {
return FAILED_TOO_BIG; return FAILED_TOO_BIG;
} else if (error == CL_INVALID_COMMAND_QUEUE) { } else if (error == CL_INVALID_COMMAND_QUEUE) {
*queue = reset_queue(context, device_id, queue, &error); *queue = reset_queue(context, device_id, queue, &error);
if (CL_SUCCESS != error) if (CL_SUCCESS != error)
{ {
log_error("Failed to reset command queue after corrupted queue: %s\n", IGetErrorString(error)); log_error("Failed to reset command queue after corrupted queue: %s\n", IGetErrorString(error));
return FAILED_ABORT; return FAILED_ABORT;
} }
// Try again with smaller resources. // Try again with smaller resources.
return FAILED_TOO_BIG; return FAILED_TOO_BIG;
} else if (error != CL_SUCCESS) { } else if (error != CL_SUCCESS) {
log_error("Allocation failed with %s.\n", IGetErrorString(error)); log_error("Allocation failed with %s.\n", IGetErrorString(error));
return FAILED_ABORT; return FAILED_ABORT;
} }
return SUCCEEDED; return SUCCEEDED;
} }
double toMB(cl_ulong size_in) { double toMB(cl_ulong size_in) {
return (double)size_in/(1024.0*1024.0); return (double)size_in/(1024.0*1024.0);
} }
size_t get_actual_allocation_size(cl_mem mem) { size_t get_actual_allocation_size(cl_mem mem) {
int error; int error;
cl_mem_object_type type; cl_mem_object_type type;
size_t size, width, height; size_t size, width, height;
error = clGetMemObjectInfo(mem, CL_MEM_TYPE, sizeof(type), &type, NULL); error = clGetMemObjectInfo(mem, CL_MEM_TYPE, sizeof(type), &type, NULL);
if (error) { if (error) {
print_error(error, "clGetMemObjectInfo failed for CL_MEM_TYPE."); print_error(error, "clGetMemObjectInfo failed for CL_MEM_TYPE.");
return 0; return 0;
} }
if (type == CL_MEM_OBJECT_BUFFER) { if (type == CL_MEM_OBJECT_BUFFER) {
error = clGetMemObjectInfo(mem, CL_MEM_SIZE, sizeof(size), &size, NULL); error = clGetMemObjectInfo(mem, CL_MEM_SIZE, sizeof(size), &size, NULL);
if (error) { if (error) {
print_error(error, "clGetMemObjectInfo failed for CL_MEM_SIZE."); print_error(error, "clGetMemObjectInfo failed for CL_MEM_SIZE.");
return 0; return 0;
} }
return size; return size;
} else if (type == CL_MEM_OBJECT_IMAGE2D) { } else if (type == CL_MEM_OBJECT_IMAGE2D) {
error = clGetImageInfo(mem, CL_IMAGE_WIDTH, sizeof(width), &width, NULL); error = clGetImageInfo(mem, CL_IMAGE_WIDTH, sizeof(width), &width, NULL);
if (error) { if (error) {
print_error(error, "clGetImageInfo failed for CL_IMAGE_WIDTH."); print_error(error, "clGetImageInfo failed for CL_IMAGE_WIDTH.");
return 0; return 0;
} }
error = clGetImageInfo(mem, CL_IMAGE_HEIGHT, sizeof(height), &height, NULL); error = clGetImageInfo(mem, CL_IMAGE_HEIGHT, sizeof(height), &height, NULL);
if (error) { if (error) {
print_error(error, "clGetImageInfo failed for CL_IMAGE_HEIGHT."); print_error(error, "clGetImageInfo failed for CL_IMAGE_HEIGHT.");
return 0; return 0;
} }
return width*height*4*sizeof(cl_uint); return width*height*4*sizeof(cl_uint);
} }
log_error("Invalid CL_MEM_TYPE: %d\n", type); log_error("Invalid CL_MEM_TYPE: %d\n", type);
return 0; return 0;
} }


@@ -1,24 +1,24 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
extern cl_uint checksum; extern cl_uint checksum;
int check_allocation_error(cl_context context, cl_device_id device_id, int error, cl_command_queue *queue); int check_allocation_error(cl_context context, cl_device_id device_id, int error, cl_command_queue *queue);
double toMB(cl_ulong size_in); double toMB(cl_ulong size_in);
size_t get_actual_allocation_size(cl_mem mem); size_t get_actual_allocation_size(cl_mem mem);


@@ -1,354 +1,354 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#include "allocation_functions.h" #include "allocation_functions.h"
#include "allocation_fill.h" #include "allocation_fill.h"
#include "allocation_execute.h" #include "allocation_execute.h"
#include "../../test_common/harness/testHarness.h" #include "../../test_common/harness/testHarness.h"
#include <time.h> #include <time.h>
cl_device_id g_device_id; cl_device_id g_device_id;
cl_device_type g_device_type = CL_DEVICE_TYPE_DEFAULT; cl_device_type g_device_type = CL_DEVICE_TYPE_DEFAULT;
clContextWrapper g_context; clContextWrapper g_context;
clCommandQueueWrapper g_queue; clCommandQueueWrapper g_queue;
int g_repetition_count = 1; int g_repetition_count = 1;
int g_tests_to_run = 0; int g_tests_to_run = 0;
int g_reduction_percentage = 100; int g_reduction_percentage = 100;
int g_write_allocations = 1; int g_write_allocations = 1;
int g_multiple_allocations = 0; int g_multiple_allocations = 0;
int g_execute_kernel = 1; int g_execute_kernel = 1;
cl_uint checksum; cl_uint checksum;
void printUsage( const char *execName ) void printUsage( const char *execName )
{ {
const char *p = strrchr( execName, '/' ); const char *p = strrchr( execName, '/' );
if( p != NULL ) if( p != NULL )
execName = p + 1; execName = p + 1;
log_info( "Usage: %s [single|multiple] [numReps] [reduction%%] allocType\n", execName ); log_info( "Usage: %s [single|multiple] [numReps] [reduction%%] allocType\n", execName );
log_info( "Where:\n" ); log_info( "Where:\n" );
log_info( "\tsingle - Tests using a single allocation as large as possible\n" ); log_info( "\tsingle - Tests using a single allocation as large as possible\n" );
log_info( "\tmultiple - Tests using as many allocations as possible\n" ); log_info( "\tmultiple - Tests using as many allocations as possible\n" );
log_info( "\n" ); log_info( "\n" );
log_info( "\tnumReps - Optional integer specifying the number of repetitions to run and average the result (defaults to 1)\n" ); log_info( "\tnumReps - Optional integer specifying the number of repetitions to run and average the result (defaults to 1)\n" );
log_info( "\treduction%% - Optional integer, followed by a %% sign, that acts as a multiplier for the target amount of memory.\n" ); log_info( "\treduction%% - Optional integer, followed by a %% sign, that acts as a multiplier for the target amount of memory.\n" );
log_info( "\t Example: target amount of 512MB and a reduction of 75%% will result in a target of 384MB.\n" ); log_info( "\t Example: target amount of 512MB and a reduction of 75%% will result in a target of 384MB.\n" );
log_info( "\n" ); log_info( "\n" );
log_info( "\tallocType - Allocation type to test with. Can be one of the following:\n" ); log_info( "\tallocType - Allocation type to test with. Can be one of the following:\n" );
log_info( "\t\tbuffer\n"); log_info( "\t\tbuffer\n");
log_info( "\t\timage2d_read\n"); log_info( "\t\timage2d_read\n");
log_info( "\t\timage2d_write\n"); log_info( "\t\timage2d_write\n");
log_info( "\t\tbuffer_non_blocking\n"); log_info( "\t\tbuffer_non_blocking\n");
log_info( "\t\timage2d_read_non_blocking\n"); log_info( "\t\timage2d_read_non_blocking\n");
log_info( "\t\timage2d_write_non_blocking\n"); log_info( "\t\timage2d_write_non_blocking\n");
log_info( "\t\tall (runs all of the above in sequence)\n" ); log_info( "\t\tall (runs all of the above in sequence)\n" );
log_info( "\tdo_not_force_fill - Disable explicitly writing data to all memory objects after creating them.\n" ); log_info( "\tdo_not_force_fill - Disable explicitly writing data to all memory objects after creating them.\n" );
log_info( "\t Without this, the kernel execution can not verify its checksum.\n" ); log_info( "\t Without this, the kernel execution can not verify its checksum.\n" );
log_info( "\tdo_not_execute - Disable executing a kernel that accesses all of the memory objects.\n" ); log_info( "\tdo_not_execute - Disable executing a kernel that accesses all of the memory objects.\n" );
} }
int init_cl() { int init_cl() {
cl_platform_id platform; cl_platform_id platform;
int error; int error;
error = clGetPlatformIDs(1, &platform, NULL); error = clGetPlatformIDs(1, &platform, NULL);
test_error(error, "clGetPlatformIDs failed"); test_error(error, "clGetPlatformIDs failed");
error = clGetDeviceIDs(platform, g_device_type, 1, &g_device_id, NULL); error = clGetDeviceIDs(platform, g_device_type, 1, &g_device_id, NULL);
test_error(error, "clGetDeviceIDs failed"); test_error(error, "clGetDeviceIDs failed");
/* Create a context */ /* Create a context */
g_context = clCreateContext( NULL, 1, &g_device_id, notify_callback, NULL, &error ); g_context = clCreateContext( NULL, 1, &g_device_id, notify_callback, NULL, &error );
test_error(error, "clCreateContext failed"); test_error(error, "clCreateContext failed");
/* Create command queue */ /* Create command queue */
g_queue = clCreateCommandQueue( g_context, g_device_id, 0, &error ); g_queue = clCreateCommandQueue( g_context, g_device_id, 0, &error );
test_error(error, "clCreateCommandQueue failed"); test_error(error, "clCreateCommandQueue failed");
return error; return error;
} }
int main(int argc, const char *argv[]) int main(int argc, const char *argv[])
{ {
int error; int error;
int count; int count;
cl_mem mems[MAX_NUMBER_TO_ALLOCATE]; cl_mem mems[MAX_NUMBER_TO_ALLOCATE];
cl_ulong max_individual_allocation_size, global_mem_size; cl_ulong max_individual_allocation_size, global_mem_size;
char str[ 128 ], *endPtr; char str[ 128 ], *endPtr;
int r; int r;
int number_of_mems_used; int number_of_mems_used;
int failure_counts = 0; int failure_counts = 0;
int test, test_to_run = 0; int test, test_to_run = 0;
int randomize = 0; int randomize = 0;
size_t final_size, max_size, current_test_size; size_t final_size, max_size, current_test_size;
test_start(); test_start();
// Parse arguments // Parse arguments
checkDeviceTypeOverride( &g_device_type ); checkDeviceTypeOverride( &g_device_type );
for( int i = 1; i < argc; i++ ) for( int i = 1; i < argc; i++ )
{ {
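strncpy( str, argv[ i ], sizeof( str ) - 1 ); strncpy( str, argv[ i ], sizeof( str ) - 1 );
str[ sizeof( str ) - 1 ] = '\0'; // strncpy does not terminate the copy when the argument fills the buffer str[ sizeof( str ) - 1 ] = '\0'; // strncpy does not terminate the copy when the argument fills the buffer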
if( strcmp( str, "cpu" ) == 0 || strcmp( str, "CL_DEVICE_TYPE_CPU" ) == 0 ) if( strcmp( str, "cpu" ) == 0 || strcmp( str, "CL_DEVICE_TYPE_CPU" ) == 0 )
g_device_type = CL_DEVICE_TYPE_CPU; g_device_type = CL_DEVICE_TYPE_CPU;
else if( strcmp( str, "gpu" ) == 0 || strcmp( str, "CL_DEVICE_TYPE_GPU" ) == 0 ) else if( strcmp( str, "gpu" ) == 0 || strcmp( str, "CL_DEVICE_TYPE_GPU" ) == 0 )
g_device_type = CL_DEVICE_TYPE_GPU; g_device_type = CL_DEVICE_TYPE_GPU;
else if( strcmp( str, "accelerator" ) == 0 || strcmp( str, "CL_DEVICE_TYPE_ACCELERATOR" ) == 0 ) else if( strcmp( str, "accelerator" ) == 0 || strcmp( str, "CL_DEVICE_TYPE_ACCELERATOR" ) == 0 )
g_device_type = CL_DEVICE_TYPE_ACCELERATOR; g_device_type = CL_DEVICE_TYPE_ACCELERATOR;
else if( strcmp( str, "CL_DEVICE_TYPE_DEFAULT" ) == 0 ) else if( strcmp( str, "CL_DEVICE_TYPE_DEFAULT" ) == 0 )
g_device_type = CL_DEVICE_TYPE_DEFAULT; g_device_type = CL_DEVICE_TYPE_DEFAULT;
else if( strcmp( str, "multiple" ) == 0 ) else if( strcmp( str, "multiple" ) == 0 )
g_multiple_allocations = 1; g_multiple_allocations = 1;
else if( strcmp( str, "randomize" ) == 0 ) else if( strcmp( str, "randomize" ) == 0 )
randomize = 1; randomize = 1;
else if( strcmp( str, "single" ) == 0 ) else if( strcmp( str, "single" ) == 0 )
g_multiple_allocations = 0; g_multiple_allocations = 0;
else if( ( r = (int)strtol( str, &endPtr, 10 ) ) && ( endPtr != str ) && ( *endPtr == 0 ) ) else if( ( r = (int)strtol( str, &endPtr, 10 ) ) && ( endPtr != str ) && ( *endPtr == 0 ) )
{ {
// By spec, that means the entire string was an integer, so take it as a repetition count // By spec, that means the entire string was an integer, so take it as a repetition count
g_repetition_count = r; g_repetition_count = r;
} }
else if( strcmp( str, "all" ) == 0 ) else if( strcmp( str, "all" ) == 0 )
{ {
g_tests_to_run = BUFFER | IMAGE_READ | IMAGE_WRITE | BUFFER_NON_BLOCKING | IMAGE_READ_NON_BLOCKING | IMAGE_WRITE_NON_BLOCKING; g_tests_to_run = BUFFER | IMAGE_READ | IMAGE_WRITE | BUFFER_NON_BLOCKING | IMAGE_READ_NON_BLOCKING | IMAGE_WRITE_NON_BLOCKING;
} }
else if( strchr( str, '%' ) != NULL ) else if( strchr( str, '%' ) != NULL )
{ {
// Reduction percentage (let strtol ignore the percentage) // Reduction percentage (let strtol ignore the percentage)
g_reduction_percentage = (int)strtol( str, NULL, 10 ); g_reduction_percentage = (int)strtol( str, NULL, 10 );
} }
else if( g_tests_to_run == 0 ) else if( g_tests_to_run == 0 )
{ {
if( strcmp( str, "buffer" ) == 0 ) if( strcmp( str, "buffer" ) == 0 )
{ {
g_tests_to_run |= BUFFER; g_tests_to_run |= BUFFER;
} }
else if( strcmp( str, "image2d_read" ) == 0 ) else if( strcmp( str, "image2d_read" ) == 0 )
{ {
g_tests_to_run |= IMAGE_READ; g_tests_to_run |= IMAGE_READ;
} }
else if( strcmp( str, "image2d_write" ) == 0 ) else if( strcmp( str, "image2d_write" ) == 0 )
{ {
g_tests_to_run |= IMAGE_WRITE; g_tests_to_run |= IMAGE_WRITE;
} }
else if( strcmp( str, "buffer_non_blocking" ) == 0 ) else if( strcmp( str, "buffer_non_blocking" ) == 0 )
{ {
g_tests_to_run |= BUFFER_NON_BLOCKING; g_tests_to_run |= BUFFER_NON_BLOCKING;
} }
else if( strcmp( str, "image2d_read_non_blocking" ) == 0 ) else if( strcmp( str, "image2d_read_non_blocking" ) == 0 )
{ {
g_tests_to_run |= IMAGE_READ_NON_BLOCKING; g_tests_to_run |= IMAGE_READ_NON_BLOCKING;
} }
else if( strcmp( str, "image2d_write_non_blocking" ) == 0 ) else if( strcmp( str, "image2d_write_non_blocking" ) == 0 )
{ {
g_tests_to_run |= IMAGE_WRITE_NON_BLOCKING; g_tests_to_run |= IMAGE_WRITE_NON_BLOCKING;
} }
if( g_tests_to_run == 0 ) if( g_tests_to_run == 0 )
break; // Argument is invalid; break to print usage break; // Argument is invalid; break to print usage
} }
else if( strcmp( str, "do_not_force_fill" ) == 0 ) else if( strcmp( str, "do_not_force_fill" ) == 0 )
{ {
g_write_allocations = 0; g_write_allocations = 0;
} }
else if( strcmp( str, "do_not_execute" ) == 0 ) else if( strcmp( str, "do_not_execute" ) == 0 )
{ {
g_execute_kernel = 0; g_execute_kernel = 0;
} }
} }
if( randomize ) if( randomize )
{ {
gRandomSeed = (cl_uint) clock(); gRandomSeed = (cl_uint) clock();
gReSeed = 1; gReSeed = 1;
} }
if( g_tests_to_run == 0 ) if( g_tests_to_run == 0 )
{ {
// Allocation type was never specified, or one of the arguments was invalid. Print usage and bail // Allocation type was never specified, or one of the arguments was invalid. Print usage and bail
printUsage( argv[ 0 ] ); printUsage( argv[ 0 ] );
return -1; return -1;
} }
// All ready to go, so set up an environment // All ready to go, so set up an environment
error = init_cl(); error = init_cl();
if (error) { if (error) {
test_finish(); test_finish();
return -1; return -1;
} }
if( printDeviceHeader( g_device_id ) != CL_SUCCESS ) if( printDeviceHeader( g_device_id ) != CL_SUCCESS )
{ {
test_finish(); test_finish();
return -1; return -1;
} }
error = clGetDeviceInfo(g_device_id, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(max_individual_allocation_size), &max_individual_allocation_size, NULL); error = clGetDeviceInfo(g_device_id, CL_DEVICE_MAX_MEM_ALLOC_SIZE, sizeof(max_individual_allocation_size), &max_individual_allocation_size, NULL);
if ( error ) { if ( error ) {
print_error( error, "clGetDeviceInfo failed for CL_DEVICE_MAX_MEM_ALLOC_SIZE"); print_error( error, "clGetDeviceInfo failed for CL_DEVICE_MAX_MEM_ALLOC_SIZE");
test_finish(); test_finish();
return -1; return -1;
} }
error = clGetDeviceInfo(g_device_id, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(global_mem_size), &global_mem_size, NULL); error = clGetDeviceInfo(g_device_id, CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(global_mem_size), &global_mem_size, NULL);
if ( error ) { if ( error ) {
print_error( error, "clGetDeviceInfo failed for CL_DEVICE_GLOBAL_MEM_SIZE"); print_error( error, "clGetDeviceInfo failed for CL_DEVICE_GLOBAL_MEM_SIZE");
test_finish(); test_finish();
return -1; return -1;
} }
log_info("Device reports CL_DEVICE_MAX_MEM_ALLOC_SIZE=%llu bytes (%gMB), CL_DEVICE_GLOBAL_MEM_SIZE=%llu bytes (%gMB).\n", log_info("Device reports CL_DEVICE_MAX_MEM_ALLOC_SIZE=%llu bytes (%gMB), CL_DEVICE_GLOBAL_MEM_SIZE=%llu bytes (%gMB).\n",
max_individual_allocation_size, toMB(max_individual_allocation_size), max_individual_allocation_size, toMB(max_individual_allocation_size),
global_mem_size, toMB(global_mem_size)); global_mem_size, toMB(global_mem_size));
if( max_individual_allocation_size > global_mem_size ) if( max_individual_allocation_size > global_mem_size )
{ {
log_error( "FAILURE: CL_DEVICE_MAX_MEM_ALLOC_SIZE (%llu) is greater than the CL_DEVICE_GLOBAL_MEM_SIZE (%llu)\n", max_individual_allocation_size, global_mem_size ); log_error( "FAILURE: CL_DEVICE_MAX_MEM_ALLOC_SIZE (%llu) is greater than the CL_DEVICE_GLOBAL_MEM_SIZE (%llu)\n", max_individual_allocation_size, global_mem_size );
test_finish(); test_finish();
return -1; return -1;
} }
// We may need to back off the global_mem_size on unified memory devices to leave room for application and operating system code // We may need to back off the global_mem_size on unified memory devices to leave room for application and operating system code
// and associated data in the working set, so we don't start pathologically paging. // and associated data in the working set, so we don't start pathologically paging.
// Check to see if we are a unified memory device // Check to see if we are a unified memory device
cl_bool hasUnifiedMemory = CL_FALSE; cl_bool hasUnifiedMemory = CL_FALSE;
if( ( error = clGetDeviceInfo( g_device_id, CL_DEVICE_HOST_UNIFIED_MEMORY, sizeof( hasUnifiedMemory ), &hasUnifiedMemory, NULL ))) if( ( error = clGetDeviceInfo( g_device_id, CL_DEVICE_HOST_UNIFIED_MEMORY, sizeof( hasUnifiedMemory ), &hasUnifiedMemory, NULL )))
{ {
print_error( error, "clGetDeviceInfo failed for CL_DEVICE_HOST_UNIFIED_MEMORY"); print_error( error, "clGetDeviceInfo failed for CL_DEVICE_HOST_UNIFIED_MEMORY");
test_finish(); test_finish();
return -1; return -1;
} }
// we share unified memory so back off to 3/4 the global memory size. // we share unified memory so back off to 3/4 the global memory size.
if( CL_TRUE == hasUnifiedMemory ) if( CL_TRUE == hasUnifiedMemory )
{ {
global_mem_size -= global_mem_size /4; global_mem_size -= global_mem_size /4;
log_info( "Device shares memory with the host, so backing off the maximum combined allocation size to be %gMB to avoid rampant paging.\n", toMB( global_mem_size ) ); log_info( "Device shares memory with the host, so backing off the maximum combined allocation size to be %gMB to avoid rampant paging.\n", toMB( global_mem_size ) );
} }
// Pick the baseline size based on whether we are doing a single large or multiple allocations // Pick the baseline size based on whether we are doing a single large or multiple allocations
if (!g_multiple_allocations) { if (!g_multiple_allocations) {
max_size = (size_t)max_individual_allocation_size; max_size = (size_t)max_individual_allocation_size;
} else { } else {
max_size = (size_t)global_mem_size; max_size = (size_t)global_mem_size;
} }
// Adjust based on the percentage // Adjust based on the percentage
if (g_reduction_percentage != 100) { if (g_reduction_percentage != 100) {
log_info("NOTE: reducing max allocations to %d%%.\n", g_reduction_percentage); log_info("NOTE: reducing max allocations to %d%%.\n", g_reduction_percentage);
max_size = (size_t)((double)max_size * (double)g_reduction_percentage/100.0); max_size = (size_t)((double)max_size * (double)g_reduction_percentage/100.0);
} }
// Round down to a whole number of MB (the mask clears the low 20 bits). // Round down to a whole number of MB (the mask clears the low 20 bits).
max_size &= (size_t)(0xFFFFFFFFFF00000ULL); max_size &= (size_t)(0xFFFFFFFFFF00000ULL);
log_info("** Target allocation size (rounded down to a whole MB) is: %lu bytes (%gMB).\n", max_size, toMB(max_size)); log_info("** Target allocation size (rounded down to a whole MB) is: %lu bytes (%gMB).\n", max_size, toMB(max_size));
// Run all the requested tests // Run all the requested tests
RandomSeed seed( gRandomSeed ); RandomSeed seed( gRandomSeed );
for (test=0; test<6; test++) { for (test=0; test<6; test++) {
if (test == 0) test_to_run = BUFFER; if (test == 0) test_to_run = BUFFER;
if (test == 1) test_to_run = IMAGE_READ; if (test == 1) test_to_run = IMAGE_READ;
if (test == 2) test_to_run = IMAGE_WRITE; if (test == 2) test_to_run = IMAGE_WRITE;
if (test == 3) test_to_run = BUFFER_NON_BLOCKING; if (test == 3) test_to_run = BUFFER_NON_BLOCKING;
if (test == 4) test_to_run = IMAGE_READ_NON_BLOCKING; if (test == 4) test_to_run = IMAGE_READ_NON_BLOCKING;
if (test == 5) test_to_run = IMAGE_WRITE_NON_BLOCKING; if (test == 5) test_to_run = IMAGE_WRITE_NON_BLOCKING;
if (!(g_tests_to_run & test_to_run)) if (!(g_tests_to_run & test_to_run))
continue; continue;
// Skip image tests if we don't support images on the device // Skip image tests if we don't support images on the device
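if (test_to_run != BUFFER && test_to_run != BUFFER_NON_BLOCKING && checkForImageSupport(g_device_id)) { if (test_to_run != BUFFER && test_to_run != BUFFER_NON_BLOCKING && checkForImageSupport(g_device_id)) {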
log_info("Can not test image allocation because device does not support images.\n"); log_info("Can not test image allocation because device does not support images.\n");
continue; continue;
} }
if (test_to_run == BUFFER || test_to_run == BUFFER_NON_BLOCKING) log_info("** Allocating buffer(s) to size %gMB.\n", toMB(max_size)); if (test_to_run == BUFFER || test_to_run == BUFFER_NON_BLOCKING) log_info("** Allocating buffer(s) to size %gMB.\n", toMB(max_size));
else if (test_to_run == IMAGE_READ || test_to_run == IMAGE_READ_NON_BLOCKING) log_info("** Allocating read-only image(s) to size %gMB.\n", toMB(max_size)); else if (test_to_run == IMAGE_READ || test_to_run == IMAGE_READ_NON_BLOCKING) log_info("** Allocating read-only image(s) to size %gMB.\n", toMB(max_size));
else if (test_to_run == IMAGE_WRITE || test_to_run == IMAGE_WRITE_NON_BLOCKING) log_info("** Allocating write-only image(s) to size %gMB.\n", toMB(max_size)); else if (test_to_run == IMAGE_WRITE || test_to_run == IMAGE_WRITE_NON_BLOCKING) log_info("** Allocating write-only image(s) to size %gMB.\n", toMB(max_size));
else {log_error("Test logic error.\n"); return -1;} else {log_error("Test logic error.\n"); return -1;}
// Run the test the requested number of times // Run the test the requested number of times
for (count = 0; count < g_repetition_count; count++) { for (count = 0; count < g_repetition_count; count++) {
current_test_size = max_size; current_test_size = max_size;
error = FAILED_TOO_BIG; error = FAILED_TOO_BIG;
log_info(" => Allocation %d\n", count+1); log_info(" => Allocation %d\n", count+1);
while (error == FAILED_TOO_BIG && current_test_size > max_size/8) { while (error == FAILED_TOO_BIG && current_test_size > max_size/8) {
// Reset our checksum for each allocation // Reset our checksum for each allocation
checksum = 0; checksum = 0;
// Do the allocation // Do the allocation
error = allocate_size(g_context, &g_queue, g_device_id, g_multiple_allocations, current_test_size, test_to_run, mems, &number_of_mems_used, &final_size, g_write_allocations, seed); error = allocate_size(g_context, &g_queue, g_device_id, g_multiple_allocations, current_test_size, test_to_run, mems, &number_of_mems_used, &final_size, g_write_allocations, seed);
// If we succeeded and we're supposed to execute a kernel, do so. // If we succeeded and we're supposed to execute a kernel, do so.
if (error == SUCCEEDED && g_execute_kernel) { if (error == SUCCEEDED && g_execute_kernel) {
log_info("\tExecuting kernel with memory objects.\n"); log_info("\tExecuting kernel with memory objects.\n");
error = execute_kernel(g_context, &g_queue, g_device_id, test_to_run, mems, number_of_mems_used, g_write_allocations); error = execute_kernel(g_context, &g_queue, g_device_id, test_to_run, mems, number_of_mems_used, g_write_allocations);
} }
// If we failed to allocate more than 1/8th of the requested amount return a failure. // If we failed to allocate more than 1/8th of the requested amount return a failure.
if (final_size < (size_t)max_size/8) { if (final_size < (size_t)max_size/8) {
// log_error("===> Allocation %d failed to allocate more than 1/8th of the requested size.\n", count+1); // log_error("===> Allocation %d failed to allocate more than 1/8th of the requested size.\n", count+1);
failure_counts++; failure_counts++;
} }
// Clean up. // Clean up.
for (int i=0; i<number_of_mems_used; i++) for (int i=0; i<number_of_mems_used; i++)
clReleaseMemObject(mems[i]); clReleaseMemObject(mems[i]);
if (error == FAILED_ABORT) { if (error == FAILED_ABORT) {
log_error(" => Allocation %d failed.\n", count+1); log_error(" => Allocation %d failed.\n", count+1);
failure_counts++; failure_counts++;
} }
if (error == FAILED_TOO_BIG) { if (error == FAILED_TOO_BIG) {
current_test_size -= max_size/16; current_test_size -= max_size/16;
log_info("\tFailed at this size; trying a smaller size of %gMB.\n", toMB(current_test_size)); log_info("\tFailed at this size; trying a smaller size of %gMB.\n", toMB(current_test_size));
} }
} }
if (error == SUCCEEDED && current_test_size == max_size) if (error == SUCCEEDED && current_test_size == max_size)
log_info("\tPASS: Allocation succeeded.\n"); log_info("\tPASS: Allocation succeeded.\n");
else if (error == SUCCEEDED && current_test_size > max_size/8) else if (error == SUCCEEDED && current_test_size > max_size/8)
log_info("\tPASS: Allocation succeeded at reduced size.\n"); log_info("\tPASS: Allocation succeeded at reduced size.\n");
else { else {
log_error("\tFAIL: Allocation failed.\n"); log_error("\tFAIL: Allocation failed.\n");
failure_counts++; failure_counts++;
} }
} }
} }
if (failure_counts) if (failure_counts)
log_error("FAILED allocations test.\n"); log_error("FAILED allocations test.\n");
else else
log_info("PASSED allocations test.\n"); log_info("PASSED allocations test.\n");
test_finish(); test_finish();
return failure_counts; return failure_counts;
} }
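
The sizing logic in the function above can be sketched in isolation. The following is a minimal illustration, not the test's actual code: the helper names round_down_to_mb and count_backoff_attempts are hypothetical, and the mask used here clears only the low 20 bits, whereas the literal in the source has 15 hex digits and therefore also clears the top four bits of a 64-bit value. The second helper assumes every attempt fails with FAILED_TOO_BIG and simply counts how many sizes the retry loop would try before giving up at 1/8th of the target.

```c
#include <stdint.h>

/* Round a byte count down to a whole number of MB (2^20 bytes).
   Assumption: only the low 20 bits are cleared. */
static uint64_t round_down_to_mb(uint64_t bytes)
{
    return bytes & ~((UINT64_C(1) << 20) - 1);
}

/* Hypothetical helper: number of sizes the back-off loop would try if
   every allocation came back as "too big". Mirrors the loop shape in the
   diff: start at max_size, step down by max_size/16, stop once the
   current size is no longer above max_size/8. */
static int count_backoff_attempts(uint64_t max_size)
{
    uint64_t step = max_size / 16;
    uint64_t floor_size = max_size / 8;
    uint64_t current = max_size;
    int attempts = 0;

    if (step == 0) {
        /* Guard added for this sketch only: with a zero step the loop
           would never terminate, so count at most one attempt. */
        return current > floor_size ? 1 : 0;
    }
    while (current > floor_size) {
        attempts++;
        current -= step;
    }
    return attempts;
}
```

With a target that is a multiple of 16, the loop tries 14 sizes (16/16 down to 3/16 of the target) before the 1/8th floor stops it.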

@@ -1,62 +1,62 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _testBase_h #ifndef _testBase_h
#define _testBase_h #define _testBase_h
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
#include <math.h> #include <math.h>
#include <string.h> #include <string.h>
#if !defined(_WIN32) #if !defined(_WIN32)
#include <stdbool.h> #include <stdbool.h>
#endif #endif
#include <sys/types.h> #include <sys/types.h>
#include <sys/stat.h> #include <sys/stat.h>
#if !defined(_WIN32) #if !defined(_WIN32)
#include <unistd.h> #include <unistd.h>
#endif #endif
#include "../../test_common/harness/errorHelpers.h" #include "../../test_common/harness/errorHelpers.h"
#include "../../test_common/harness/kernelHelpers.h" #include "../../test_common/harness/kernelHelpers.h"
#include "../../test_common/harness/typeWrappers.h" #include "../../test_common/harness/typeWrappers.h"
#include "../../test_common/harness/testHarness.h" #include "../../test_common/harness/testHarness.h"
#define MAX_NUMBER_TO_ALLOCATE 100 #define MAX_NUMBER_TO_ALLOCATE 100
#define FAILED_CORRUPTED_QUEUE -2 #define FAILED_CORRUPTED_QUEUE -2
#define FAILED_ABORT -1 #define FAILED_ABORT -1
#define FAILED_TOO_BIG 1 #define FAILED_TOO_BIG 1
#define SUCCEEDED 0 #define SUCCEEDED 0
#define BUFFER 1 #define BUFFER 1
#define IMAGE_READ 2 #define IMAGE_READ 2
#define IMAGE_WRITE 4 #define IMAGE_WRITE 4
#define BUFFER_NON_BLOCKING 8 #define BUFFER_NON_BLOCKING 8
#define IMAGE_READ_NON_BLOCKING 16 #define IMAGE_READ_NON_BLOCKING 16
#define IMAGE_WRITE_NON_BLOCKING 32 #define IMAGE_WRITE_NON_BLOCKING 32
#define test_error_abort(errCode,msg) test_error_ret_abort(errCode,msg,errCode) #define test_error_abort(errCode,msg) test_error_ret_abort(errCode,msg,errCode)
#define test_error_ret_abort(errCode,msg,retValue) { if( errCode != CL_SUCCESS ) { print_error( errCode, msg ); return FAILED_ABORT ; } } #define test_error_ret_abort(errCode,msg,retValue) { if( errCode != CL_SUCCESS ) { print_error( errCode, msg ); return FAILED_ABORT ; } }
#endif // _testBase_h #endif // _testBase_h
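
The test-kind constants in this header are powers of two, so they can be combined into a selection mask and tested with a bitwise AND, which is how the main loop uses them. The sketch below copies the six values from the header; the three helper functions are hypothetical names introduced only for illustration and do not exist in the test suite.

```c
/* Values copied from the header above. */
#define BUFFER                   1
#define IMAGE_READ               2
#define IMAGE_WRITE              4
#define BUFFER_NON_BLOCKING      8
#define IMAGE_READ_NON_BLOCKING  16
#define IMAGE_WRITE_NON_BLOCKING 32

/* Hypothetical helper: map a loop index 0..5 to its test kind.
   Works because the kinds are consecutive powers of two. */
static int test_kind_for_index(int index)
{
    return 1 << index;
}

/* Hypothetical helper: non-zero if `kind` is enabled in the mask. */
static int is_selected(int tests_to_run, int kind)
{
    return (tests_to_run & kind) != 0;
}

/* Hypothetical helper: non-zero for the four image kinds, zero for
   both buffer kinds (blocking and non-blocking). */
static int is_image_kind(int kind)
{
    const int image_mask = IMAGE_READ | IMAGE_WRITE |
                           IMAGE_READ_NON_BLOCKING | IMAGE_WRITE_NON_BLOCKING;
    return (kind & image_mask) != 0;
}
```

Classifying by kind rather than by loop index keeps BUFFER_NON_BLOCKING (index 3) out of the image-only path.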

@@ -14,6 +14,7 @@ add_executable(conformance_test_api
test_platform.cpp test_platform.cpp
test_retain.cpp test_retain.cpp
test_device_min_data_type_align_size_alignment.cpp test_device_min_data_type_align_size_alignment.cpp
test_queue_properties.cpp
test_mem_objects.cpp test_mem_objects.cpp
test_bool.c test_bool.c
test_null_buffer_arg.c test_null_buffer_arg.c

@@ -1,27 +1,27 @@
project project
: requirements : requirements
<toolset>gcc:<cflags>-xc++ <toolset>gcc:<cflags>-xc++
<toolset>msvc:<cflags>"/TP" <toolset>msvc:<cflags>"/TP"
; ;
exe test_api exe test_api
: main.c : main.c
test_api_min_max.c test_api_min_max.c
test_binary.cpp test_binary.cpp
test_create_kernels.c test_create_kernels.c
test_create_context_from_type.cpp test_create_context_from_type.cpp
test_kernel_arg_changes.cpp test_kernel_arg_changes.cpp
test_kernel_arg_multi_setup.cpp test_kernel_arg_multi_setup.cpp
test_kernels.c test_kernels.c
test_native_kernel.cpp test_native_kernel.cpp
test_queries.cpp test_queries.cpp
test_retain_program.c test_retain_program.c
test_platform.cpp test_platform.cpp
; ;
install dist install dist
: test_api #test.lst : test_api #test.lst
: <variant>debug:<location>$(DIST)/debug/tests/test_conformance/api : <variant>debug:<location>$(DIST)/debug/tests/test_conformance/api
<variant>release:<location>$(DIST)/release/tests/test_conformance/api <variant>release:<location>$(DIST)/release/tests/test_conformance/api
; ;

@@ -1,61 +1,61 @@
ifdef BUILD_WITH_ATF ifdef BUILD_WITH_ATF
ATF = -framework ATF ATF = -framework ATF
USE_ATF = -DUSE_ATF USE_ATF = -DUSE_ATF
endif endif
SRCS = main.c \ SRCS = main.c \
test_retain_program.c \ test_retain_program.c \
test_queries.cpp \ test_queries.cpp \
test_create_kernels.c \ test_create_kernels.c \
test_kernels.c \ test_kernels.c \
test_kernel_arg_info.c \ test_kernel_arg_info.c \
test_api_min_max.c \ test_api_min_max.c \
test_kernel_arg_changes.cpp \ test_kernel_arg_changes.cpp \
test_kernel_arg_multi_setup.cpp \ test_kernel_arg_multi_setup.cpp \
test_binary.cpp \ test_binary.cpp \
test_native_kernel.cpp \ test_native_kernel.cpp \
test_create_context_from_type.cpp \ test_create_context_from_type.cpp \
test_platform.cpp \ test_platform.cpp \
test_retain.cpp \ test_retain.cpp \
test_device_min_data_type_align_size_alignment.cpp \ test_device_min_data_type_align_size_alignment.cpp \
test_mem_objects.cpp \ test_mem_objects.cpp \
test_bool.c \ test_bool.c \
test_null_buffer_arg.c \ test_null_buffer_arg.c \
test_mem_object_info.cpp \ test_mem_object_info.cpp \
../../test_common/harness/errorHelpers.c \ ../../test_common/harness/errorHelpers.c \
../../test_common/harness/threadTesting.c \ ../../test_common/harness/threadTesting.c \
../../test_common/harness/testHarness.c \ ../../test_common/harness/testHarness.c \
../../test_common/harness/imageHelpers.cpp \ ../../test_common/harness/imageHelpers.cpp \
../../test_common/harness/kernelHelpers.c \ ../../test_common/harness/kernelHelpers.c \
../../test_common/harness/typeWrappers.cpp \ ../../test_common/harness/typeWrappers.cpp \
../../test_common/harness/mt19937.c \ ../../test_common/harness/mt19937.c \
../../test_common/harness/conversions.c ../../test_common/harness/conversions.c
DEFINES = DONT_TEST_GARBAGE_POINTERS DEFINES = DONT_TEST_GARBAGE_POINTERS
SOURCES = $(abspath $(SRCS)) SOURCES = $(abspath $(SRCS))
LIBPATH += -L/System/Library/Frameworks/OpenCL.framework/Libraries LIBPATH += -L/System/Library/Frameworks/OpenCL.framework/Libraries
LIBPATH += -L. LIBPATH += -L.
HEADERS = HEADERS =
TARGET = test_api TARGET = test_api
INCLUDE = INCLUDE =
COMPILERFLAGS = -c -Wall -g -Wshorten-64-to-32 COMPILERFLAGS = -c -Wall -g -Wshorten-64-to-32
CC = c++ CC = c++
CFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE) CFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE)
CXXFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE) CXXFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE)
LIBRARIES = -framework OpenCL -framework OpenGL -framework GLUT -framework AppKit ${ATF} LIBRARIES = -framework OpenCL -framework OpenGL -framework GLUT -framework AppKit ${ATF}
OBJECTS := ${SOURCES:.c=.o} OBJECTS := ${SOURCES:.c=.o}
OBJECTS := ${OBJECTS:.cpp=.o} OBJECTS := ${OBJECTS:.cpp=.o}
TARGETOBJECT = TARGETOBJECT =
all: $(TARGET) all: $(TARGET)
$(TARGET): $(OBJECTS) $(TARGET): $(OBJECTS)
$(CC) $(RC_CFLAGS) $(OBJECTS) -o $@ $(LIBPATH) $(LIBRARIES) $(CC) $(RC_CFLAGS) $(OBJECTS) -o $@ $(LIBPATH) $(LIBRARIES)
clean: clean:
rm -f $(TARGET) $(OBJECTS) rm -f $(TARGET) $(OBJECTS)
.DEFAULT: .DEFAULT:
@echo The target \"$@\" does not exist in Makefile. @echo The target \"$@\" does not exist in Makefile.

@@ -1,215 +1,216 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
#if !defined(_WIN32) #if !defined(_WIN32)
#include <stdbool.h> #include <stdbool.h>
#endif #endif
#include <math.h> #include <math.h>
#include <string.h> #include <string.h>
#include "procs.h" #include "procs.h"
#include "../../test_common/harness/testHarness.h" #include "../../test_common/harness/testHarness.h"
#if !defined(_WIN32) #if !defined(_WIN32)
#include <unistd.h> #include <unistd.h>
#endif #endif
basefn basefn_list[] = { basefn basefn_list[] = {
test_get_platform_info, test_get_platform_info,
test_get_sampler_info, test_get_sampler_info,
test_get_command_queue_info, test_get_command_queue_info,
test_get_context_info, test_get_context_info,
test_get_device_info, test_get_device_info,
test_enqueue_task, test_enqueue_task,
test_binary_get, test_binary_get,
test_program_binary_create, test_program_binary_create,
test_kernel_required_group_size, test_kernel_required_group_size,
test_release_kernel_order, test_release_kernel_order,
test_release_during_execute, test_release_during_execute,
test_load_single_kernel, test_load_single_kernel,
test_load_two_kernels, test_load_two_kernels,
test_load_two_kernels_in_one, test_load_two_kernels_in_one,
test_load_two_kernels_manually, test_load_two_kernels_manually,
test_get_program_info_kernel_names, test_get_program_info_kernel_names,
test_get_kernel_arg_info, test_get_kernel_arg_info,
test_create_kernels_in_program, test_create_kernels_in_program,
test_get_kernel_info, test_get_kernel_info,
test_execute_kernel_local_sizes, test_execute_kernel_local_sizes,
test_set_kernel_arg_by_index, test_set_kernel_arg_by_index,
test_set_kernel_arg_constant, test_set_kernel_arg_constant,
test_set_kernel_arg_struct_array, test_set_kernel_arg_struct_array,
test_kernel_global_constant, test_kernel_global_constant,
test_min_max_thread_dimensions, test_min_max_thread_dimensions,
test_min_max_work_items_sizes, test_min_max_work_items_sizes,
test_min_max_work_group_size, test_min_max_work_group_size,
test_min_max_read_image_args, test_min_max_read_image_args,
test_min_max_write_image_args, test_min_max_write_image_args,
test_min_max_mem_alloc_size, test_min_max_mem_alloc_size,
test_min_max_image_2d_width, test_min_max_image_2d_width,
test_min_max_image_2d_height, test_min_max_image_2d_height,
test_min_max_image_3d_width, test_min_max_image_3d_width,
test_min_max_image_3d_height, test_min_max_image_3d_height,
test_min_max_image_3d_depth, test_min_max_image_3d_depth,
test_min_max_image_array_size, test_min_max_image_array_size,
test_min_max_image_buffer_size, test_min_max_image_buffer_size,
test_min_max_parameter_size, test_min_max_parameter_size,
test_min_max_samplers, test_min_max_samplers,
test_min_max_constant_buffer_size, test_min_max_constant_buffer_size,
test_min_max_constant_args, test_min_max_constant_args,
test_min_max_compute_units, test_min_max_compute_units,
test_min_max_address_bits, test_min_max_address_bits,
test_min_max_single_fp_config, test_min_max_single_fp_config,
test_min_max_double_fp_config, test_min_max_double_fp_config,
test_min_max_local_mem_size, test_min_max_local_mem_size,
test_min_max_kernel_preferred_work_group_size_multiple, test_min_max_kernel_preferred_work_group_size_multiple,
test_min_max_execution_capabilities, test_min_max_execution_capabilities,
test_min_max_queue_properties, test_min_max_queue_properties,
test_min_max_device_version, test_min_max_device_version,
test_min_max_language_version, test_min_max_language_version,
test_kernel_arg_changes, test_kernel_arg_changes,
test_kernel_arg_multi_setup_random, test_kernel_arg_multi_setup_random,
test_native_kernel, test_native_kernel,
test_create_context_from_type, test_create_context_from_type,
test_platform_extensions, test_platform_extensions,
test_get_platform_ids, test_get_platform_ids,
test_for_bool_type, test_for_bool_type,
test_repeated_setup_cleanup, test_repeated_setup_cleanup,
test_retain_queue_single, test_retain_queue_single,
test_retain_queue_multiple, test_retain_queue_multiple,
test_retain_mem_object_single, test_retain_mem_object_single,
test_retain_mem_object_multiple, test_retain_mem_object_multiple,
test_min_data_type_align_size_alignment, test_min_data_type_align_size_alignment,
test_mem_object_destructor_callback, test_mem_object_destructor_callback,
test_null_buffer_arg, test_null_buffer_arg,
test_get_buffer_info, test_get_buffer_info,
test_get_image2d_info, test_get_image2d_info,
test_get_image3d_info, test_get_image3d_info,
test_get_image1d_info, test_get_image1d_info,
test_get_image1d_array_info, test_get_image1d_array_info,
test_get_image2d_array_info, test_get_image2d_array_info,
test_queue_properties,
}; };
const char *basefn_names[] = { const char *basefn_names[] = {
"get_platform_info", "get_platform_info",
"get_sampler_info", "get_sampler_info",
"get_command_queue_info", "get_command_queue_info",
"get_context_info", "get_context_info",
"get_device_info", "get_device_info",
"enqueue_task", "enqueue_task",
"binary_get", "binary_get",
"binary_create", "binary_create",
"kernel_required_group_size", "kernel_required_group_size",
"release_kernel_order", "release_kernel_order",
"release_during_execute", "release_during_execute",
"load_single_kernel", "load_single_kernel",
"load_two_kernels", "load_two_kernels",
"load_two_kernels_in_one", "load_two_kernels_in_one",
"load_two_kernels_manually", "load_two_kernels_manually",
"get_program_info_kernel_names", "get_program_info_kernel_names",
"get_kernel_arg_info", "get_kernel_arg_info",
"create_kernels_in_program", "create_kernels_in_program",
"get_kernel_info", "get_kernel_info",
"execute_kernel_local_sizes", "execute_kernel_local_sizes",
"set_kernel_arg_by_index", "set_kernel_arg_by_index",
"set_kernel_arg_constant", "set_kernel_arg_constant",
"set_kernel_arg_struct_array", "set_kernel_arg_struct_array",
"kernel_global_constant", "kernel_global_constant",
"min_max_thread_dimensions", "min_max_thread_dimensions",
"min_max_work_items_sizes", "min_max_work_items_sizes",
"min_max_work_group_size", "min_max_work_group_size",
"min_max_read_image_args", "min_max_read_image_args",
"min_max_write_image_args", "min_max_write_image_args",
"min_max_mem_alloc_size", "min_max_mem_alloc_size",
"min_max_image_2d_width", "min_max_image_2d_width",
"min_max_image_2d_height", "min_max_image_2d_height",
"min_max_image_3d_width", "min_max_image_3d_width",
"min_max_image_3d_height", "min_max_image_3d_height",
"min_max_image_3d_depth", "min_max_image_3d_depth",
"min_max_image_array_size", "min_max_image_array_size",
"min_max_image_buffer_size", "min_max_image_buffer_size",
"min_max_parameter_size", "min_max_parameter_size",
"min_max_samplers", "min_max_samplers",
"min_max_constant_buffer_size", "min_max_constant_buffer_size",
"min_max_constant_args", "min_max_constant_args",
"min_max_compute_units", "min_max_compute_units",
"min_max_address_bits", "min_max_address_bits",
"min_max_single_fp_config", "min_max_single_fp_config",
"min_max_double_fp_config", "min_max_double_fp_config",
"min_max_local_mem_size", "min_max_local_mem_size",
"min_max_kernel_preferred_work_group_size_multiple", "min_max_kernel_preferred_work_group_size_multiple",
"min_max_execution_capabilities", "min_max_execution_capabilities",
"min_max_queue_properties", "min_max_queue_properties",
"min_max_device_version", "min_max_device_version",
"min_max_language_version", "min_max_language_version",
"kernel_arg_changes", "kernel_arg_changes",
"kernel_arg_multi_setup_random", "kernel_arg_multi_setup_random",
"native_kernel", "native_kernel",
"create_context_from_type", "create_context_from_type",
"platform_extensions", "platform_extensions",
"get_platform_ids", "get_platform_ids",
"bool_type", "bool_type",
"repeated_setup_cleanup", "repeated_setup_cleanup",
"retain_queue_single", "retain_queue_single",
"retain_queue_multiple", "retain_queue_multiple",
"retain_mem_object_single", "retain_mem_object_single",
"retain_mem_object_multiple", "retain_mem_object_multiple",
"min_data_type_align_size_alignment", "min_data_type_align_size_alignment",
"mem_object_destructor_callback", "mem_object_destructor_callback",
"null_buffer_arg", "null_buffer_arg",
"get_buffer_info", "get_buffer_info",
"get_image2d_info", "get_image2d_info",
"get_image3d_info", "get_image3d_info",
"get_image1d_info", "get_image1d_info",
"get_image1d_array_info", "get_image1d_array_info",
"get_image2d_array_info", "get_image2d_array_info",
"queue_properties",
"all", "all",
}; };
ct_assert((sizeof(basefn_names) / sizeof(basefn_names[0]) - 1) == (sizeof(basefn_list) / sizeof(basefn_list[0]))); ct_assert((sizeof(basefn_names) / sizeof(basefn_names[0]) - 1) == (sizeof(basefn_list) / sizeof(basefn_list[0])));
int num_fns = sizeof(basefn_names) / sizeof(char *); int num_fns = sizeof(basefn_names) / sizeof(char *);
int main(int argc, const char *argv[]) int main(int argc, const char *argv[])
{ {
return runTestHarness( argc, argv, num_fns, basefn_list, basefn_names, false, false, 0 ); return runTestHarness( argc, argv, num_fns, basefn_list, basefn_names, false, false, 0 );
} }
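
The ct_assert above guards the invariant that the name table has exactly one more entry than the function table (the trailing "all"), which is why this change adds test_queue_properties and "queue_properties" together. A minimal sketch of the same pattern follows, using C11 _Static_assert in place of the harness's ct_assert macro. All demo_* identifiers and ARRAY_COUNT are hypothetical, and the function signature is simplified to take no arguments.

```c
#include <string.h>

typedef int (*demo_fn)(void);   /* simplified, hypothetical signature */

static int demo_first(void)  { return 0; }
static int demo_second(void) { return 0; }

/* Parallel tables: one name per function, plus a trailing "all". */
static demo_fn demo_list[] = { demo_first, demo_second };
static const char *demo_names[] = { "first", "second", "all" };

#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Compile-time check: adding a function without its name (or the
   reverse) breaks the build instead of misaligning the tables. */
_Static_assert(ARRAY_COUNT(demo_names) - 1 == ARRAY_COUNT(demo_list),
               "each function needs exactly one name, plus \"all\"");

/* Return the index of the named function, or -1 if the name does not
   correspond to a function (including the pseudo-entry "all"). */
static int find_demo_index(const char *name)
{
    for (size_t i = 0; i < ARRAY_COUNT(demo_list); i++) {
        if (strcmp(demo_names[i], name) == 0)
            return (int)i;
    }
    return -1;
}

/* Run one entry by index; the caller must pass a valid index. */
static int run_demo(int index)
{
    return demo_list[index]();
}
```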

@@ -1,108 +1,109 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "../../test_common/harness/errorHelpers.h" #include "../../test_common/harness/errorHelpers.h"
#include "../../test_common/harness/kernelHelpers.h" #include "../../test_common/harness/kernelHelpers.h"
#include "../../test_common/harness/typeWrappers.h" #include "../../test_common/harness/typeWrappers.h"
#include "../../test_common/harness/clImageHelper.h" #include "../../test_common/harness/clImageHelper.h"
#include "../../test_common/harness/imageHelpers.h" #include "../../test_common/harness/imageHelpers.h"
extern float calculate_ulperror(float a, float b); extern float calculate_ulperror(float a, float b);
extern int test_load_single_kernel(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_load_single_kernel(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_load_two_kernels(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_load_two_kernels(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_load_two_kernels_in_one(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_load_two_kernels_in_one(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_load_two_kernels_manually(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_load_two_kernels_manually(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_get_program_info_kernel_names( cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_get_program_info_kernel_names( cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_create_kernels_in_program(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_create_kernels_in_program(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_enqueue_task(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_enqueue_task(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_repeated_setup_cleanup(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_repeated_setup_cleanup(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_for_bool_type(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_for_bool_type(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_platform_extensions(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_platform_extensions(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_get_platform_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_get_platform_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_get_sampler_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_get_sampler_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_get_command_queue_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_get_command_queue_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_get_context_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_get_context_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_get_device_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_get_device_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_kernel_required_group_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_kernel_required_group_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_binary_get(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_binary_get(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_program_binary_create(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_program_binary_create(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_release_kernel_order(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_release_kernel_order(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_release_during_execute(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_release_during_execute(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_get_kernel_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_get_kernel_info(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_execute_kernel_local_sizes(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_execute_kernel_local_sizes(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_set_kernel_arg_by_index(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_set_kernel_arg_by_index(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_set_kernel_arg_struct(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_set_kernel_arg_struct(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_set_kernel_arg_constant(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_set_kernel_arg_constant(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_set_kernel_arg_struct_array(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_set_kernel_arg_struct_array(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_kernel_global_constant(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_kernel_global_constant(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_thread_dimensions(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_thread_dimensions(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_work_items_sizes(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_work_items_sizes(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_work_group_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_work_group_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_read_image_args(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_read_image_args(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_write_image_args(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_write_image_args(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_mem_alloc_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_mem_alloc_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_image_2d_width(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_image_2d_width(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_image_2d_height(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_image_2d_height(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_image_3d_width(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_image_3d_width(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_image_3d_height(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_image_3d_height(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_image_3d_depth(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_image_3d_depth(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_image_array_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_image_array_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_image_buffer_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_image_buffer_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_parameter_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_parameter_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_samplers(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_samplers(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_constant_buffer_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_constant_buffer_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_constant_args(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_constant_args(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_compute_units(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_compute_units(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_address_bits(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_address_bits(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_single_fp_config(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_single_fp_config(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_double_fp_config(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_double_fp_config(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_local_mem_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_local_mem_size(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_kernel_preferred_work_group_size_multiple(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_kernel_preferred_work_group_size_multiple(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_execution_capabilities(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_execution_capabilities(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_queue_properties(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_queue_properties(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_device_version(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_device_version(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_max_language_version(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_min_max_language_version(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_native_kernel(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems ); extern int test_native_kernel(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems );
extern int test_create_context_from_type(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_create_context_from_type(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_get_platform_ids(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_get_platform_ids(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_kernel_arg_changes(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_kernel_arg_changes(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_kernel_arg_multi_setup_random(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_kernel_arg_multi_setup_random(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_retain_queue_single(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_retain_queue_single(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_retain_queue_multiple(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_retain_queue_multiple(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_retain_mem_object_single(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_retain_mem_object_single(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_retain_mem_object_multiple(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_retain_mem_object_multiple(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_min_data_type_align_size_alignment(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems ); extern int test_min_data_type_align_size_alignment(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems );
extern int test_mem_object_destructor_callback(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_mem_object_destructor_callback(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_null_buffer_arg( cl_device_id device_id, cl_context context, cl_command_queue queue, int num_elements ); extern int test_null_buffer_arg( cl_device_id device_id, cl_context context, cl_command_queue queue, int num_elements );
extern int test_get_buffer_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements ); extern int test_get_buffer_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements );
extern int test_get_image2d_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements ); extern int test_get_image2d_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements );
extern int test_get_image3d_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements ); extern int test_get_image3d_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements );
extern int test_get_image1d_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements ); extern int test_get_image1d_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements );
extern int test_get_image1d_array_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements ); extern int test_get_image1d_array_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements );
extern int test_get_image2d_array_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements ); extern int test_get_image2d_array_info( cl_device_id deviceID, cl_context context, cl_command_queue ignoreQueue, int num_elements );
extern int test_get_kernel_arg_info( cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements ); extern int test_get_kernel_arg_info( cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements );
extern int test_queue_properties( cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements );

@@ -1,36 +1,36 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _testBase_h #ifndef _testBase_h
#define _testBase_h #define _testBase_h
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
#include <math.h> #include <math.h>
#include <string.h> #include <string.h>
#if !defined(_WIN32) #if !defined(_WIN32)
#include <stdbool.h> #include <stdbool.h>
#endif #endif
#include <sys/types.h> #include <sys/types.h>
#include <sys/stat.h> #include <sys/stat.h>
#include "procs.h" #include "procs.h"
#endif // _testBase_h #endif // _testBase_h

File diff suppressed because it is too large

@@ -1,236 +1,236 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
static const char *sample_binary_kernel_source[] = { static const char *sample_binary_kernel_source[] = {
"__kernel void sample_test(__global float *src, __global int *dst)\n" "__kernel void sample_test(__global float *src, __global int *dst)\n"
"{\n" "{\n"
" int tid = get_global_id(0);\n" " int tid = get_global_id(0);\n"
"\n" "\n"
" dst[tid] = (int)src[tid] + 1;\n" " dst[tid] = (int)src[tid] + 1;\n"
"\n" "\n"
"}\n" }; "}\n" };
int test_binary_get(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) int test_binary_get(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{ {
int error; int error;
clProgramWrapper program; clProgramWrapper program;
size_t binarySize; size_t binarySize;
program = clCreateProgramWithSource( context, 1, sample_binary_kernel_source, NULL, &error ); program = clCreateProgramWithSource( context, 1, sample_binary_kernel_source, NULL, &error );
test_error( error, "Unable to create program from source" ); test_error( error, "Unable to create program from source" );
// Build so we have a binary to get // Build so we have a binary to get
error = clBuildProgram( program, 1, &deviceID, NULL, NULL, NULL ); error = clBuildProgram( program, 1, &deviceID, NULL, NULL, NULL );
test_error( error, "Unable to build test program" ); test_error( error, "Unable to build test program" );
// Get the size of the resulting binary (only one device) // Get the size of the resulting binary (only one device)
error = clGetProgramInfo( program, CL_PROGRAM_BINARY_SIZES, sizeof( binarySize ), &binarySize, NULL ); error = clGetProgramInfo( program, CL_PROGRAM_BINARY_SIZES, sizeof( binarySize ), &binarySize, NULL );
test_error( error, "Unable to get binary size" ); test_error( error, "Unable to get binary size" );
// Sanity check // Sanity check
if( binarySize == 0 ) if( binarySize == 0 )
{ {
log_error( "ERROR: Binary size of program is zero\n" ); log_error( "ERROR: Binary size of program is zero\n" );
return -1; return -1;
} }
// Create a buffer and get the actual binary // Create a buffer and get the actual binary
unsigned char *binary; unsigned char *binary;
binary = (unsigned char*)malloc(sizeof(unsigned char)*binarySize); binary = (unsigned char*)malloc(sizeof(unsigned char)*binarySize);
unsigned char *buffers[ 1 ] = { binary }; unsigned char *buffers[ 1 ] = { binary };
// Do another sanity check here first // Do another sanity check here first
size_t size; size_t size;
error = clGetProgramInfo( program, CL_PROGRAM_BINARIES, 0, NULL, &size ); error = clGetProgramInfo( program, CL_PROGRAM_BINARIES, 0, NULL, &size );
test_error( error, "Unable to get expected size of binaries array" ); test_error( error, "Unable to get expected size of binaries array" );
if( size != sizeof( buffers ) ) if( size != sizeof( buffers ) )
{ {
log_error( "ERROR: Expected size of binaries array in clGetProgramInfo is incorrect (should be %d, got %d)\n", (int)sizeof( buffers ), (int)size ); log_error( "ERROR: Expected size of binaries array in clGetProgramInfo is incorrect (should be %d, got %d)\n", (int)sizeof( buffers ), (int)size );
free(binary); free(binary);
return -1; return -1;
} }
error = clGetProgramInfo( program, CL_PROGRAM_BINARIES, sizeof( buffers ), &buffers, NULL ); error = clGetProgramInfo( program, CL_PROGRAM_BINARIES, sizeof( buffers ), &buffers, NULL );
test_error( error, "Unable to get program binary" ); test_error( error, "Unable to get program binary" );
// There is no way to verify that the binary is correct, so a successful query counts as a pass // There is no way to verify that the binary is correct, so a successful query counts as a pass
free(binary); free(binary);
return 0; return 0;
} }
int test_program_binary_create(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) int test_program_binary_create(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{ {
/* To test this in a self-contained fashion, we have to create a program with /* To test this in a self-contained fashion, we have to create a program with
source, then get the binary, then use that binary to reload the program, and then verify */ source, then get the binary, then use that binary to reload the program, and then verify */
int error; int error;
clProgramWrapper program, program_from_binary; clProgramWrapper program, program_from_binary;
size_t binarySize; size_t binarySize;
program = clCreateProgramWithSource( context, 1, sample_binary_kernel_source, NULL, &error ); program = clCreateProgramWithSource( context, 1, sample_binary_kernel_source, NULL, &error );
test_error( error, "Unable to create program from source" ); test_error( error, "Unable to create program from source" );
// Build so we have a binary to get // Build so we have a binary to get
error = clBuildProgram( program, 1, &deviceID, NULL, NULL, NULL ); error = clBuildProgram( program, 1, &deviceID, NULL, NULL, NULL );
test_error( error, "Unable to build test program" ); test_error( error, "Unable to build test program" );
// Get the size of the resulting binary (only one device) // Get the size of the resulting binary (only one device)
error = clGetProgramInfo( program, CL_PROGRAM_BINARY_SIZES, sizeof( binarySize ), &binarySize, NULL ); error = clGetProgramInfo( program, CL_PROGRAM_BINARY_SIZES, sizeof( binarySize ), &binarySize, NULL );
test_error( error, "Unable to get binary size" ); test_error( error, "Unable to get binary size" );
// Sanity check // Sanity check
if( binarySize == 0 ) if( binarySize == 0 )
{ {
log_error( "ERROR: Binary size of program is zero\n" ); log_error( "ERROR: Binary size of program is zero\n" );
return -1; return -1;
} }
// Create a buffer and get the actual binary // Create a buffer and get the actual binary
unsigned char *binary; unsigned char *binary;
binary = (unsigned char*)malloc(sizeof(unsigned char)*binarySize); binary = (unsigned char*)malloc(sizeof(unsigned char)*binarySize);
const unsigned char *buffers[ 1 ] = { binary }; const unsigned char *buffers[ 1 ] = { binary };
error = clGetProgramInfo( program, CL_PROGRAM_BINARIES, sizeof( buffers ), &buffers, NULL ); error = clGetProgramInfo( program, CL_PROGRAM_BINARIES, sizeof( buffers ), &buffers, NULL );
test_error( error, "Unable to get program binary" ); test_error( error, "Unable to get program binary" );
cl_int loadErrors[ 1 ]; cl_int loadErrors[ 1 ];
program_from_binary = clCreateProgramWithBinary( context, 1, &deviceID, &binarySize, buffers, loadErrors, &error ); program_from_binary = clCreateProgramWithBinary( context, 1, &deviceID, &binarySize, buffers, loadErrors, &error );
test_error( error, "Unable to load valid program binary" ); test_error( error, "Unable to load valid program binary" );
test_error( loadErrors[ 0 ], "Unable to load valid device binary into program" ); test_error( loadErrors[ 0 ], "Unable to load valid device binary into program" );
error = clBuildProgram( program_from_binary, 1, &deviceID, NULL, NULL, NULL ); error = clBuildProgram( program_from_binary, 1, &deviceID, NULL, NULL, NULL );
test_error( error, "Unable to build binary program" ); test_error( error, "Unable to build binary program" );
// Now get the binary one more time and verify it loaded the right binary // Now get the binary one more time and verify it loaded the right binary
unsigned char *binary2; unsigned char *binary2;
binary2 = (unsigned char*)malloc(sizeof(unsigned char)*binarySize); binary2 = (unsigned char*)malloc(sizeof(unsigned char)*binarySize);
buffers[ 0 ] = binary2; buffers[ 0 ] = binary2;
error = clGetProgramInfo( program_from_binary, CL_PROGRAM_BINARIES, sizeof( buffers ), &buffers, NULL ); error = clGetProgramInfo( program_from_binary, CL_PROGRAM_BINARIES, sizeof( buffers ), &buffers, NULL );
test_error( error, "Unable to get program binary second time" ); test_error( error, "Unable to get program binary second time" );
if( memcmp( binary, binary2, binarySize ) != 0 ) if( memcmp( binary, binary2, binarySize ) != 0 )
{ {
log_error( "ERROR: Program binary is different when loaded from binary!\n" ); log_error( "ERROR: Program binary is different when loaded from binary!\n" );
free(binary2); free(binary2);
free(binary); free(binary);
return -1; return -1;
} }
// Try again, this time without passing the status ptr in, to make sure we still // Try again, this time without passing the status ptr in, to make sure we still
// get a valid binary // get a valid binary
clProgramWrapper programWithoutStatus = clCreateProgramWithBinary( context, 1, &deviceID, &binarySize, buffers, NULL, &error ); clProgramWrapper programWithoutStatus = clCreateProgramWithBinary( context, 1, &deviceID, &binarySize, buffers, NULL, &error );
test_error( error, "Unable to load valid program binary when binary_status pointer is NULL" ); test_error( error, "Unable to load valid program binary when binary_status pointer is NULL" );
error = clBuildProgram( programWithoutStatus, 1, &deviceID, NULL, NULL, NULL ); error = clBuildProgram( programWithoutStatus, 1, &deviceID, NULL, NULL, NULL );
test_error( error, "Unable to build binary program" ); test_error( error, "Unable to build binary program" );
// Now get the binary one more time and verify it loaded the right binary // Now get the binary one more time and verify it loaded the right binary
unsigned char *binary3; unsigned char *binary3;
binary3 = (unsigned char*)malloc(sizeof(unsigned char)*binarySize); binary3 = (unsigned char*)malloc(sizeof(unsigned char)*binarySize);
buffers[ 0 ] = binary3; buffers[ 0 ] = binary3;
error = clGetProgramInfo( programWithoutStatus, CL_PROGRAM_BINARIES, sizeof( buffers ), &buffers, NULL ); error = clGetProgramInfo( programWithoutStatus, CL_PROGRAM_BINARIES, sizeof( buffers ), &buffers, NULL );
test_error( error, "Unable to get program binary third time" ); test_error( error, "Unable to get program binary third time" );
if( memcmp( binary, binary3, binarySize ) != 0 ) if( memcmp( binary, binary3, binarySize ) != 0 )
{ {
log_error( "ERROR: Program binary is different when status pointer is NULL!\n" ); log_error( "ERROR: Program binary is different when status pointer is NULL!\n" );
free(binary3); free(binary3);
free(binary2); free(binary2);
free(binary); free(binary);
return -1; return -1;
} }
free(binary3); free(binary3);
// Now execute them both to see that they both do the same thing. // Now execute them both to see that they both do the same thing.
clMemWrapper in, out, out_binary; clMemWrapper in, out, out_binary;
clKernelWrapper kernel, kernel_binary; clKernelWrapper kernel, kernel_binary;
cl_int *out_data, *out_data_binary; cl_int *out_data, *out_data_binary;
cl_float *in_data; cl_float *in_data;
size_t size_to_run = 1000; size_t size_to_run = 1000;
// Allocate some data // Allocate some data
in_data = (cl_float*)malloc(sizeof(cl_float)*size_to_run); in_data = (cl_float*)malloc(sizeof(cl_float)*size_to_run);
out_data = (cl_int*)malloc(sizeof(cl_int)*size_to_run); out_data = (cl_int*)malloc(sizeof(cl_int)*size_to_run);
out_data_binary = (cl_int*)malloc(sizeof(cl_int)*size_to_run); out_data_binary = (cl_int*)malloc(sizeof(cl_int)*size_to_run);
memset(out_data, 0, sizeof(cl_int)*size_to_run); memset(out_data, 0, sizeof(cl_int)*size_to_run);
memset(out_data_binary, 0, sizeof(cl_int)*size_to_run); memset(out_data_binary, 0, sizeof(cl_int)*size_to_run);
for (size_t i=0; i<size_to_run; i++) for (size_t i=0; i<size_to_run; i++)
in_data[i] = (cl_float)i; in_data[i] = (cl_float)i;
// Create the buffers // Create the buffers
in = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR, sizeof(cl_float)*size_to_run, in_data, &error); in = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR, sizeof(cl_float)*size_to_run, in_data, &error);
test_error( error, "clCreateBuffer failed"); test_error( error, "clCreateBuffer failed");
out = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR, sizeof(cl_int)*size_to_run, out_data, &error); out = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR, sizeof(cl_int)*size_to_run, out_data, &error);
test_error( error, "clCreateBuffer failed"); test_error( error, "clCreateBuffer failed");
out_binary = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR, sizeof(cl_int)*size_to_run, out_data_binary, &error); out_binary = clCreateBuffer(context, CL_MEM_COPY_HOST_PTR, sizeof(cl_int)*size_to_run, out_data_binary, &error);
test_error( error, "clCreateBuffer failed"); test_error( error, "clCreateBuffer failed");
// Create the kernels // Create the kernels
kernel = clCreateKernel(program, "sample_test", &error); kernel = clCreateKernel(program, "sample_test", &error);
test_error( error, "clCreateKernel failed"); test_error( error, "clCreateKernel failed");
kernel_binary = clCreateKernel(program_from_binary, "sample_test", &error); kernel_binary = clCreateKernel(program_from_binary, "sample_test", &error);
test_error( error, "clCreateKernel from binary failed"); test_error( error, "clCreateKernel from binary failed");
// Set the arguments // Set the arguments
error = clSetKernelArg(kernel, 0, sizeof(in), &in); error = clSetKernelArg(kernel, 0, sizeof(in), &in);
test_error( error, "clSetKernelArg failed"); test_error( error, "clSetKernelArg failed");
error = clSetKernelArg(kernel, 1, sizeof(out), &out); error = clSetKernelArg(kernel, 1, sizeof(out), &out);
test_error( error, "clSetKernelArg failed"); test_error( error, "clSetKernelArg failed");
error = clSetKernelArg(kernel_binary, 0, sizeof(in), &in); error = clSetKernelArg(kernel_binary, 0, sizeof(in), &in);
test_error( error, "clSetKernelArg failed"); test_error( error, "clSetKernelArg failed");
error = clSetKernelArg(kernel_binary, 1, sizeof(out_binary), &out_binary); error = clSetKernelArg(kernel_binary, 1, sizeof(out_binary), &out_binary);
test_error( error, "clSetKernelArg failed"); test_error( error, "clSetKernelArg failed");
// Execute the kernels // Execute the kernels
error = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &size_to_run, NULL, 0, NULL, NULL); error = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &size_to_run, NULL, 0, NULL, NULL);
test_error( error, "clEnqueueNDRangeKernel failed"); test_error( error, "clEnqueueNDRangeKernel failed");
error = clEnqueueNDRangeKernel(queue, kernel_binary, 1, NULL, &size_to_run, NULL, 0, NULL, NULL); error = clEnqueueNDRangeKernel(queue, kernel_binary, 1, NULL, &size_to_run, NULL, 0, NULL, NULL);
test_error( error, "clEnqueueNDRangeKernel for binary kernel failed"); test_error( error, "clEnqueueNDRangeKernel for binary kernel failed");
// Finish up // Finish up
error = clFinish(queue); error = clFinish(queue);
test_error( error, "clFinish failed"); test_error( error, "clFinish failed");
// Get the results back // Get the results back
error = clEnqueueReadBuffer(queue, out, CL_TRUE, 0, sizeof(cl_int)*size_to_run, out_data, 0, NULL, NULL); error = clEnqueueReadBuffer(queue, out, CL_TRUE, 0, sizeof(cl_int)*size_to_run, out_data, 0, NULL, NULL);
test_error( error, "clEnqueueReadBuffer failed"); test_error( error, "clEnqueueReadBuffer failed");
error = clEnqueueReadBuffer(queue, out_binary, CL_TRUE, 0, sizeof(cl_int)*size_to_run, out_data_binary, 0, NULL, NULL); error = clEnqueueReadBuffer(queue, out_binary, CL_TRUE, 0, sizeof(cl_int)*size_to_run, out_data_binary, 0, NULL, NULL);
test_error( error, "clEnqueueReadBuffer failed"); test_error( error, "clEnqueueReadBuffer failed");
// Compare the results // Compare the results
if( memcmp( out_data, out_data_binary, sizeof(cl_int)*size_to_run ) != 0 ) if( memcmp( out_data, out_data_binary, sizeof(cl_int)*size_to_run ) != 0 )
{ {
log_error( "ERROR: Results from executing binary and regular kernel differ.\n" ); log_error( "ERROR: Results from executing binary and regular kernel differ.\n" );
free(binary2); free(binary2);
free(binary); free(binary);
return -1; return -1;
} }
// All done! // All done!
free(in_data); free(in_data);
free(out_data); free(out_data);
free(out_data_binary); free(out_data_binary);
free(binary2); free(binary2);
free(binary); free(binary);
return 0; return 0;
} }
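
Review note: in the comparison step of the function above, the mismatch branch frees `binary2` and `binary` but returns without freeing `in_data`, `out_data`, or `out_data_binary`. One common way to avoid this kind of leak is a single cleanup label that every exit path passes through. The sketch below only illustrates that pattern and is not the test's actual code: `compare_with_single_exit` and its `force_mismatch` parameter are hypothetical, and plain `calloc` buffers stand in for the results read back from the device.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the comparison step: allocates two result
   buffers, optionally makes them differ, compares them, and releases
   both buffers on every exit path. Returns 0 on match, -1 otherwise. */
int compare_with_single_exit(size_t count, int force_mismatch)
{
    int result = -1;
    int *out_data = NULL;
    int *out_data_binary = NULL;

    assert(count > 0);

    out_data = (int *)calloc(count, sizeof(int));
    out_data_binary = (int *)calloc(count, sizeof(int));
    if (out_data == NULL || out_data_binary == NULL)
        goto cleanup;

    /* Simulate the two kernels disagreeing on the first element. */
    if (force_mismatch)
        out_data_binary[0] = 1;

    if (memcmp(out_data, out_data_binary, sizeof(int) * count) != 0)
        goto cleanup;

    result = 0;

cleanup:
    /* free(NULL) is a no-op, so this is safe after a failed allocation. */
    free(out_data);
    free(out_data_binary);
    return result;
}
```

Calling `compare_with_single_exit(16, 0)` takes the success path and `compare_with_single_exit(16, 1)` takes the mismatch path; both release the same set of buffers.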


@@ -1,52 +1,52 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#include "../../test_common/harness/testHarness.h" #include "../../test_common/harness/testHarness.h"
const char *kernel_with_bool[] = { const char *kernel_with_bool[] = {
"__kernel void kernel_with_bool(__global float *src, __global int *dst)\n" "__kernel void kernel_with_bool(__global float *src, __global int *dst)\n"
"{\n" "{\n"
" int tid = get_global_id(0);\n" " int tid = get_global_id(0);\n"
"\n" "\n"
" bool myBool = (src[tid] < 0.5f) && (src[tid] > -0.5f);\n" " bool myBool = (src[tid] < 0.5f) && (src[tid] > -0.5f);\n"
" if(myBool)\n" " if(myBool)\n"
" {\n" " {\n"
" dst[tid] = (int)src[tid];\n" " dst[tid] = (int)src[tid];\n"
" }\n" " }\n"
" else\n" " else\n"
" {\n" " {\n"
" dst[tid] = 0;\n" " dst[tid] = 0;\n"
" }\n" " }\n"
"\n" "\n"
"}\n" "}\n"
}; };
int test_for_bool_type(cl_device_id deviceID, cl_context context, int test_for_bool_type(cl_device_id deviceID, cl_context context,
cl_command_queue queue, int num_elements) cl_command_queue queue, int num_elements)
{ {
cl_program program; cl_program program;
cl_kernel kernel; cl_kernel kernel;
int err = create_single_kernel_helper(context, int err = create_single_kernel_helper(context,
&program, &program,
&kernel, &kernel,
1, kernel_with_bool, 1, kernel_with_bool,
"kernel_with_bool" ); "kernel_with_bool" );
return err; return err;
} }
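
Review note: `test_for_bool_type` only builds `kernel_with_bool`; it never enqueues it, so the test checks that a `bool` local compiles rather than what the kernel computes. For readers curious about the kernel body, the sketch below mirrors it on the host in plain C. `reference_kernel_with_bool` and `reference_self_check` are hypothetical helpers, not part of the suite. Because the `true` branch is only taken when the input lies strictly between -0.5 and 0.5, the truncating cast always yields 0 there, so every input maps to 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Host-side mirror of the kernel_with_bool body for one element
   (hypothetical helper, not part of the test suite). */
int reference_kernel_with_bool(float src)
{
    bool my_bool = (src < 0.5f) && (src > -0.5f);
    if (my_bool)
        return (int)src; /* |src| < 0.5, so truncation toward zero gives 0 */
    return 0;
}

/* Small usage example: inside and outside the (-0.5, 0.5) window. */
void reference_self_check(void)
{
    assert(reference_kernel_with_bool(0.25f) == 0);
    assert(reference_kernel_with_bool(7.0f) == 0);
}
```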


@@ -1,135 +1,135 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#ifndef _WIN32 #ifndef _WIN32
#include <unistd.h> #include <unistd.h>
#endif #endif
#include "../../test_common/harness/conversions.h" #include "../../test_common/harness/conversions.h"
extern cl_uint gRandomSeed; extern cl_uint gRandomSeed;
void CL_CALLBACK notify_callback(const char *errinfo, const void *private_info, size_t cb, void *user_data) void CL_CALLBACK notify_callback(const char *errinfo, const void *private_info, size_t cb, void *user_data)
{ {
log_info( "%s\n", errinfo ); log_info( "%s\n", errinfo );
} }
int test_create_context_from_type(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) int test_create_context_from_type(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{ {
int error; int error;
clProgramWrapper program; clProgramWrapper program;
clKernelWrapper kernel; clKernelWrapper kernel;
clMemWrapper streams[2]; clMemWrapper streams[2];
clContextWrapper context_to_test; clContextWrapper context_to_test;
clCommandQueueWrapper queue_to_test; clCommandQueueWrapper queue_to_test;
size_t threads[1], localThreads[1]; size_t threads[1], localThreads[1];
cl_float inputData[10]; cl_float inputData[10];
cl_int outputData[10]; cl_int outputData[10];
int i; int i;
RandomSeed seed( gRandomSeed ); RandomSeed seed( gRandomSeed );
const char *sample_single_test_kernel[] = { const char *sample_single_test_kernel[] = {
"__kernel void sample_test(__global float *src, __global int *dst)\n" "__kernel void sample_test(__global float *src, __global int *dst)\n"
"{\n" "{\n"
" int tid = get_global_id(0);\n" " int tid = get_global_id(0);\n"
"\n" "\n"
" dst[tid] = (int)src[tid];\n" " dst[tid] = (int)src[tid];\n"
"\n" "\n"
"}\n" }; "}\n" };
cl_device_type type; cl_device_type type;
error = clGetDeviceInfo(deviceID, CL_DEVICE_TYPE, sizeof(type), &type, NULL); error = clGetDeviceInfo(deviceID, CL_DEVICE_TYPE, sizeof(type), &type, NULL);
test_error(error, "clGetDeviceInfo for CL_DEVICE_TYPE failed\n"); test_error(error, "clGetDeviceInfo for CL_DEVICE_TYPE failed\n");
cl_platform_id platform; cl_platform_id platform;
error = clGetDeviceInfo(deviceID, CL_DEVICE_PLATFORM, sizeof(platform), &platform, NULL); error = clGetDeviceInfo(deviceID, CL_DEVICE_PLATFORM, sizeof(platform), &platform, NULL);
test_error(error, "clGetDeviceInfo for CL_DEVICE_PLATFORM failed\n"); test_error(error, "clGetDeviceInfo for CL_DEVICE_PLATFORM failed\n");
cl_context_properties properties[3] = { cl_context_properties properties[3] = {
(cl_context_properties)CL_CONTEXT_PLATFORM, (cl_context_properties)CL_CONTEXT_PLATFORM,
(cl_context_properties)platform, (cl_context_properties)platform,
NULL NULL
}; };
context_to_test = clCreateContextFromType(properties, type, notify_callback, NULL, &error); context_to_test = clCreateContextFromType(properties, type, notify_callback, NULL, &error);
test_error(error, "clCreateContextFromType failed"); test_error(error, "clCreateContextFromType failed");
if (context_to_test == NULL) { if (context_to_test == NULL) {
log_error("clCreateContextFromType returned NULL, but error was CL_SUCCESS."); log_error("clCreateContextFromType returned NULL, but error was CL_SUCCESS.");
return -1; return -1;
} }
queue_to_test = clCreateCommandQueue(context_to_test, deviceID, NULL, &error); queue_to_test = clCreateCommandQueue(context_to_test, deviceID, NULL, &error);
test_error(error, "clCreateCommandQueue failed"); test_error(error, "clCreateCommandQueue failed");
if (queue_to_test == NULL) { if (queue_to_test == NULL) {
log_error("clCreateCommandQueue returned NULL, but error was CL_SUCCESS."); log_error("clCreateCommandQueue returned NULL, but error was CL_SUCCESS.");
return -1; return -1;
} }
/* Create a kernel to test with */ /* Create a kernel to test with */
if( create_single_kernel_helper( context_to_test, &program, &kernel, 1, sample_single_test_kernel, "sample_test" ) != 0 ) if( create_single_kernel_helper( context_to_test, &program, &kernel, 1, sample_single_test_kernel, "sample_test" ) != 0 )
{ {
return -1; return -1;
} }
/* Create some I/O streams */ /* Create some I/O streams */
streams[0] = clCreateBuffer(context_to_test, (cl_mem_flags)(CL_MEM_READ_WRITE), sizeof(cl_float) * 10, NULL, &error); streams[0] = clCreateBuffer(context_to_test, (cl_mem_flags)(CL_MEM_READ_WRITE), sizeof(cl_float) * 10, NULL, &error);
test_error( error, "Creating test array failed" ); test_error( error, "Creating test array failed" );
streams[1] = clCreateBuffer(context_to_test, (cl_mem_flags)(CL_MEM_READ_WRITE), sizeof(cl_int) * 10, NULL, &error); streams[1] = clCreateBuffer(context_to_test, (cl_mem_flags)(CL_MEM_READ_WRITE), sizeof(cl_int) * 10, NULL, &error);
test_error( error, "Creating test array failed" ); test_error( error, "Creating test array failed" );
/* Write some test data */ /* Write some test data */
memset( outputData, 0, sizeof( outputData ) ); memset( outputData, 0, sizeof( outputData ) );
for (i=0; i<10; i++) for (i=0; i<10; i++)
inputData[i] = get_random_float(-(float) 0x7fffffff, (float) 0x7fffffff, seed); inputData[i] = get_random_float(-(float) 0x7fffffff, (float) 0x7fffffff, seed);
error = clEnqueueWriteBuffer(queue_to_test, streams[0], CL_TRUE, 0, sizeof(cl_float)*10, (void *)inputData, 0, NULL, NULL); error = clEnqueueWriteBuffer(queue_to_test, streams[0], CL_TRUE, 0, sizeof(cl_float)*10, (void *)inputData, 0, NULL, NULL);
test_error( error, "Unable to set testing kernel data" ); test_error( error, "Unable to set testing kernel data" );
/* Test setting the arguments by index manually */ /* Test setting the arguments by index manually */
error = clSetKernelArg(kernel, 1, sizeof( streams[1] ), &streams[1]); error = clSetKernelArg(kernel, 1, sizeof( streams[1] ), &streams[1]);
test_error( error, "Unable to set indexed kernel arguments" ); test_error( error, "Unable to set indexed kernel arguments" );
error = clSetKernelArg(kernel, 0, sizeof( streams[0] ), &streams[0]); error = clSetKernelArg(kernel, 0, sizeof( streams[0] ), &streams[0]);
test_error( error, "Unable to set indexed kernel arguments" ); test_error( error, "Unable to set indexed kernel arguments" );
/* Test running the kernel and verifying it */ /* Test running the kernel and verifying it */
threads[0] = (size_t)10; threads[0] = (size_t)10;
error = get_max_common_work_group_size( context_to_test, kernel, threads[0], &localThreads[0] ); error = get_max_common_work_group_size( context_to_test, kernel, threads[0], &localThreads[0] );
test_error( error, "Unable to get work group size to use" ); test_error( error, "Unable to get work group size to use" );
error = clEnqueueNDRangeKernel( queue_to_test, kernel, 1, NULL, threads, localThreads, 0, NULL, NULL ); error = clEnqueueNDRangeKernel( queue_to_test, kernel, 1, NULL, threads, localThreads, 0, NULL, NULL );
test_error( error, "Kernel execution failed" ); test_error( error, "Kernel execution failed" );
error = clEnqueueReadBuffer( queue_to_test, streams[1], CL_TRUE, 0, sizeof(cl_int)*10, (void *)outputData, 0, NULL, NULL ); error = clEnqueueReadBuffer( queue_to_test, streams[1], CL_TRUE, 0, sizeof(cl_int)*10, (void *)outputData, 0, NULL, NULL );
test_error( error, "Unable to get result data" ); test_error( error, "Unable to get result data" );
for (i=0; i<10; i++) for (i=0; i<10; i++)
{ {
if (outputData[i] != (int)inputData[i]) if (outputData[i] != (int)inputData[i])
{ {
log_error( "ERROR: Data did not verify on first pass!\n" ); log_error( "ERROR: Data did not verify on first pass!\n" );
return -1; return -1;
} }
} }
return 0; return 0;
} }
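
Review note: the verification loop above compares against `(int)inputData[i]`, and the inputs are drawn from the range `-(float)0x7fffffff` to `(float)0x7fffffff`. Under the default round-to-nearest mode `(float)0x7fffffff` is 2147483648.0f, which is one past the largest 32-bit `int`, and converting an out-of-range float to `int` is undefined in C. Whether `get_random_float` can actually return the bound itself is not visible in this diff. The sketch below shows one way a host-side check could guard the cast; `float_fits_in_int32` is a hypothetical helper and assumes a 32-bit `int`.

```c
#include <assert.h>

/* Returns 1 when converting f to a 32-bit int is well defined in C,
   0 otherwise (hypothetical helper; assumes int is 32 bits wide). */
int float_fits_in_int32(float f)
{
    assert(sizeof(int) == 4);
    /* -2^31 and 2^31 are both exactly representable as float.
       NaN fails both comparisons and is therefore rejected. */
    return f >= -2147483648.0f && f < 2147483648.0f;
}
```

A verification loop could skip or special-case elements for which this returns 0 instead of relying on the result of the cast.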

File diff suppressed because it is too large


@@ -1,60 +1,60 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#include "../../test_common/harness/testHarness.h" #include "../../test_common/harness/testHarness.h"
#ifndef _WIN32 #ifndef _WIN32
#include <unistd.h> #include <unistd.h>
#endif #endif
int IsAPowerOfTwo( unsigned long x ) int IsAPowerOfTwo( unsigned long x )
{ {
return 0 == (x & (x-1)); return 0 == (x & (x-1));
} }
int test_min_data_type_align_size_alignment(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems ) int test_min_data_type_align_size_alignment(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems )
{ {
cl_uint min_alignment; cl_uint min_alignment;
if (gHasLong) if (gHasLong)
min_alignment = sizeof(cl_long)*16; min_alignment = sizeof(cl_long)*16;
else else
min_alignment = sizeof(cl_int)*16; min_alignment = sizeof(cl_int)*16;
int error = 0; int error = 0;
cl_uint alignment; cl_uint alignment;
error = clGetDeviceInfo(device, CL_DEVICE_MEM_BASE_ADDR_ALIGN, sizeof(alignment), &alignment, NULL); error = clGetDeviceInfo(device, CL_DEVICE_MEM_BASE_ADDR_ALIGN, sizeof(alignment), &alignment, NULL);
test_error(error, "clGetDeviceInfo for CL_DEVICE_MEM_BASE_ADDR_ALIGN failed"); test_error(error, "clGetDeviceInfo for CL_DEVICE_MEM_BASE_ADDR_ALIGN failed");
log_info("Device reported CL_DEVICE_MEM_BASE_ADDR_ALIGN = %lu bits.\n", (unsigned long)alignment); log_info("Device reported CL_DEVICE_MEM_BASE_ADDR_ALIGN = %lu bits.\n", (unsigned long)alignment);
// Verify the size is large enough // Verify the size is large enough
if (alignment < min_alignment*8) { if (alignment < min_alignment*8) {
log_error("ERROR: alignment too small. Minimum alignment for %s16 is %lu bits, device reported %lu bits.", log_error("ERROR: alignment too small. Minimum alignment for %s16 is %lu bits, device reported %lu bits.",
(gHasLong) ? "long" : "int", (gHasLong) ? "long" : "int",
(unsigned long)(min_alignment*8), (unsigned long)alignment); (unsigned long)(min_alignment*8), (unsigned long)alignment);
return -1; return -1;
} }
// Verify the size is a power of two // Verify the size is a power of two
if (!IsAPowerOfTwo((unsigned long)alignment)) { if (!IsAPowerOfTwo((unsigned long)alignment)) {
log_error("ERROR: alignment is not a power of two.\n"); log_error("ERROR: alignment is not a power of two.\n");
return -1; return -1;
} }
return 0; return 0;
} }
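
Review note: `IsAPowerOfTwo` above returns 1 for an input of 0, because `0 & (0 - 1)` is 0. In this test the earlier minimum-size check rejects a zero alignment before the power-of-two check runs, so the behaviour is not reachable here, but the helper would be misleading if reused. The sketch below shows a zero-safe variant together with the minimum-alignment arithmetic the test performs (16 elements of the widest supported integer type, expressed in bits). Both function names are hypothetical, and `int64_t`/`int32_t` stand in for `cl_long`/`cl_int`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero-safe power-of-two test (hypothetical replacement helper). */
int is_nonzero_power_of_two(unsigned long x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Minimum CL_DEVICE_MEM_BASE_ADDR_ALIGN value in bits, following the
   arithmetic in the test: size of a 16-element vector of the widest
   supported integer type, times 8 bits per byte. */
unsigned long min_base_addr_align_bits(int has_long)
{
    size_t element_bytes = has_long ? sizeof(int64_t) : sizeof(int32_t);
    return (unsigned long)(element_bytes * 16 * 8);
}
```

With 64-bit integer support the minimum works out to 8 * 16 * 8 = 1024 bits; without it, 4 * 16 * 8 = 512 bits.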


@@ -1,141 +1,141 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
extern "C" { extern cl_uint gRandomSeed;} extern "C" { extern cl_uint gRandomSeed;}
// This test is designed to stress changing kernel arguments between execute calls (that are asynchronous and thus // This test is designed to stress changing kernel arguments between execute calls (that are asynchronous and thus
// potentially overlapping) to make sure each kernel gets the right arguments // potentially overlapping) to make sure each kernel gets the right arguments
// Note: put a delay loop in the kernel to make sure we have time to queue the next kernel before this one finishes // Note: put a delay loop in the kernel to make sure we have time to queue the next kernel before this one finishes
const char *inspect_image_kernel_source[] = { const char *inspect_image_kernel_source[] = {
"__kernel void sample_test(read_only image2d_t src, __global int *outDimensions )\n" "__kernel void sample_test(read_only image2d_t src, __global int *outDimensions )\n"
"{\n" "{\n"
" int tid = get_global_id(0), i;\n" " int tid = get_global_id(0), i;\n"
" for( i = 0; i < 100000; i++ ); \n" " for( i = 0; i < 100000; i++ ); \n"
" outDimensions[tid * 2] = get_image_width(src) * tid;\n" " outDimensions[tid * 2] = get_image_width(src) * tid;\n"
" outDimensions[tid * 2 + 1] = get_image_height(src) * tid;\n" " outDimensions[tid * 2 + 1] = get_image_height(src) * tid;\n"
"\n" "\n"
"}\n" }; "}\n" };
#define NUM_TRIES 100 #define NUM_TRIES 100
#define NUM_THREADS 2048 #define NUM_THREADS 2048
int test_kernel_arg_changes(cl_device_id device, cl_context context, cl_command_queue queue, int num_elements) int test_kernel_arg_changes(cl_device_id device, cl_context context, cl_command_queue queue, int num_elements)
{ {
clProgramWrapper program; clProgramWrapper program;
clKernelWrapper kernel; clKernelWrapper kernel;
int error, i; int error, i;
clMemWrapper images[ NUM_TRIES ]; clMemWrapper images[ NUM_TRIES ];
size_t sizes[ NUM_TRIES ][ 2 ]; size_t sizes[ NUM_TRIES ][ 2 ];
clMemWrapper results[ NUM_TRIES ]; clMemWrapper results[ NUM_TRIES ];
cl_image_format imageFormat; cl_image_format imageFormat;
size_t maxWidth, maxHeight; size_t maxWidth, maxHeight;
size_t threads[1], localThreads[1]; size_t threads[1], localThreads[1];
cl_int resultArray[ NUM_THREADS * 2 ]; cl_int resultArray[ NUM_THREADS * 2 ];
char errStr[ 128 ]; char errStr[ 128 ];
RandomSeed seed( gRandomSeed ); RandomSeed seed( gRandomSeed );
PASSIVE_REQUIRE_IMAGE_SUPPORT( device ) PASSIVE_REQUIRE_IMAGE_SUPPORT( device )
// Just get any ol format to test with // Just get any ol format to test with
error = get_8_bit_image_format( context, CL_MEM_OBJECT_IMAGE2D, CL_MEM_READ_WRITE, 0, &imageFormat ); error = get_8_bit_image_format( context, CL_MEM_OBJECT_IMAGE2D, CL_MEM_READ_WRITE, 0, &imageFormat );
test_error( error, "Unable to obtain suitable image format to test with!" ); test_error( error, "Unable to obtain suitable image format to test with!" );
// Create our testing kernel // Create our testing kernel
error = create_single_kernel_helper( context, &program, &kernel, 1, inspect_image_kernel_source, "sample_test" ); error = create_single_kernel_helper( context, &program, &kernel, 1, inspect_image_kernel_source, "sample_test" );
test_error( error, "Unable to create testing kernel" ); test_error( error, "Unable to create testing kernel" );
// Get max dimensions for each of our images // Get max dimensions for each of our images
error = clGetDeviceInfo( device, CL_DEVICE_IMAGE2D_MAX_WIDTH, sizeof( maxWidth ), &maxWidth, NULL ); error = clGetDeviceInfo( device, CL_DEVICE_IMAGE2D_MAX_WIDTH, sizeof( maxWidth ), &maxWidth, NULL );
error |= clGetDeviceInfo( device, CL_DEVICE_IMAGE2D_MAX_HEIGHT, sizeof( maxHeight ), &maxHeight, NULL ); error |= clGetDeviceInfo( device, CL_DEVICE_IMAGE2D_MAX_HEIGHT, sizeof( maxHeight ), &maxHeight, NULL );
test_error( error, "Unable to get max image dimensions for device" ); test_error( error, "Unable to get max image dimensions for device" );
// Get the number of threads we'll be able to run // Get the number of threads we'll be able to run
threads[0] = NUM_THREADS; threads[0] = NUM_THREADS;
error = get_max_common_work_group_size( context, kernel, threads[0], &localThreads[0] ); error = get_max_common_work_group_size( context, kernel, threads[0], &localThreads[0] );
test_error( error, "Unable to get work group size for kernel" ); test_error( error, "Unable to get work group size for kernel" );
// Create a variety of images and output arrays // Create a variety of images and output arrays
for( i = 0; i < NUM_TRIES; i++ ) for( i = 0; i < NUM_TRIES; i++ )
{ {
sizes[ i ][ 0 ] = genrand_int32(seed) % (maxWidth/32) + 1; sizes[ i ][ 0 ] = genrand_int32(seed) % (maxWidth/32) + 1;
sizes[ i ][ 1 ] = genrand_int32(seed) % (maxHeight/32) + 1; sizes[ i ][ 1 ] = genrand_int32(seed) % (maxHeight/32) + 1;
images[ i ] = create_image_2d( context, (cl_mem_flags)(CL_MEM_READ_ONLY), images[ i ] = create_image_2d( context, (cl_mem_flags)(CL_MEM_READ_ONLY),
&imageFormat, sizes[ i ][ 0], sizes[ i ][ 1 ], 0, NULL, &error ); &imageFormat, sizes[ i ][ 0], sizes[ i ][ 1 ], 0, NULL, &error );
if( images[i] == NULL ) if( images[i] == NULL )
{ {
log_error("Failed to create image %d of size %d x %d (%s).\n", i, (int)sizes[i][0], (int)sizes[i][1], IGetErrorString( error )); log_error("Failed to create image %d of size %d x %d (%s).\n", i, (int)sizes[i][0], (int)sizes[i][1], IGetErrorString( error ));
return -1; return -1;
} }
results[ i ] = clCreateBuffer( context, (cl_mem_flags)(CL_MEM_READ_WRITE), sizeof( cl_int ) * threads[0] * 2, NULL, &error ); results[ i ] = clCreateBuffer( context, (cl_mem_flags)(CL_MEM_READ_WRITE), sizeof( cl_int ) * threads[0] * 2, NULL, &error );
if( results[i] == NULL) if( results[i] == NULL)
{ {
log_error("Failed to create array %d of size %d.\n", i, (int)threads[0]*2); log_error("Failed to create array %d of size %d.\n", i, (int)threads[0]*2);
return -1; return -1;
} }
} }
// Start setting arguments and executing kernels // Start setting arguments and executing kernels
for( i = 0; i < NUM_TRIES; i++ ) for( i = 0; i < NUM_TRIES; i++ )
{ {
// Set the arguments for this try // Set the arguments for this try
error = clSetKernelArg( kernel, 0, sizeof( cl_mem ), &images[ i ] ); error = clSetKernelArg( kernel, 0, sizeof( cl_mem ), &images[ i ] );
sprintf( errStr, "Unable to set argument 0 for kernel try %d", i ); sprintf( errStr, "Unable to set argument 0 for kernel try %d", i );
test_error( error, errStr ); test_error( error, errStr );
error = clSetKernelArg( kernel, 1, sizeof( cl_mem ), &results[ i ] ); error = clSetKernelArg( kernel, 1, sizeof( cl_mem ), &results[ i ] );
sprintf( errStr, "Unable to set argument 1 for kernel try %d", i ); sprintf( errStr, "Unable to set argument 1 for kernel try %d", i );
test_error( error, errStr ); test_error( error, errStr );
// Queue up execution // Queue up execution
error = clEnqueueNDRangeKernel( queue, kernel, 1, NULL, threads, localThreads, 0, NULL, NULL ); error = clEnqueueNDRangeKernel( queue, kernel, 1, NULL, threads, localThreads, 0, NULL, NULL );
sprintf( errStr, "Unable to execute kernel try %d", i ); sprintf( errStr, "Unable to execute kernel try %d", i );
test_error( error, errStr ); test_error( error, errStr );
} }
// Read the results back out, one at a time, and verify // Read the results back out, one at a time, and verify
for( i = 0; i < NUM_TRIES; i++ ) for( i = 0; i < NUM_TRIES; i++ )
{ {
error = clEnqueueReadBuffer( queue, results[ i ], CL_TRUE, 0, sizeof( cl_int ) * threads[0] * 2, resultArray, 0, NULL, NULL ); error = clEnqueueReadBuffer( queue, results[ i ], CL_TRUE, 0, sizeof( cl_int ) * threads[0] * 2, resultArray, 0, NULL, NULL );
sprintf( errStr, "Unable to read results for kernel try %d", i ); sprintf( errStr, "Unable to read results for kernel try %d", i );
test_error( error, errStr ); test_error( error, errStr );
// Verify. Each entry should be n * the (width/height) of image i // Verify. Each entry should be n * the (width/height) of image i
for( int j = 0; j < NUM_THREADS; j++ ) for( int j = 0; j < NUM_THREADS; j++ )
{ {
if( resultArray[ j * 2 + 0 ] != (int)sizes[ i ][ 0 ] * j ) if( resultArray[ j * 2 + 0 ] != (int)sizes[ i ][ 0 ] * j )
{ {
log_error( "ERROR: Verification for kernel try %d, sample %d FAILED, expected a width of %d, got %d\n", log_error( "ERROR: Verification for kernel try %d, sample %d FAILED, expected a width of %d, got %d\n",
i, j, (int)sizes[ i ][ 0 ] * j, resultArray[ j * 2 + 0 ] ); i, j, (int)sizes[ i ][ 0 ] * j, resultArray[ j * 2 + 0 ] );
return -1; return -1;
} }
if( resultArray[ j * 2 + 1 ] != (int)sizes[ i ][ 1 ] * j ) if( resultArray[ j * 2 + 1 ] != (int)sizes[ i ][ 1 ] * j )
{ {
log_error( "ERROR: Verification for kernel try %d, sample %d FAILED, expected a height of %d, got %d\n", log_error( "ERROR: Verification for kernel try %d, sample %d FAILED, expected a height of %d, got %d\n",
i, j, (int)sizes[ i ][ 1 ] * j, resultArray[ j * 2 + 1 ] ); i, j, (int)sizes[ i ][ 1 ] * j, resultArray[ j * 2 + 1 ] );
return -1; return -1;
} }
} }
} }
// If we got here, everything verified successfully // If we got here, everything verified successfully
return 0; return 0;
} }
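
Review note: the verification loop in `test_kernel_arg_changes` expects work-item `j` of try `i` to have written `width * j` and `height * j` for image `i`, which is how it detects a kernel that ran with another try's arguments. The sketch below isolates that check in plain C so the expected layout is easy to see. It is an illustration only: `fill_expected_dimensions` and `first_dimension_mismatch` are hypothetical names, and the fill function stands in for the device kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Fills `out` the way the sample_test kernel would for an image of the
   given size: out[2*tid] = width * tid, out[2*tid + 1] = height * tid. */
void fill_expected_dimensions(int *out, size_t work_items, int width, int height)
{
    for (size_t tid = 0; tid < work_items; tid++) {
        out[tid * 2]     = width * (int)tid;
        out[tid * 2 + 1] = height * (int)tid;
    }
}

/* Returns the index of the first work-item whose (width, height) pair
   does not match the expected values, or -1 if everything verifies. */
long first_dimension_mismatch(const int *result, size_t work_items,
                              int width, int height)
{
    assert(result != NULL);
    for (size_t tid = 0; tid < work_items; tid++) {
        if (result[tid * 2] != width * (int)tid ||
            result[tid * 2 + 1] != height * (int)tid)
            return (long)tid;
    }
    return -1;
}
```

Returning the failing index rather than a bare error code lets the caller report which work-item disagreed, as the test's own log messages do.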

File diff suppressed because it is too large


@@ -1,277 +1,277 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#include "../../test_common/harness/conversions.h" #include "../../test_common/harness/conversions.h"
// This test is designed to stress passing multiple vector parameters to kernels and verifying access between them all // This test is designed to stress passing multiple vector parameters to kernels and verifying access between them all
const char *multi_arg_kernel_source_pattern = const char *multi_arg_kernel_source_pattern =
"__kernel void sample_test(__global %s *src1, __global %s *src2, __global %s *src3, __global %s *dst1, __global %s *dst2, __global %s *dst3 )\n" "__kernel void sample_test(__global %s *src1, __global %s *src2, __global %s *src3, __global %s *dst1, __global %s *dst2, __global %s *dst3 )\n"
"{\n" "{\n"
" int tid = get_global_id(0);\n" " int tid = get_global_id(0);\n"
" dst1[tid] = src1[tid];\n" " dst1[tid] = src1[tid];\n"
" dst2[tid] = src2[tid];\n" " dst2[tid] = src2[tid];\n"
" dst3[tid] = src3[tid];\n" " dst3[tid] = src3[tid];\n"
"}\n"; "}\n";
extern cl_uint gRandomSeed; extern cl_uint gRandomSeed;
#define MAX_ERROR_TOLERANCE 0.0005f #define MAX_ERROR_TOLERANCE 0.0005f
int test_multi_arg_set(cl_device_id device, cl_context context, cl_command_queue queue, int test_multi_arg_set(cl_device_id device, cl_context context, cl_command_queue queue,
ExplicitType vec1Type, int vec1Size, ExplicitType vec1Type, int vec1Size,
ExplicitType vec2Type, int vec2Size, ExplicitType vec2Type, int vec2Size,
ExplicitType vec3Type, int vec3Size, MTdata d) ExplicitType vec3Type, int vec3Size, MTdata d)
{ {
clProgramWrapper program; clProgramWrapper program;
clKernelWrapper kernel; clKernelWrapper kernel;
int error, i, j; int error, i, j;
clMemWrapper streams[ 6 ]; clMemWrapper streams[ 6 ];
size_t threads[1], localThreads[1]; size_t threads[1], localThreads[1];
char programSrc[ 10248 ], vec1Name[ 64 ], vec2Name[ 64 ], vec3Name[ 64 ]; char programSrc[ 10248 ], vec1Name[ 64 ], vec2Name[ 64 ], vec3Name[ 64 ];
char sizeNames[][ 4 ] = { "", "2", "3", "4", "", "", "", "8" }; char sizeNames[][ 4 ] = { "", "2", "3", "4", "", "", "", "8" };
const char *ptr; const char *ptr;
void *initData[3], *resultData[3]; void *initData[3], *resultData[3];
// Create the program source // Create the program source
sprintf( vec1Name, "%s%s", get_explicit_type_name( vec1Type ), sizeNames[ vec1Size - 1 ] ); sprintf( vec1Name, "%s%s", get_explicit_type_name( vec1Type ), sizeNames[ vec1Size - 1 ] );
sprintf( vec2Name, "%s%s", get_explicit_type_name( vec2Type ), sizeNames[ vec2Size - 1 ] ); sprintf( vec2Name, "%s%s", get_explicit_type_name( vec2Type ), sizeNames[ vec2Size - 1 ] );
sprintf( vec3Name, "%s%s", get_explicit_type_name( vec3Type ), sizeNames[ vec3Size - 1 ] ); sprintf( vec3Name, "%s%s", get_explicit_type_name( vec3Type ), sizeNames[ vec3Size - 1 ] );
sprintf( programSrc, multi_arg_kernel_source_pattern, sprintf( programSrc, multi_arg_kernel_source_pattern,
vec1Name, vec2Name, vec3Name, vec1Name, vec2Name, vec3Name, vec1Name, vec2Name, vec3Name, vec1Name, vec2Name, vec3Name,
vec1Size, vec1Size, vec2Size, vec2Size, vec3Size, vec3Size ); vec1Size, vec1Size, vec2Size, vec2Size, vec3Size, vec3Size );
ptr = programSrc; ptr = programSrc;
// Create our testing kernel // Create our testing kernel
error = create_single_kernel_helper( context, &program, &kernel, 1, &ptr, "sample_test" ); error = create_single_kernel_helper( context, &program, &kernel, 1, &ptr, "sample_test" );
test_error( error, "Unable to create testing kernel" ); test_error( error, "Unable to create testing kernel" );
// Get thread dimensions // Get thread dimensions
threads[0] = 1024; threads[0] = 1024;
error = get_max_common_work_group_size( context, kernel, threads[0], &localThreads[0] ); error = get_max_common_work_group_size( context, kernel, threads[0], &localThreads[0] );
test_error( error, "Unable to get work group size for kernel" ); test_error( error, "Unable to get work group size for kernel" );
// Create input streams // Create input streams
initData[ 0 ] = create_random_data( vec1Type, d, (unsigned int)threads[ 0 ] * vec1Size ); initData[ 0 ] = create_random_data( vec1Type, d, (unsigned int)threads[ 0 ] * vec1Size );
streams[ 0 ] = clCreateBuffer( context, (cl_mem_flags)( CL_MEM_COPY_HOST_PTR ), get_explicit_type_size( vec1Type ) * threads[0] * vec1Size, initData[ 0 ], &error ); streams[ 0 ] = clCreateBuffer( context, (cl_mem_flags)( CL_MEM_COPY_HOST_PTR ), get_explicit_type_size( vec1Type ) * threads[0] * vec1Size, initData[ 0 ], &error );
test_error( error, "Unable to create testing stream" ); test_error( error, "Unable to create testing stream" );
initData[ 1 ] = create_random_data( vec2Type, d, (unsigned int)threads[ 0 ] * vec2Size ); initData[ 1 ] = create_random_data( vec2Type, d, (unsigned int)threads[ 0 ] * vec2Size );
streams[ 1 ] = clCreateBuffer( context, (cl_mem_flags)( CL_MEM_COPY_HOST_PTR ), get_explicit_type_size( vec2Type ) * threads[0] * vec2Size, initData[ 1 ], &error ); streams[ 1 ] = clCreateBuffer( context, (cl_mem_flags)( CL_MEM_COPY_HOST_PTR ), get_explicit_type_size( vec2Type ) * threads[0] * vec2Size, initData[ 1 ], &error );
test_error( error, "Unable to create testing stream" ); test_error( error, "Unable to create testing stream" );
initData[ 2 ] = create_random_data( vec3Type, d, (unsigned int)threads[ 0 ] * vec3Size ); initData[ 2 ] = create_random_data( vec3Type, d, (unsigned int)threads[ 0 ] * vec3Size );
streams[ 2 ] = clCreateBuffer( context, (cl_mem_flags)( CL_MEM_COPY_HOST_PTR ), get_explicit_type_size( vec3Type ) * threads[0] * vec3Size, initData[ 2 ], &error ); streams[ 2 ] = clCreateBuffer( context, (cl_mem_flags)( CL_MEM_COPY_HOST_PTR ), get_explicit_type_size( vec3Type ) * threads[0] * vec3Size, initData[ 2 ], &error );
test_error( error, "Unable to create testing stream" ); test_error( error, "Unable to create testing stream" );
streams[ 3 ] = clCreateBuffer( context, (cl_mem_flags)(CL_MEM_READ_WRITE), get_explicit_type_size( vec1Type ) * threads[0] * vec1Size, NULL, &error ); streams[ 3 ] = clCreateBuffer( context, (cl_mem_flags)(CL_MEM_READ_WRITE), get_explicit_type_size( vec1Type ) * threads[0] * vec1Size, NULL, &error );
test_error( error, "Unable to create testing stream" ); test_error( error, "Unable to create testing stream" );
streams[ 4 ] = clCreateBuffer( context, (cl_mem_flags)(CL_MEM_READ_WRITE), get_explicit_type_size( vec2Type ) * threads[0] * vec2Size, NULL, &error ); streams[ 4 ] = clCreateBuffer( context, (cl_mem_flags)(CL_MEM_READ_WRITE), get_explicit_type_size( vec2Type ) * threads[0] * vec2Size, NULL, &error );
test_error( error, "Unable to create testing stream" ); test_error( error, "Unable to create testing stream" );
streams[ 5 ] = clCreateBuffer( context, (cl_mem_flags)(CL_MEM_READ_WRITE), get_explicit_type_size( vec3Type ) * threads[0] * vec3Size, NULL, &error ); streams[ 5 ] = clCreateBuffer( context, (cl_mem_flags)(CL_MEM_READ_WRITE), get_explicit_type_size( vec3Type ) * threads[0] * vec3Size, NULL, &error );
test_error( error, "Unable to create testing stream" ); test_error( error, "Unable to create testing stream" );
// Set the arguments // Set the arguments
error = 0; error = 0;
for( i = 0; i < 6; i++ ) for( i = 0; i < 6; i++ )
error |= clSetKernelArg( kernel, i, sizeof( cl_mem ), &streams[ i ] ); error |= clSetKernelArg( kernel, i, sizeof( cl_mem ), &streams[ i ] );
test_error( error, "Unable to set arguments for kernel" ); test_error( error, "Unable to set arguments for kernel" );
// Execute! // Execute!
error = clEnqueueNDRangeKernel( queue, kernel, 1, NULL, threads, localThreads, 0, NULL, NULL ); error = clEnqueueNDRangeKernel( queue, kernel, 1, NULL, threads, localThreads, 0, NULL, NULL );
test_error( error, "Unable to execute kernel" ); test_error( error, "Unable to execute kernel" );
// Read results // Read results
resultData[0] = malloc( get_explicit_type_size( vec1Type ) * vec1Size * threads[0] ); resultData[0] = malloc( get_explicit_type_size( vec1Type ) * vec1Size * threads[0] );
resultData[1] = malloc( get_explicit_type_size( vec2Type ) * vec2Size * threads[0] ); resultData[1] = malloc( get_explicit_type_size( vec2Type ) * vec2Size * threads[0] );
resultData[2] = malloc( get_explicit_type_size( vec3Type ) * vec3Size * threads[0] ); resultData[2] = malloc( get_explicit_type_size( vec3Type ) * vec3Size * threads[0] );
error = clEnqueueReadBuffer( queue, streams[ 3 ], CL_TRUE, 0, get_explicit_type_size( vec1Type ) * vec1Size * threads[ 0 ], resultData[0], 0, NULL, NULL ); error = clEnqueueReadBuffer( queue, streams[ 3 ], CL_TRUE, 0, get_explicit_type_size( vec1Type ) * vec1Size * threads[ 0 ], resultData[0], 0, NULL, NULL );
error |= clEnqueueReadBuffer( queue, streams[ 4 ], CL_TRUE, 0, get_explicit_type_size( vec2Type ) * vec2Size * threads[ 0 ], resultData[1], 0, NULL, NULL ); error |= clEnqueueReadBuffer( queue, streams[ 4 ], CL_TRUE, 0, get_explicit_type_size( vec2Type ) * vec2Size * threads[ 0 ], resultData[1], 0, NULL, NULL );
error |= clEnqueueReadBuffer( queue, streams[ 5 ], CL_TRUE, 0, get_explicit_type_size( vec3Type ) * vec3Size * threads[ 0 ], resultData[2], 0, NULL, NULL ); error |= clEnqueueReadBuffer( queue, streams[ 5 ], CL_TRUE, 0, get_explicit_type_size( vec3Type ) * vec3Size * threads[ 0 ], resultData[2], 0, NULL, NULL );
test_error( error, "Unable to read result stream" ); test_error( error, "Unable to read result stream" );
// Verify // Verify
char *ptr1 = (char *)initData[ 0 ], *ptr2 = (char *)resultData[ 0 ]; char *ptr1 = (char *)initData[ 0 ], *ptr2 = (char *)resultData[ 0 ];
size_t span = get_explicit_type_size( vec1Type ); size_t span = get_explicit_type_size( vec1Type );
for( i = 0; i < (int)threads[0]; i++ ) for( i = 0; i < (int)threads[0]; i++ )
{ {
for( j = 0; j < vec1Size; j++ ) for( j = 0; j < vec1Size; j++ )
{ {
if( memcmp( ptr1 + span * j , ptr2 + span * j, span ) != 0 ) if( memcmp( ptr1 + span * j , ptr2 + span * j, span ) != 0 )
{ {
log_error( "ERROR: Value did not validate for component %d of item %d of stream 0!\n", j, i ); log_error( "ERROR: Value did not validate for component %d of item %d of stream 0!\n", j, i );
free( initData[ 0 ] ); free( initData[ 0 ] );
free( initData[ 1 ] ); free( initData[ 1 ] );
free( initData[ 2 ] ); free( initData[ 2 ] );
free( resultData[ 0 ] ); free( resultData[ 0 ] );
free( resultData[ 1 ] ); free( resultData[ 1 ] );
free( resultData[ 2 ] ); free( resultData[ 2 ] );
return -1; return -1;
} }
} }
ptr1 += span * vec1Size; ptr1 += span * vec1Size;
ptr2 += span * vec1Size; ptr2 += span * vec1Size;
} }
ptr1 = (char *)initData[ 1 ]; ptr1 = (char *)initData[ 1 ];
ptr2 = (char *)resultData[ 1 ]; ptr2 = (char *)resultData[ 1 ];
span = get_explicit_type_size( vec2Type ); span = get_explicit_type_size( vec2Type );
for( i = 0; i < (int)threads[0]; i++ ) for( i = 0; i < (int)threads[0]; i++ )
{ {
for( j = 0; j < vec2Size; j++ ) for( j = 0; j < vec2Size; j++ )
{ {
if( memcmp( ptr1 + span * j , ptr2 + span * j, span ) != 0 ) if( memcmp( ptr1 + span * j , ptr2 + span * j, span ) != 0 )
{ {
log_error( "ERROR: Value did not validate for component %d of item %d of stream 1!\n", j, i ); log_error( "ERROR: Value did not validate for component %d of item %d of stream 1!\n", j, i );
free( initData[ 0 ] ); free( initData[ 0 ] );
free( initData[ 1 ] ); free( initData[ 1 ] );
free( initData[ 2 ] ); free( initData[ 2 ] );
free( resultData[ 0 ] ); free( resultData[ 0 ] );
free( resultData[ 1 ] ); free( resultData[ 1 ] );
free( resultData[ 2 ] ); free( resultData[ 2 ] );
return -1; return -1;
} }
} }
ptr1 += span * vec2Size; ptr1 += span * vec2Size;
ptr2 += span * vec2Size; ptr2 += span * vec2Size;
} }
ptr1 = (char *)initData[ 2 ]; ptr1 = (char *)initData[ 2 ];
ptr2 = (char *)resultData[ 2 ]; ptr2 = (char *)resultData[ 2 ];
span = get_explicit_type_size( vec3Type ); span = get_explicit_type_size( vec3Type );
for( i = 0; i < (int)threads[0]; i++ ) for( i = 0; i < (int)threads[0]; i++ )
{ {
for( j = 0; j < vec3Size; j++ ) for( j = 0; j < vec3Size; j++ )
{ {
if( memcmp( ptr1 + span * j , ptr2 + span * j, span ) != 0 ) if( memcmp( ptr1 + span * j , ptr2 + span * j, span ) != 0 )
{ {
log_error( "ERROR: Value did not validate for component %d of item %d of stream 2!\n", j, i ); log_error( "ERROR: Value did not validate for component %d of item %d of stream 2!\n", j, i );
free( initData[ 0 ] ); free( initData[ 0 ] );
free( initData[ 1 ] ); free( initData[ 1 ] );
free( initData[ 2 ] ); free( initData[ 2 ] );
free( resultData[ 0 ] ); free( resultData[ 0 ] );
free( resultData[ 1 ] ); free( resultData[ 1 ] );
free( resultData[ 2 ] ); free( resultData[ 2 ] );
return -1; return -1;
} }
} }
ptr1 += span * vec3Size; ptr1 += span * vec3Size;
ptr2 += span * vec3Size; ptr2 += span * vec3Size;
} }
// If we got here, everything verified successfully // If we got here, everything verified successfully
free( initData[ 0 ] ); free( initData[ 0 ] );
free( initData[ 1 ] ); free( initData[ 1 ] );
free( initData[ 2 ] ); free( initData[ 2 ] );
free( resultData[ 0 ] ); free( resultData[ 0 ] );
free( resultData[ 1 ] ); free( resultData[ 1 ] );
free( resultData[ 2 ] ); free( resultData[ 2 ] );
return 0; return 0;
} }
int test_kernel_arg_multi_setup_exhaustive(cl_device_id device, cl_context context, cl_command_queue queue, int num_elements) int test_kernel_arg_multi_setup_exhaustive(cl_device_id device, cl_context context, cl_command_queue queue, int num_elements)
{ {
// Loop through every combination of input and output types // Loop through every combination of input and output types
ExplicitType types[] = { kChar, kShort, kInt, kFloat, kNumExplicitTypes }; ExplicitType types[] = { kChar, kShort, kInt, kFloat, kNumExplicitTypes };
int type1, type2, type3; int type1, type2, type3;
int size1, size2, size3; int size1, size2, size3;
RandomSeed seed( gRandomSeed ); RandomSeed seed( gRandomSeed );
log_info( "\n" ); // for formatting log_info( "\n" ); // for formatting
for( type1 = 0; types[ type1 ] != kNumExplicitTypes; type1++ ) for( type1 = 0; types[ type1 ] != kNumExplicitTypes; type1++ )
{ {
for( type2 = 0; types[ type2 ] != kNumExplicitTypes; type2++ ) for( type2 = 0; types[ type2 ] != kNumExplicitTypes; type2++ )
{ {
for( type3 = 0; types[ type3 ] != kNumExplicitTypes; type3++ ) for( type3 = 0; types[ type3 ] != kNumExplicitTypes; type3++ )
{ {
log_info( "\n\ttesting %s, %s, %s...", get_explicit_type_name( types[ type1 ] ), get_explicit_type_name( types[ type2 ] ), get_explicit_type_name( types[ type3 ] ) ); log_info( "\n\ttesting %s, %s, %s...", get_explicit_type_name( types[ type1 ] ), get_explicit_type_name( types[ type2 ] ), get_explicit_type_name( types[ type3 ] ) );
// Loop through every combination of vector size // Loop through every combination of vector size
for( size1 = 2; size1 <= 8; size1 <<= 1 ) for( size1 = 2; size1 <= 8; size1 <<= 1 )
{ {
for( size2 = 2; size2 <= 8; size2 <<= 1 ) for( size2 = 2; size2 <= 8; size2 <<= 1 )
{ {
for( size3 = 2; size3 <= 8; size3 <<= 1 ) for( size3 = 2; size3 <= 8; size3 <<= 1 )
{ {
log_info("."); log_info(".");
fflush( stdout); fflush( stdout);
if( test_multi_arg_set( device, context, queue, if( test_multi_arg_set( device, context, queue,
types[ type1 ], size1, types[ type1 ], size1,
types[ type2 ], size2, types[ type2 ], size2,
types[ type3 ], size3, seed ) ) types[ type3 ], size3, seed ) )
return -1; return -1;
} }
} }
} }
} }
} }
} }
log_info( "\n" ); log_info( "\n" );
return 0; return 0;
} }
int test_kernel_arg_multi_setup_random(cl_device_id device, cl_context context, cl_command_queue queue, int num_elements) int test_kernel_arg_multi_setup_random(cl_device_id device, cl_context context, cl_command_queue queue, int num_elements)
{ {
// Loop through a selection of combinations // Loop through a selection of combinations
ExplicitType types[] = { kChar, kShort, kInt, kFloat, kNumExplicitTypes }; ExplicitType types[] = { kChar, kShort, kInt, kFloat, kNumExplicitTypes };
int type1, type2, type3; int type1, type2, type3;
int size1, size2, size3; int size1, size2, size3;
RandomSeed seed( gRandomSeed ); RandomSeed seed( gRandomSeed );
num_elements = 3*3*3*4; num_elements = 3*3*3*4;
log_info( "Testing %d random configurations\n", num_elements ); log_info( "Testing %d random configurations\n", num_elements );
// Loop through every combination of vector size // Loop through every combination of vector size
for( size1 = 2; size1 <= 8; size1 <<= 1 ) for( size1 = 2; size1 <= 8; size1 <<= 1 )
{ {
for( size2 = 2; size2 <= 8; size2 <<= 1 ) for( size2 = 2; size2 <= 8; size2 <<= 1 )
{ {
for( size3 = 2; size3 <= 8; size3 <<= 1 ) for( size3 = 2; size3 <= 8; size3 <<= 1 )
{ {
// Loop through 4 type combinations for each size combination // Loop through 4 type combinations for each size combination
int n; int n;
for (n=0; n<4; n++) { for (n=0; n<4; n++) {
type1 = (int)get_random_float(0,4, seed); type1 = (int)get_random_float(0,4, seed);
type2 = (int)get_random_float(0,4, seed); type2 = (int)get_random_float(0,4, seed);
type3 = (int)get_random_float(0,4, seed); type3 = (int)get_random_float(0,4, seed);
log_info( "\ttesting %s%d, %s%d, %s%d...\n", log_info( "\ttesting %s%d, %s%d, %s%d...\n",
get_explicit_type_name( types[ type1 ] ), size1, get_explicit_type_name( types[ type1 ] ), size1,
get_explicit_type_name( types[ type2 ] ), size2, get_explicit_type_name( types[ type2 ] ), size2,
get_explicit_type_name( types[ type3 ] ), size3 ); get_explicit_type_name( types[ type3 ] ), size3 );
if( test_multi_arg_set( device, context, queue, if( test_multi_arg_set( device, context, queue,
types[ type1 ], size1, types[ type1 ], size1,
types[ type2 ], size2, types[ type2 ], size2,
types[ type3 ], size3, seed ) ) types[ type3 ], size3, seed ) )
return -1; return -1;
} }
} }
} }
} }
return 0; return 0;
} }
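
Note on the verification loops in test_multi_arg_set: the three stream checks differ only in the buffers, element type, and vector size they walk. A minimal sketch of how they could share one routine follows. `find_first_mismatch` is a hypothetical helper, not part of the test harness, and it assumes the tightly packed component layout that the test itself relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not in the harness): compares two buffers laid out as
// itemCount vectors of vecSize components, each component span bytes wide.
// Returns the flat component index of the first mismatch, or -1 if all match.
static long find_first_mismatch( const void *expected, const void *actual,
                                 std::size_t span, int vecSize, std::size_t itemCount )
{
    assert( span > 0 && vecSize > 0 );
    const char *ptr1 = static_cast<const char *>( expected );
    const char *ptr2 = static_cast<const char *>( actual );
    const std::size_t total = itemCount * static_cast<std::size_t>( vecSize );
    for( std::size_t k = 0; k < total; k++ )
    {
        // Same byte-wise comparison the test performs per component.
        if( std::memcmp( ptr1 + span * k, ptr2 + span * k, span ) != 0 )
            return static_cast<long>( k );
    }
    return -1;
}
```

For the error message, the item and component can be recovered from the returned index `k` as `k / vecSize` and `k % vecSize`. A single return path would also let the six `free` calls appear once instead of four times.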

@@ -1,108 +1,108 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
static volatile cl_int sDestructorIndex; static volatile cl_int sDestructorIndex;
void CL_CALLBACK mem_destructor_callback( cl_mem memObject, void * userData ) void CL_CALLBACK mem_destructor_callback( cl_mem memObject, void * userData )
{ {
int * userPtr = (int *)userData; int * userPtr = (int *)userData;
// ordering of callbacks is guaranteed, meaning we don't need to do an atomic operation here // ordering of callbacks is guaranteed, meaning we don't need to do an atomic operation here
*userPtr = ++sDestructorIndex; *userPtr = ++sDestructorIndex;
} }
#ifndef ABS #ifndef ABS
#define ABS( x ) ( ( (x) < 0 ) ? -(x) : (x) ) #define ABS( x ) ( ( (x) < 0 ) ? -(x) : (x) )
#endif #endif
int test_mem_object_destructor_callback_single( clMemWrapper &memObject ) int test_mem_object_destructor_callback_single( clMemWrapper &memObject )
{ {
cl_int error; cl_int error;
int i; int i;
// Set up some variables to catch the order in which callbacks are called // Set up some variables to catch the order in which callbacks are called
volatile int callbackOrders[ 3 ] = { 0, 0, 0 }; volatile int callbackOrders[ 3 ] = { 0, 0, 0 };
sDestructorIndex = 0; sDestructorIndex = 0;
// Set up the callbacks // Set up the callbacks
error = clSetMemObjectDestructorCallback( memObject, mem_destructor_callback, (void*) &callbackOrders[ 0 ] ); error = clSetMemObjectDestructorCallback( memObject, mem_destructor_callback, (void*) &callbackOrders[ 0 ] );
test_error( error, "Unable to set destructor callback" ); test_error( error, "Unable to set destructor callback" );
error = clSetMemObjectDestructorCallback( memObject, mem_destructor_callback, (void*) &callbackOrders[ 1 ] ); error = clSetMemObjectDestructorCallback( memObject, mem_destructor_callback, (void*) &callbackOrders[ 1 ] );
test_error( error, "Unable to set destructor callback" ); test_error( error, "Unable to set destructor callback" );
error = clSetMemObjectDestructorCallback( memObject, mem_destructor_callback, (void*) &callbackOrders[ 2 ] ); error = clSetMemObjectDestructorCallback( memObject, mem_destructor_callback, (void*) &callbackOrders[ 2 ] );
test_error( error, "Unable to set destructor callback" ); test_error( error, "Unable to set destructor callback" );
// Now release the buffer, which SHOULD call the callbacks // Now release the buffer, which SHOULD call the callbacks
error = clReleaseMemObject( memObject ); error = clReleaseMemObject( memObject );
test_error( error, "Unable to release test buffer" ); test_error( error, "Unable to release test buffer" );
// Note: since we manually released the mem wrapper, we need to set it to NULL to prevent a double-release // Note: since we manually released the mem wrapper, we need to set it to NULL to prevent a double-release
memObject = NULL; memObject = NULL;
// At this point, all three callbacks should have already been called // At this point, all three callbacks should have already been called
int numErrors = 0; int numErrors = 0;
for( i = 0; i < 3; i++ ) for( i = 0; i < 3; i++ )
{ {
// Spin waiting for the release to finish. If you don't call the mem_destructor_callback, you will not // Spin waiting for the release to finish. If you don't call the mem_destructor_callback, you will not
// pass the test. bugzilla 6316 // pass the test. bugzilla 6316
while( 0 == callbackOrders[i] ) while( 0 == callbackOrders[i] )
{} {}
if( ABS( callbackOrders[ i ] ) != 3-i ) if( ABS( callbackOrders[ i ] ) != 3-i )
{ {
log_error( "\tERROR: Callback %d was called in the wrong order! (Was called order %d, should have been order %d)\n", log_error( "\tERROR: Callback %d was called in the wrong order! (Was called order %d, should have been order %d)\n",
i+1, ABS( callbackOrders[ i ] ), 3-i ); i+1, ABS( callbackOrders[ i ] ), 3-i );
numErrors++; numErrors++;
} }
} }
return ( numErrors > 0 ) ? -1 : 0; return ( numErrors > 0 ) ? -1 : 0;
} }
int test_mem_object_destructor_callback(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) int test_mem_object_destructor_callback(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{ {
clMemWrapper testBuffer, testImage; clMemWrapper testBuffer, testImage;
cl_int error; cl_int error;
// Create a buffer and an image to test callbacks against // Create a buffer and an image to test callbacks against
testBuffer = clCreateBuffer( context, CL_MEM_READ_WRITE, 1024, NULL, &error ); testBuffer = clCreateBuffer( context, CL_MEM_READ_WRITE, 1024, NULL, &error );
test_error( error, "Unable to create testing buffer" ); test_error( error, "Unable to create testing buffer" );
if( test_mem_object_destructor_callback_single( testBuffer ) != 0 ) if( test_mem_object_destructor_callback_single( testBuffer ) != 0 )
{ {
log_error( "ERROR: Destructor callbacks for buffer object FAILED\n" ); log_error( "ERROR: Destructor callbacks for buffer object FAILED\n" );
return -1; return -1;
} }
if( checkForImageSupport( deviceID ) == 0 ) if( checkForImageSupport( deviceID ) == 0 )
{ {
cl_image_format imageFormat = { CL_RGBA, CL_SIGNED_INT8 }; cl_image_format imageFormat = { CL_RGBA, CL_SIGNED_INT8 };
testImage = create_image_2d( context, CL_MEM_READ_ONLY, &imageFormat, 16, 16, 0, NULL, &error ); testImage = create_image_2d( context, CL_MEM_READ_ONLY, &imageFormat, 16, 16, 0, NULL, &error );
test_error( error, "Unable to create testing image" ); test_error( error, "Unable to create testing image" );
if( test_mem_object_destructor_callback_single( testImage ) != 0 ) if( test_mem_object_destructor_callback_single( testImage ) != 0 )
{ {
log_error( "ERROR: Destructor callbacks for image object FAILED\n" ); log_error( "ERROR: Destructor callbacks for image object FAILED\n" );
return -1; return -1;
} }
} }
return 0; return 0;
} }
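
Note on the ordering check: the comparison `callbackOrders[ i ] != 3-i` relies on destructor callbacks running in the reverse of their registration order, which is what the OpenCL specification states for clSetMemObjectDestructorCallback. A self-contained model of that ordering, with no OpenCL calls, might look like the sketch below. `FakeMemObject`, `fake_set_destructor_callback`, and `fake_release` are stand-ins invented for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef void (*destructor_fn)( void *userData );

// Counter bumped by every callback, mirroring sDestructorIndex in the test.
static int sFakeDestructorIndex = 0;

static void record_order( void *userData )
{
    *static_cast<int *>( userData ) = ++sFakeDestructorIndex;
}

// Hypothetical stand-in for a cl_mem object: just a stack of callbacks.
struct FakeMemObject
{
    std::vector<std::pair<destructor_fn, void *> > callbacks;
};

static void fake_set_destructor_callback( FakeMemObject &obj, destructor_fn fn, void *userData )
{
    assert( fn != NULL );
    obj.callbacks.push_back( std::make_pair( fn, userData ) );
}

// Runs the callbacks last-registered-first, then clears the stack.
static void fake_release( FakeMemObject &obj )
{
    for( std::size_t n = obj.callbacks.size(); n > 0; n-- )
        obj.callbacks[ n - 1 ].first( obj.callbacks[ n - 1 ].second );
    obj.callbacks.clear();
}
```

Registering three callbacks against slots 0, 1, and 2 and then releasing leaves the slots holding 3, 2, and 1, which is the `3-i` pattern the test expects.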

@@ -1,121 +1,121 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#ifndef _WIN32 #ifndef _WIN32
#include <unistd.h> #include <unistd.h>
#endif #endif
#include "../../test_common/harness/conversions.h" #include "../../test_common/harness/conversions.h"
extern cl_uint gRandomSeed; extern cl_uint gRandomSeed;
static void CL_CALLBACK test_native_kernel_fn( void *userData ) static void CL_CALLBACK test_native_kernel_fn( void *userData )
{ {
struct arg_struct { struct arg_struct {
cl_int * source; cl_int * source;
cl_int * dest; cl_int * dest;
cl_int count; cl_int count;
} *args = (arg_struct *)userData; } *args = (arg_struct *)userData;
for( cl_int i = 0; i < args->count; i++ ) for( cl_int i = 0; i < args->count; i++ )
args->dest[ i ] = args->source[ i ]; args->dest[ i ] = args->source[ i ];
} }
int test_native_kernel(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems ) int test_native_kernel(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems )
{ {
int error; int error;
RandomSeed seed( gRandomSeed ); RandomSeed seed( gRandomSeed );
// Check if we support native kernels // Check if we support native kernels
cl_device_exec_capabilities capabilities; cl_device_exec_capabilities capabilities;
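error = clGetDeviceInfo(device, CL_DEVICE_EXECUTION_CAPABILITIES, sizeof(capabilities), &capabilities, NULL); error = clGetDeviceInfo(device, CL_DEVICE_EXECUTION_CAPABILITIES, sizeof(capabilities), &capabilities, NULL);
test_error( error, "Unable to query device execution capabilities" ); test_error( error, "Unable to query device execution capabilities" );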
if (!(capabilities & CL_EXEC_NATIVE_KERNEL)) { if (!(capabilities & CL_EXEC_NATIVE_KERNEL)) {
log_info("Device does not support CL_EXEC_NATIVE_KERNEL.\n"); log_info("Device does not support CL_EXEC_NATIVE_KERNEL.\n");
return 0; return 0;
} }
clMemWrapper streams[ 2 ]; clMemWrapper streams[ 2 ];
#if !(defined (_WIN32) && defined (_MSC_VER)) #if !(defined (_WIN32) && defined (_MSC_VER))
cl_int inBuffer[ n_elems ], outBuffer[ n_elems ]; cl_int inBuffer[ n_elems ], outBuffer[ n_elems ];
#else #else
cl_int* inBuffer = (cl_int *)_malloca( n_elems * sizeof(cl_int) ); cl_int* inBuffer = (cl_int *)_malloca( n_elems * sizeof(cl_int) );
cl_int* outBuffer = (cl_int *)_malloca( n_elems * sizeof(cl_int) ); cl_int* outBuffer = (cl_int *)_malloca( n_elems * sizeof(cl_int) );
#endif #endif
clEventWrapper finishEvent; clEventWrapper finishEvent;
struct arg_struct struct arg_struct
{ {
cl_mem inputStream; cl_mem inputStream;
cl_mem outputStream; cl_mem outputStream;
cl_int count; cl_int count;
} args; } args;
// Create some input values // Create some input values
generate_random_data( kInt, n_elems, seed, inBuffer ); generate_random_data( kInt, n_elems, seed, inBuffer );
// Create I/O streams // Create I/O streams
streams[ 0 ] = clCreateBuffer( context, CL_MEM_COPY_HOST_PTR, n_elems * sizeof(cl_int), inBuffer, &error ); streams[ 0 ] = clCreateBuffer( context, CL_MEM_COPY_HOST_PTR, n_elems * sizeof(cl_int), inBuffer, &error );
test_error( error, "Unable to create I/O stream" ); test_error( error, "Unable to create I/O stream" );
streams[ 1 ] = clCreateBuffer( context, 0, n_elems * sizeof(cl_int), NULL, &error ); streams[ 1 ] = clCreateBuffer( context, 0, n_elems * sizeof(cl_int), NULL, &error );
test_error( error, "Unable to create I/O stream" ); test_error( error, "Unable to create I/O stream" );
// Set up the arrays to call with // Set up the arrays to call with
args.inputStream = streams[ 0 ]; args.inputStream = streams[ 0 ];
args.outputStream = streams[ 1 ]; args.outputStream = streams[ 1 ];
args.count = n_elems; args.count = n_elems;
void * memLocs[ 2 ] = { &args.inputStream, &args.outputStream }; void * memLocs[ 2 ] = { &args.inputStream, &args.outputStream };
// Run the kernel // Run the kernel
error = clEnqueueNativeKernel( queue, test_native_kernel_fn, error = clEnqueueNativeKernel( queue, test_native_kernel_fn,
&args, sizeof( args ), &args, sizeof( args ),
2, &streams[ 0 ], 2, &streams[ 0 ],
(const void **)memLocs, (const void **)memLocs,
0, NULL, &finishEvent ); 0, NULL, &finishEvent );
test_error( error, "Unable to queue native kernel" ); test_error( error, "Unable to queue native kernel" );
// Finish and wait for the kernel to complete // Finish and wait for the kernel to complete
error = clFinish( queue ); error = clFinish( queue );
test_error(error, "clFinish failed"); test_error(error, "clFinish failed");
error = clWaitForEvents( 1, &finishEvent ); error = clWaitForEvents( 1, &finishEvent );
test_error(error, "clWaitForEvents failed"); test_error(error, "clWaitForEvents failed");
// Now read the results and verify // Now read the results and verify
error = clEnqueueReadBuffer( queue, streams[ 1 ], CL_TRUE, 0, n_elems * sizeof(cl_int), outBuffer, 0, NULL, NULL ); error = clEnqueueReadBuffer( queue, streams[ 1 ], CL_TRUE, 0, n_elems * sizeof(cl_int), outBuffer, 0, NULL, NULL );
test_error( error, "Unable to read results" ); test_error( error, "Unable to read results" );
for( int i = 0; i < n_elems; i++ ) for( int i = 0; i < n_elems; i++ )
{ {
if( inBuffer[ i ] != outBuffer[ i ] ) if( inBuffer[ i ] != outBuffer[ i ] )
{ {
log_error( "ERROR: Data sample %d for native kernel did not validate (expected %d, got %d)\n", log_error( "ERROR: Data sample %d for native kernel did not validate (expected %d, got %d)\n",
i, (int)inBuffer[ i ], (int)outBuffer[ i ] ); i, (int)inBuffer[ i ], (int)outBuffer[ i ] );
return 1; return 1;
} }
} }
return 0; return 0;
} }
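
Note on the two `arg_struct` definitions: the host-side struct holds cl_mem handles while the struct inside test_native_kernel_fn holds cl_int pointers. This works because clEnqueueNativeKernel copies the argument block and replaces each handle at the locations listed in args_mem_loc with a pointer to the buffer's memory before calling the user function. A rough model of that substitution is sketched below. It uses invented stand-ins (`FakeBuffer`, `fake_mem`, `fake_enqueue_native`) instead of real OpenCL types and assumes handles and data pointers have the same size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for an opaque cl_mem handle; the real type comes from the OpenCL headers.
struct FakeBuffer { std::vector<int> storage; };
typedef FakeBuffer *fake_mem;

// Layout the host fills in: handles in the first two slots.
struct HostArgs { fake_mem inputStream; fake_mem outputStream; int count; };
// Layout the user function sees: the same slots, now holding data pointers.
struct KernelArgs { int *source; int *dest; int count; };

static_assert( sizeof( HostArgs ) == sizeof( KernelArgs ), "handle and pointer slots must line up" );

static void copy_fn( void *userData )
{
    KernelArgs *args = static_cast<KernelArgs *>( userData );
    for( int i = 0; i < args->count; i++ )
        args->dest[ i ] = args->source[ i ];
}

// Hypothetical model of the enqueue step: copy the argument block, then
// overwrite each handle slot named in memLocs with a pointer to its data.
static void fake_enqueue_native( void (*fn)( void * ), const void *args, std::size_t cbArgs,
                                 std::size_t numMem, const fake_mem *memList, const void **memLocs )
{
    std::vector<unsigned char> copy( cbArgs );
    std::memcpy( copy.data(), args, cbArgs );
    for( std::size_t m = 0; m < numMem; m++ )
    {
        // Offset of the handle slot inside the caller's argument block.
        const std::size_t offset = static_cast<std::size_t>(
            static_cast<const unsigned char *>( memLocs[ m ] ) -
            static_cast<const unsigned char *>( args ) );
        assert( offset + sizeof( int * ) <= cbArgs );
        int *data = memList[ m ]->storage.data();
        std::memcpy( copy.data() + offset, &data, sizeof( data ) );
    }
    fn( copy.data() );
}
```

The caller's own argument block is never modified; only the copy handed to the user function carries the substituted pointers, which is why the test can keep using `streams[ 1 ]` as a cl_mem afterwards.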

@@ -1,162 +1,162 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include <stdio.h> #include <stdio.h>
#if defined(__APPLE__) #if defined(__APPLE__)
#include <OpenCL/opencl.h> #include <OpenCL/opencl.h>
#include <OpenCL/cl_platform.h> #include <OpenCL/cl_platform.h>
#else #else
#include <CL/opencl.h> #include <CL/opencl.h>
#include <CL/cl_platform.h> #include <CL/cl_platform.h>
#endif #endif
#include "procs.h" #include "procs.h"
enum { SUCCESS, FAILURE }; enum { SUCCESS, FAILURE };
typedef enum { NON_NULL_PATH, ADDROF_NULL_PATH, NULL_PATH } test_type; typedef enum { NON_NULL_PATH, ADDROF_NULL_PATH, NULL_PATH } test_type;
#define NITEMS 4096 #define NITEMS 4096
/* places the casted long value of the src ptr into each element of the output /* places the casted long value of the src ptr into each element of the output
* array, to allow testing that the kernel actually _gets_ the NULL value */ * array, to allow testing that the kernel actually _gets_ the NULL value */
const char *kernel_string = const char *kernel_string =
"kernel void test_kernel(global float *src, global long *dst)\n" "kernel void test_kernel(global float *src, global long *dst)\n"
"{\n" "{\n"
" uint tid = get_global_id(0);\n" " uint tid = get_global_id(0);\n"
" dst[tid] = (long)src;\n" " dst[tid] = (long)src;\n"
"}\n"; "}\n";
/* /*
* The guts of the test: * The guts of the test:
* call setKernelArgs with a regular buffer, &NULL, or NULL depending on * call setKernelArgs with a regular buffer, &NULL, or NULL depending on
* the value of 'test_type' * the value of 'test_type'
*/ */
static int test_setargs_and_execution(cl_command_queue queue, cl_kernel kernel, static int test_setargs_and_execution(cl_command_queue queue, cl_kernel kernel,
cl_mem test_buf, cl_mem result_buf, test_type type) cl_mem test_buf, cl_mem result_buf, test_type type)
{ {
unsigned int test_success = 0; unsigned int test_success = 0;
unsigned int i; unsigned int i;
cl_int status; cl_int status;
const char *typestr; const char *typestr;
if (type == NON_NULL_PATH) { if (type == NON_NULL_PATH) {
status = clSetKernelArg(kernel, 0, sizeof(cl_mem), &test_buf); status = clSetKernelArg(kernel, 0, sizeof(cl_mem), &test_buf);
typestr = "non-NULL"; typestr = "non-NULL";
} else if (type == ADDROF_NULL_PATH) { } else if (type == ADDROF_NULL_PATH) {
test_buf = NULL; test_buf = NULL;
status = clSetKernelArg(kernel, 0, sizeof(cl_mem), &test_buf); status = clSetKernelArg(kernel, 0, sizeof(cl_mem), &test_buf);
typestr = "&NULL"; typestr = "&NULL";
} else if (type == NULL_PATH) { } else if (type == NULL_PATH) {
status = clSetKernelArg(kernel, 0, sizeof(cl_mem), NULL); status = clSetKernelArg(kernel, 0, sizeof(cl_mem), NULL);
typestr = "NULL"; typestr = "NULL";
} }
log_info("Testing setKernelArgs with %s buffer.\n", typestr); log_info("Testing setKernelArgs with %s buffer.\n", typestr);
if (status != CL_SUCCESS) { if (status != CL_SUCCESS) {
log_error("clSetKernelArg failed with status: %d\n", status); log_error("clSetKernelArg failed with status: %d\n", status);
return FAILURE; // no point in continuing *this* test return FAILURE; // no point in continuing *this* test
} }
size_t global = NITEMS; size_t global = NITEMS;
status = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global, status = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global,
NULL, 0, NULL, NULL); NULL, 0, NULL, NULL);
test_error(status, "NDRangeKernel failed."); test_error(status, "NDRangeKernel failed.");
cl_long* host_result = (cl_long*)malloc(NITEMS*sizeof(cl_long)); cl_long* host_result = (cl_long*)malloc(NITEMS*sizeof(cl_long));
status = clEnqueueReadBuffer(queue, result_buf, CL_TRUE, 0, status = clEnqueueReadBuffer(queue, result_buf, CL_TRUE, 0,
sizeof(cl_long)*NITEMS, host_result, 0, NULL, NULL); sizeof(cl_long)*NITEMS, host_result, 0, NULL, NULL);
test_error(status, "ReadBuffer failed."); test_error(status, "ReadBuffer failed.");
// in the non-null case, we expect NONZERO values: // in the non-null case, we expect NONZERO values:
if (type == NON_NULL_PATH) { if (type == NON_NULL_PATH) {
for (i=0; i<NITEMS; i++) { for (i=0; i<NITEMS; i++) {
if (host_result[i] == 0) { if (host_result[i] == 0) {
log_error("failure: item %d in the result buffer was unexpectedly NULL.\n", i); log_error("failure: item %d in the result buffer was unexpectedly NULL.\n", i);
test_success = FAILURE; break; test_success = FAILURE; break;
} }
} }
} else if (type == ADDROF_NULL_PATH || type == NULL_PATH) { } else if (type == ADDROF_NULL_PATH || type == NULL_PATH) {
for (i=0; i<NITEMS; i++) { for (i=0; i<NITEMS; i++) {
if (host_result[i] != 0) { if (host_result[i] != 0) {
log_error("failure: item %d in the result buffer was unexpectedly non-NULL.\n", i); log_error("failure: item %d in the result buffer was unexpectedly non-NULL.\n", i);
test_success = FAILURE; break; test_success = FAILURE; break;
} }
} }
} }
free(host_result); free(host_result);
if (test_success == SUCCESS) { if (test_success == SUCCESS) {
log_info("\t%s ok.\n", typestr); log_info("\t%s ok.\n", typestr);
} }
return test_success; return test_success;
} }
int test_null_buffer_arg(cl_device_id device, cl_context context, int test_null_buffer_arg(cl_device_id device, cl_context context,
cl_command_queue queue, int num_elements) cl_command_queue queue, int num_elements)
{ {
unsigned int test_success = 0; unsigned int test_success = 0;
unsigned int i; unsigned int i;
cl_int status; cl_int status;
cl_program program; cl_program program;
cl_kernel kernel; cl_kernel kernel;
// prep kernel: // prep kernel:
program = clCreateProgramWithSource(context, 1, &kernel_string, NULL, &status); program = clCreateProgramWithSource(context, 1, &kernel_string, NULL, &status);
test_error(status, "CreateProgramWithSource failed."); test_error(status, "CreateProgramWithSource failed.");
status = clBuildProgram(program, 0, NULL, NULL, NULL, NULL); status = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
test_error(status, "BuildProgram failed."); test_error(status, "BuildProgram failed.");
kernel = clCreateKernel(program, "test_kernel", &status); kernel = clCreateKernel(program, "test_kernel", &status);
test_error(status, "CreateKernel failed."); test_error(status, "CreateKernel failed.");
cl_mem dev_src = clCreateBuffer(context, CL_MEM_READ_ONLY, NITEMS*sizeof(cl_float), cl_mem dev_src = clCreateBuffer(context, CL_MEM_READ_ONLY, NITEMS*sizeof(cl_float),
NULL, NULL); NULL, NULL);
cl_mem dev_dst = clCreateBuffer(context, CL_MEM_WRITE_ONLY, NITEMS*sizeof(cl_long), cl_mem dev_dst = clCreateBuffer(context, CL_MEM_WRITE_ONLY, NITEMS*sizeof(cl_long),
NULL, NULL); NULL, NULL);
// set the destination buffer normally: // set the destination buffer normally:
status = clSetKernelArg(kernel, 1, sizeof(cl_mem), &dev_dst); status = clSetKernelArg(kernel, 1, sizeof(cl_mem), &dev_dst);
test_error(status, "SetKernelArg failed."); test_error(status, "SetKernelArg failed.");
// //
// we test three cases: // we test three cases:
// //
// - typical case, used everyday: non-null buffer // - typical case, used everyday: non-null buffer
// - the case of src as &NULL (the spec-compliance test) // - the case of src as &NULL (the spec-compliance test)
// - the case of src as NULL (the backwards-compatibility test, Apple only) // - the case of src as NULL (the backwards-compatibility test, Apple only)
// //
test_success = test_setargs_and_execution(queue, kernel, dev_src, dev_dst, NON_NULL_PATH); test_success = test_setargs_and_execution(queue, kernel, dev_src, dev_dst, NON_NULL_PATH);
test_success |= test_setargs_and_execution(queue, kernel, dev_src, dev_dst, ADDROF_NULL_PATH); test_success |= test_setargs_and_execution(queue, kernel, dev_src, dev_dst, ADDROF_NULL_PATH);
#ifdef __APPLE__ #ifdef __APPLE__
test_success |= test_setargs_and_execution(queue, kernel, dev_src, dev_dst, NULL_PATH); test_success |= test_setargs_and_execution(queue, kernel, dev_src, dev_dst, NULL_PATH);
#endif #endif
// clean up: // clean up:
if (dev_src) clReleaseMemObject(dev_src); if (dev_src) clReleaseMemObject(dev_src);
clReleaseMemObject(dev_dst); clReleaseMemObject(dev_dst);
clReleaseKernel(kernel); clReleaseKernel(kernel);
clReleaseProgram(program); clReleaseProgram(program);
return test_success; return test_success;
} }
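The result check in test_setargs_and_execution reduces to one rule: every element must be nonzero for the non-NULL case and zero for both NULL cases. A minimal sketch of that rule as a standalone function follows. The name `first_mismatch` and the local `test_type` enum are illustrative assumptions that mirror the test, not part of the harness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local stand-in for the enum used by the test (assumption: same three cases).
enum test_type { NON_NULL_PATH, ADDROF_NULL_PATH, NULL_PATH };

// Hypothetical helper: returns the index of the first element that violates
// the expectation for `type`, or `count` if every element passes.
// NON_NULL_PATH expects nonzero pointer values; both NULL paths expect zero.
static std::size_t first_mismatch(const std::int64_t *results,
                                  std::size_t count, test_type type)
{
    assert(results != nullptr || count == 0);
    const bool expect_nonzero = (type == NON_NULL_PATH);
    for (std::size_t i = 0; i < count; ++i) {
        const bool is_nonzero = (results[i] != 0);
        if (is_nonzero != expect_nonzero) {
            return i; // first offending element
        }
    }
    return count; // all elements matched the expectation
}
```

A caller would treat a return value equal to `count` as success and log the returned index otherwise.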


@@ -1,289 +1,289 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#include <string.h> #include <string.h>
#define EXTENSION_NAME_BUF_SIZE 4096 #define EXTENSION_NAME_BUF_SIZE 4096
#define PRINT_EXTENSION_INFO 0 #define PRINT_EXTENSION_INFO 0
int test_platform_extensions(cl_device_id deviceID, cl_context context, int test_platform_extensions(cl_device_id deviceID, cl_context context,
cl_command_queue queue, int num_elements) cl_command_queue queue, int num_elements)
{ {
const char * extensions[] = { const char * extensions[] = {
"cl_khr_byte_addressable_store", "cl_khr_byte_addressable_store",
// "cl_APPLE_SetMemObjectDestructor", // "cl_APPLE_SetMemObjectDestructor",
"cl_khr_global_int32_base_atomics", "cl_khr_global_int32_base_atomics",
"cl_khr_global_int32_extended_atomics", "cl_khr_global_int32_extended_atomics",
"cl_khr_local_int32_base_atomics", "cl_khr_local_int32_base_atomics",
"cl_khr_local_int32_extended_atomics", "cl_khr_local_int32_extended_atomics",
"cl_khr_int64_base_atomics", "cl_khr_int64_base_atomics",
"cl_khr_int64_extended_atomics", "cl_khr_int64_extended_atomics",
// need to put in entries for various atomics // need to put in entries for various atomics
"cl_khr_3d_image_writes", "cl_khr_3d_image_writes",
"cl_khr_fp16", "cl_khr_fp16",
"cl_khr_fp64", "cl_khr_fp64",
NULL NULL
}; };
bool extensionsSupported[] = { bool extensionsSupported[] = {
false, //"cl_khr_byte_addressable_store", false, //"cl_khr_byte_addressable_store",
false, // need to put in entries for various atomics false, // need to put in entries for various atomics
false, // "cl_khr_global_int32_base_atomics", false, // "cl_khr_global_int32_base_atomics",
false, // "cl_khr_global_int32_extended_atomics", false, // "cl_khr_global_int32_extended_atomics",
false, // "cl_khr_local_int32_base_atomics", false, // "cl_khr_local_int32_base_atomics",
false, // "cl_khr_local_int32_extended_atomics", false, // "cl_khr_local_int32_extended_atomics",
false, // "cl_khr_int64_base_atomics", false, // "cl_khr_int64_base_atomics",
false, // "cl_khr_int64_extended_atomics", false, // "cl_khr_int64_extended_atomics",
false, //"cl_khr_3d_image_writes", false, //"cl_khr_3d_image_writes",
false, //"cl_khr_fp16", false, //"cl_khr_fp16",
false, //"cl_khr_fp64", false, //"cl_khr_fp64",
false //NULL false //NULL
}; };
int extensionIndex; int extensionIndex;
cl_platform_id platformID; cl_platform_id platformID;
cl_int err; cl_int err;
char platform_extensions[EXTENSION_NAME_BUF_SIZE]; char platform_extensions[EXTENSION_NAME_BUF_SIZE];
char device_extensions[EXTENSION_NAME_BUF_SIZE]; char device_extensions[EXTENSION_NAME_BUF_SIZE];
// Okay, so what we're going to do is just check the device indicated by // Okay, so what we're going to do is just check the device indicated by
// deviceID against the platform that includes this device // deviceID against the platform that includes this device
// pass CL_DEVICE_PLATFORM to clGetDeviceInfo // pass CL_DEVICE_PLATFORM to clGetDeviceInfo
// to get a result of type cl_platform_id // to get a result of type cl_platform_id
err = clGetDeviceInfo(deviceID, err = clGetDeviceInfo(deviceID,
CL_DEVICE_PLATFORM, CL_DEVICE_PLATFORM,
sizeof(cl_platform_id), sizeof(cl_platform_id),
(void *)(&platformID), (void *)(&platformID),
NULL); NULL);
if(err != CL_SUCCESS) if(err != CL_SUCCESS)
{ {
vlog_error("test_platform_extensions : could not get platformID from device\n"); vlog_error("test_platform_extensions : could not get platformID from device\n");
return -1; return -1;
} }
// now we grab the set of extensions specified by the platform // now we grab the set of extensions specified by the platform
err = clGetPlatformInfo(platformID, err = clGetPlatformInfo(platformID,
CL_PLATFORM_EXTENSIONS, CL_PLATFORM_EXTENSIONS,
sizeof(platform_extensions), sizeof(platform_extensions),
(void *)(&platform_extensions[0]), (void *)(&platform_extensions[0]),
NULL); NULL);
if(err != CL_SUCCESS) if(err != CL_SUCCESS)
{ {
vlog_error("test_platform_extensions : could not get extension string from platform\n"); vlog_error("test_platform_extensions : could not get extension string from platform\n");
return -1; return -1;
} }
#if PRINT_EXTENSION_INFO #if PRINT_EXTENSION_INFO
log_info("Platform extensions include \"%s\"\n\n", platform_extensions); log_info("Platform extensions include \"%s\"\n\n", platform_extensions);
#endif #endif
// here we parse the platform extensions, to look for the "important" ones // here we parse the platform extensions, to look for the "important" ones
for(extensionIndex=0; extensions[extensionIndex] != NULL; ++extensionIndex) for(extensionIndex=0; extensions[extensionIndex] != NULL; ++extensionIndex)
{ {
if(strstr(platform_extensions, extensions[extensionIndex]) != NULL) if(strstr(platform_extensions, extensions[extensionIndex]) != NULL)
{ {
// we found it // we found it
#if PRINT_EXTENSION_INFO #if PRINT_EXTENSION_INFO
log_info("Found \"%s\" in platform extensions\n", log_info("Found \"%s\" in platform extensions\n",
extensions[extensionIndex]); extensions[extensionIndex]);
#endif #endif
extensionsSupported[extensionIndex] = true; extensionsSupported[extensionIndex] = true;
} }
} }
// and then we grab the set of extensions specified by the device // and then we grab the set of extensions specified by the device
// (this can be turned into a "loop over all devices in this platform") // (this can be turned into a "loop over all devices in this platform")
err = clGetDeviceInfo(deviceID, err = clGetDeviceInfo(deviceID,
CL_DEVICE_EXTENSIONS, CL_DEVICE_EXTENSIONS,
sizeof(device_extensions), sizeof(device_extensions),
(void *)(&device_extensions[0]), (void *)(&device_extensions[0]),
NULL); NULL);
if(err != CL_SUCCESS) if(err != CL_SUCCESS)
{ {
vlog_error("test_platform_extensions : could not get extension string from device\n"); vlog_error("test_platform_extensions : could not get extension string from device\n");
return -1; return -1;
} }
#if PRINT_EXTENSION_INFO #if PRINT_EXTENSION_INFO
log_info("Device extensions include \"%s\"\n\n", device_extensions); log_info("Device extensions include \"%s\"\n\n", device_extensions);
#endif #endif
for(extensionIndex=0; extensions[extensionIndex] != NULL; ++extensionIndex) for(extensionIndex=0; extensions[extensionIndex] != NULL; ++extensionIndex)
{ {
if(extensionsSupported[extensionIndex] == false) if(extensionsSupported[extensionIndex] == false)
{ {
continue; // skip this one continue; // skip this one
} }
if(strstr(device_extensions, extensions[extensionIndex]) == NULL) if(strstr(device_extensions, extensions[extensionIndex]) == NULL)
{ {
// device does not support it // device does not support it
vlog_error("Platform supports extension \"%s\" but device does not\n", vlog_error("Platform supports extension \"%s\" but device does not\n",
extensions[extensionIndex]); extensions[extensionIndex]);
return -1; return -1;
} }
} }
return 0; return 0;
} }
int test_get_platform_ids(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) { int test_get_platform_ids(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) {
cl_platform_id platforms[16]; cl_platform_id platforms[16];
cl_uint num_platforms; cl_uint num_platforms;
char *string_returned; char *string_returned;
string_returned = (char*)malloc(8192); string_returned = (char*)malloc(8192);
int total_errors = 0; int total_errors = 0;
int err = CL_SUCCESS; int err = CL_SUCCESS;
err = clGetPlatformIDs(16, platforms, &num_platforms); err = clGetPlatformIDs(16, platforms, &num_platforms);
test_error(err, "clGetPlatformIDs failed"); test_error(err, "clGetPlatformIDs failed");
if (num_platforms <= 16) { if (num_platforms <= 16) {
// Try with NULL // Try with NULL
err = clGetPlatformIDs(num_platforms, platforms, NULL); err = clGetPlatformIDs(num_platforms, platforms, NULL);
test_error(err, "clGetPlatformIDs failed with NULL for return size"); test_error(err, "clGetPlatformIDs failed with NULL for return size");
} }
if (num_platforms < 1) { if (num_platforms < 1) {
log_error("Found 0 platforms.\n"); log_error("Found 0 platforms.\n");
return -1; return -1;
} }
log_info("Found %d platforms.\n", num_platforms); log_info("Found %d platforms.\n", num_platforms);
for (int p=0; p<(int)num_platforms; p++) { for (int p=0; p<(int)num_platforms; p++) {
cl_device_id *devices; cl_device_id *devices;
cl_uint num_devices; cl_uint num_devices;
size_t size; size_t size;
log_info("Platform %d (%p):\n", p, platforms[p]); log_info("Platform %d (%p):\n", p, platforms[p]);
memset(string_returned, 0, 8192); memset(string_returned, 0, 8192);
err = clGetPlatformInfo(platforms[p], CL_PLATFORM_PROFILE, 8192, string_returned, &size); err = clGetPlatformInfo(platforms[p], CL_PLATFORM_PROFILE, 8192, string_returned, &size);
test_error(err, "clGetPlatformInfo for CL_PLATFORM_PROFILE failed"); test_error(err, "clGetPlatformInfo for CL_PLATFORM_PROFILE failed");
log_info("\tCL_PLATFORM_PROFILE: %s\n", string_returned); log_info("\tCL_PLATFORM_PROFILE: %s\n", string_returned);
if (strlen(string_returned)+1 != size) { if (strlen(string_returned)+1 != size) {
log_error("Returned string length %zu does not equal reported one %zu.\n", strlen(string_returned)+1, size); log_error("Returned string length %zu does not equal reported one %zu.\n", strlen(string_returned)+1, size);
total_errors++; total_errors++;
} }
memset(string_returned, 0, 8192); memset(string_returned, 0, 8192);
err = clGetPlatformInfo(platforms[p], CL_PLATFORM_VERSION, 8192, string_returned, &size); err = clGetPlatformInfo(platforms[p], CL_PLATFORM_VERSION, 8192, string_returned, &size);
test_error(err, "clGetPlatformInfo for CL_PLATFORM_VERSION failed"); test_error(err, "clGetPlatformInfo for CL_PLATFORM_VERSION failed");
log_info("\tCL_PLATFORM_VERSION: %s\n", string_returned); log_info("\tCL_PLATFORM_VERSION: %s\n", string_returned);
if (strlen(string_returned)+1 != size) { if (strlen(string_returned)+1 != size) {
log_error("Returned string length %zu does not equal reported one %zu.\n", strlen(string_returned)+1, size); log_error("Returned string length %zu does not equal reported one %zu.\n", strlen(string_returned)+1, size);
total_errors++; total_errors++;
} }
memset(string_returned, 0, 8192); memset(string_returned, 0, 8192);
err = clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, 8192, string_returned, &size); err = clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, 8192, string_returned, &size);
test_error(err, "clGetPlatformInfo for CL_PLATFORM_NAME failed"); test_error(err, "clGetPlatformInfo for CL_PLATFORM_NAME failed");
log_info("\tCL_PLATFORM_NAME: %s\n", string_returned); log_info("\tCL_PLATFORM_NAME: %s\n", string_returned);
if (strlen(string_returned)+1 != size) { if (strlen(string_returned)+1 != size) {
log_error("Returned string length %zu does not equal reported one %zu.\n", strlen(string_returned)+1, size); log_error("Returned string length %zu does not equal reported one %zu.\n", strlen(string_returned)+1, size);
total_errors++; total_errors++;
} }
memset(string_returned, 0, 8192); memset(string_returned, 0, 8192);
err = clGetPlatformInfo(platforms[p], CL_PLATFORM_VENDOR, 8192, string_returned, &size); err = clGetPlatformInfo(platforms[p], CL_PLATFORM_VENDOR, 8192, string_returned, &size);
test_error(err, "clGetPlatformInfo for CL_PLATFORM_VENDOR failed"); test_error(err, "clGetPlatformInfo for CL_PLATFORM_VENDOR failed");
log_info("\tCL_PLATFORM_VENDOR: %s\n", string_returned); log_info("\tCL_PLATFORM_VENDOR: %s\n", string_returned);
if (strlen(string_returned)+1 != size) { if (strlen(string_returned)+1 != size) {
log_error("Returned string length %zu does not equal reported one %zu.\n", strlen(string_returned)+1, size); log_error("Returned string length %zu does not equal reported one %zu.\n", strlen(string_returned)+1, size);
total_errors++; total_errors++;
} }
memset(string_returned, 0, 8192); memset(string_returned, 0, 8192);
err = clGetPlatformInfo(platforms[p], CL_PLATFORM_EXTENSIONS, 8192, string_returned, &size); err = clGetPlatformInfo(platforms[p], CL_PLATFORM_EXTENSIONS, 8192, string_returned, &size);
test_error(err, "clGetPlatformInfo for CL_PLATFORM_EXTENSIONS failed"); test_error(err, "clGetPlatformInfo for CL_PLATFORM_EXTENSIONS failed");
log_info("\tCL_PLATFORM_EXTENSIONS: %s\n", string_returned); log_info("\tCL_PLATFORM_EXTENSIONS: %s\n", string_returned);
if (strlen(string_returned)+1 != size) { if (strlen(string_returned)+1 != size) {
log_error("Returned string length %zu does not equal reported one %zu.\n", strlen(string_returned)+1, size); log_error("Returned string length %zu does not equal reported one %zu.\n", strlen(string_returned)+1, size);
total_errors++; total_errors++;
} }
err = clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 0, NULL, &num_devices); err = clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 0, NULL, &num_devices);
test_error(err, "clGetDeviceIDs size failed.\n"); test_error(err, "clGetDeviceIDs size failed.\n");
devices = (cl_device_id *)malloc(num_devices*sizeof(cl_device_id)); devices = (cl_device_id *)malloc(num_devices*sizeof(cl_device_id));
memset(devices, 0, sizeof(cl_device_id)*num_devices); memset(devices, 0, sizeof(cl_device_id)*num_devices);
err = clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, num_devices, devices, NULL); err = clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, num_devices, devices, NULL);
test_error(err, "clGetDeviceIDs failed.\n"); test_error(err, "clGetDeviceIDs failed.\n");
log_info("\tPlatform has %d devices.\n", (int)num_devices); log_info("\tPlatform has %d devices.\n", (int)num_devices);
for (int d=0; d<(int)num_devices; d++) { for (int d=0; d<(int)num_devices; d++) {
size_t returned_size; size_t returned_size;
cl_platform_id returned_platform; cl_platform_id returned_platform;
cl_context context; cl_context context;
cl_context_properties properties[] = { CL_CONTEXT_PLATFORM, (cl_context_properties)platforms[p], 0 }; cl_context_properties properties[] = { CL_CONTEXT_PLATFORM, (cl_context_properties)platforms[p], 0 };
err = clGetDeviceInfo(devices[d], CL_DEVICE_PLATFORM, sizeof(cl_platform_id), &returned_platform, &returned_size); err = clGetDeviceInfo(devices[d], CL_DEVICE_PLATFORM, sizeof(cl_platform_id), &returned_platform, &returned_size);
test_error(err, "clGetDeviceInfo failed for CL_DEVICE_PLATFORM\n"); test_error(err, "clGetDeviceInfo failed for CL_DEVICE_PLATFORM\n");
if (returned_size != sizeof(cl_platform_id)) { if (returned_size != sizeof(cl_platform_id)) {
log_error("Reported return size (%zu) does not match expected size (%zu).\n", returned_size, sizeof(cl_platform_id)); log_error("Reported return size (%zu) does not match expected size (%zu).\n", returned_size, sizeof(cl_platform_id));
total_errors++; total_errors++;
} }
memset(string_returned, 0, 8192); memset(string_returned, 0, 8192);
err = clGetDeviceInfo(devices[d], CL_DEVICE_NAME, 8192, string_returned, NULL); err = clGetDeviceInfo(devices[d], CL_DEVICE_NAME, 8192, string_returned, NULL);
test_error(err, "clGetDeviceInfo failed for CL_DEVICE_NAME\n"); test_error(err, "clGetDeviceInfo failed for CL_DEVICE_NAME\n");
log_info("\t\tPlatform for device %d (%s) is %p.\n", d, string_returned, returned_platform); log_info("\t\tPlatform for device %d (%s) is %p.\n", d, string_returned, returned_platform);
log_info("\t\t\tTesting clCreateContext for the platform/device...\n"); log_info("\t\t\tTesting clCreateContext for the platform/device...\n");
// Try creating a context for the platform // Try creating a context for the platform
context = clCreateContext(properties, 1, &devices[d], NULL, NULL, &err); context = clCreateContext(properties, 1, &devices[d], NULL, NULL, &err);
test_error(err, "\t\tclCreateContext failed for device with platform properties\n"); test_error(err, "\t\tclCreateContext failed for device with platform properties\n");
memset(properties, 0, sizeof(cl_context_properties)*3); memset(properties, 0, sizeof(cl_context_properties)*3);
err = clGetContextInfo(context, CL_CONTEXT_PROPERTIES, sizeof(cl_context_properties)*3, properties, &returned_size); err = clGetContextInfo(context, CL_CONTEXT_PROPERTIES, sizeof(cl_context_properties)*3, properties, &returned_size);
test_error(err, "clGetContextInfo for CL_CONTEXT_PROPERTIES failed"); test_error(err, "clGetContextInfo for CL_CONTEXT_PROPERTIES failed");
if (returned_size != sizeof(cl_context_properties)*3) { if (returned_size != sizeof(cl_context_properties)*3) {
log_error("Invalid size returned from clGetContextInfo for CL_CONTEXT_PROPERTIES. Got %zu, expected %zu.\n", log_error("Invalid size returned from clGetContextInfo for CL_CONTEXT_PROPERTIES. Got %zu, expected %zu.\n",
returned_size, sizeof(cl_context_properties)*3); returned_size, sizeof(cl_context_properties)*3);
total_errors++; total_errors++;
} }
if (properties[0] != (cl_context_properties)CL_CONTEXT_PLATFORM || properties[1] != (cl_context_properties)platforms[p]) { if (properties[0] != (cl_context_properties)CL_CONTEXT_PLATFORM || properties[1] != (cl_context_properties)platforms[p]) {
log_error("Wrong properties returned. Expected: [%p %p], got [%p %p]\n", log_error("Wrong properties returned. Expected: [%p %p], got [%p %p]\n",
(void*)CL_CONTEXT_PLATFORM, platforms[p], (void*)properties[0], (void*)properties[1]); (void*)CL_CONTEXT_PLATFORM, platforms[p], (void*)properties[0], (void*)properties[1]);
total_errors++; total_errors++;
} }
err = clReleaseContext(context); err = clReleaseContext(context);
test_error(err, "clReleaseContext failed"); test_error(err, "clReleaseContext failed");
} }
free(devices); free(devices);
} }
free(string_returned); free(string_returned);
return total_errors; return total_errors;
} }
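test_platform_extensions looks up each extension name with strstr, which matches substrings. Extension strings are space-separated lists, so a name that is a prefix of a longer name could match by accident. The sketch below shows one way to match whole tokens only. `has_extension` is a hypothetical helper, not a harness function, and `cl_khr_fp64_extra` in the usage note is a made-up name used purely to illustrate the prefix case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true only if `name` appears in `list` as a
// complete space-delimited token, not merely as a substring.
static bool has_extension(const char *list, const char *name)
{
    assert(list != nullptr && name != nullptr);
    const std::size_t len = std::strlen(name);
    if (len == 0) {
        return false; // an empty name never matches
    }

    // Walk every substring match and accept the first one that is bounded
    // by the start/end of the list or by spaces on both sides.
    for (const char *p = std::strstr(list, name); p != nullptr;
         p = std::strstr(p + 1, name))
    {
        const bool starts_token = (p == list) || (p[-1] == ' ');
        const bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token) {
            return true;
        }
    }
    return false;
}
```

With this helper, `has_extension("cl_khr_fp64_extra", "cl_khr_fp64")` is false, whereas a plain strstr lookup on the same inputs would report a match.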

File diff suppressed because it is too large


@@ -0,0 +1,174 @@
//
// Copyright (c) 2018 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#include "testBase.h"
#include "../../test_common/harness/typeWrappers.h"
#include "../../test_common/harness/conversions.h"
#include <sstream>
#include <string>
#include <vector>
using namespace std;
/*
Tests the cl_khr_create_command_queue extension. It checks that OpenCL 1.X devices can use the clCreateCommandQueueWithPropertiesKHR function.
Depending on device capabilities, the test creates queues with NULL properties, with the CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE property,
with the CL_QUEUE_PROFILING_ENABLE property, and with both combined. A simple kernel is then executed on each queue.
*/
const char *queue_test_kernel[] = {
"__kernel void vec_cpy(__global int *src, __global int *dst)\n"
"{\n"
" int tid = get_global_id(0);\n"
"\n"
" dst[tid] = src[tid];\n"
"\n"
"}\n" };
int enqueue_kernel(cl_context context, const cl_queue_properties_khr *queue_prop_def, cl_device_id deviceID, clKernelWrapper& kernel, size_t num_elements)
{
clMemWrapper streams[2];
int error;
std::vector<int> buf(num_elements);
clCreateCommandQueueWithPropertiesKHR_fn clCreateCommandQueueWithPropertiesKHR = NULL;
cl_platform_id platform;
clEventWrapper event;
error = clGetDeviceInfo(deviceID, CL_DEVICE_PLATFORM, sizeof(cl_platform_id), &platform, NULL);
test_error(error, "clGetDeviceInfo for CL_DEVICE_PLATFORM failed");
clCreateCommandQueueWithPropertiesKHR = (clCreateCommandQueueWithPropertiesKHR_fn) clGetExtensionFunctionAddressForPlatform(platform, "clCreateCommandQueueWithPropertiesKHR");
if (clCreateCommandQueueWithPropertiesKHR == NULL)
{
log_error("ERROR: clGetExtensionFunctionAddressForPlatform failed\n");
return -1;
}
clCommandQueueWrapper queue = clCreateCommandQueueWithPropertiesKHR(context, deviceID, queue_prop_def, &error);
test_error(error, "clCreateCommandQueueWithPropertiesKHR failed");
for (int i = 0; i < num_elements; ++i)
{
buf[i] = i;
}
streams[0] = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, num_elements * sizeof(int), buf.data(), &error);
test_error( error, "clCreateBuffer failed." );
streams[1] = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR, num_elements * sizeof(int), NULL, &error);
test_error( error, "clCreateBuffer failed." );
error = clSetKernelArg(kernel, 0, sizeof(streams[0]), &streams[0]);
test_error( error, "clSetKernelArg failed." );
error = clSetKernelArg(kernel, 1, sizeof(streams[1]), &streams[1]);
test_error( error, "clSetKernelArg failed." );
error = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &num_elements, NULL, 0, NULL, &event);
test_error( error, "clEnqueueNDRangeKernel failed." );
error = clWaitForEvents(1, &event);
test_error(error, "clWaitForEvents failed.");
error = clEnqueueReadBuffer(queue, streams[1], CL_TRUE, 0, num_elements * sizeof(int), buf.data(), 0, NULL, NULL);
test_error( error, "clEnqueueReadBuffer failed." );
for (int i = 0; i < num_elements; ++i)
{
if (buf[i] != i)
{
log_error("ERROR: Incorrect vector copy result.\n");
return -1;
}
}
return 0;
}
int test_queue_properties(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{
if (num_elements <= 0)
{
num_elements = 128;
}
int error = 0;
clProgramWrapper program;
clKernelWrapper kernel;
size_t strSize;
std::string strExt(0, '\0');
cl_command_queue_properties device_props = 0;
cl_queue_properties_khr queue_prop_def[] = { CL_QUEUE_PROPERTIES, 0, 0 };
// Query extension
error = clGetDeviceInfo(deviceID, CL_DEVICE_EXTENSIONS, 0, NULL, &strSize);
test_error(error, "clGetDeviceInfo for CL_DEVICE_EXTENSIONS failed");
strExt.resize(strSize);
error = clGetDeviceInfo(deviceID, CL_DEVICE_EXTENSIONS, strExt.size(), &strExt[0], NULL);
test_error(error, "clGetDeviceInfo for CL_DEVICE_EXTENSIONS failed");
log_info("CL_DEVICE_EXTENSIONS:\n%s\n\n", strExt.c_str());
if (strExt.find("cl_khr_create_command_queue") == string::npos)
{
log_info("extension cl_khr_create_command_queue is not supported.\n");
return 0;
}
error = create_single_kernel_helper(context, &program, &kernel, 1, queue_test_kernel, "vec_cpy");
test_error(error, "create_single_kernel_helper failed");
log_info("Queue property NULL. Testing ... \n");
error = enqueue_kernel(context, NULL,deviceID, kernel, (size_t)num_elements);
test_error(error, "enqueue_kernel failed");
error = clGetDeviceInfo(deviceID, CL_DEVICE_QUEUE_PROPERTIES, sizeof(device_props), &device_props, NULL);
test_error(error, "clGetDeviceInfo for CL_DEVICE_QUEUE_PROPERTIES failed");
if (device_props & CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE)
{
log_info("Queue property CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE supported. Testing ... \n");
queue_prop_def[1] = CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE;
error = enqueue_kernel(context, queue_prop_def, deviceID, kernel, (size_t)num_elements);
test_error(error, "enqueue_kernel failed");
} else
{
log_info("Queue property CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE not supported \n");
}
if (device_props & CL_QUEUE_PROFILING_ENABLE)
{
log_info("Queue property CL_QUEUE_PROFILING_ENABLE supported. Testing ... \n");
queue_prop_def[1] = CL_QUEUE_PROFILING_ENABLE;
error = enqueue_kernel(context, queue_prop_def, deviceID, kernel, (size_t)num_elements);
test_error(error, "enqueue_kernel failed");
} else
{
log_info("Queue property CL_QUEUE_PROFILING_ENABLE not supported \n");
}
if (device_props & CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE && device_props & CL_QUEUE_PROFILING_ENABLE)
{
log_info("Queue property CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE & CL_QUEUE_PROFILING_ENABLE supported. Testing ... \n");
queue_prop_def[1] = CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE|CL_QUEUE_PROFILING_ENABLE;
error = enqueue_kernel(context, queue_prop_def, deviceID, kernel, (size_t)num_elements);
test_error(error, "enqueue_kernel failed");
}
else
{
log_info("Queue property CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE or CL_QUEUE_PROFILING_ENABLE not supported \n");
}
return 0;
}
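clEnqueueReadBuffer takes its size argument in bytes, not in elements, which matters for the read-back in enqueue_kernel. The sketch below uses memcpy as a stand-in for the blocking read to show how many int elements are actually transferred for a given byte count. `mock_read` and `elements_transferred` are hypothetical names introduced for illustration, and no OpenCL call is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for a blocking buffer read (assumption: plain memcpy, no OpenCL).
// Copies exactly `size_in_bytes` bytes from src to dst.
static void mock_read(void *dst, const void *src, std::size_t size_in_bytes)
{
    std::memcpy(dst, src, size_in_bytes);
}

// Hypothetical helper: returns how many leading int elements of the
// destination were overwritten by a read of `size_in_bytes` bytes.
static std::size_t elements_transferred(std::size_t num_elements,
                                        std::size_t size_in_bytes)
{
    assert(num_elements > 0);
    assert(size_in_bytes <= num_elements * sizeof(int));

    std::vector<int> src(num_elements, 7);  // "device" contents
    std::vector<int> dst(num_elements, -1); // host buffer, pre-filled

    mock_read(dst.data(), src.data(), size_in_bytes);

    // Count the leading elements that now hold the source value.
    std::size_t count = 0;
    while (count < num_elements && dst[count] == 7) {
        ++count;
    }
    return count;
}
```

Passing the element count as the byte count transfers only `num_elements / sizeof(int)` elements. If the host buffer already holds the expected values, the untouched tail still compares equal, so a verification loop over it can pass without the data having been read.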


@@ -1,234 +1,234 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#if !defined(_WIN32) #if !defined(_WIN32)
#include <unistd.h> #include <unistd.h>
#endif // !_WIN32 #endif // !_WIN32
// Note: According to spec, the various functions to get instance counts should return an error when passed in an object // Note: According to spec, the various functions to get instance counts should return an error when passed in an object
// that has already been released. However, the spec is out of date. If it gets re-updated to allow such action, re-enable // that has already been released. However, the spec is out of date. If it gets re-updated to allow such action, re-enable
// this define. // this define.
//#define VERIFY_AFTER_RELEASE 1 //#define VERIFY_AFTER_RELEASE 1
#define GET_QUEUE_INSTANCE_COUNT(p) numInstances = ( (err = clGetCommandQueueInfo(p, CL_QUEUE_REFERENCE_COUNT, sizeof( numInstances ), &numInstances, NULL)) == CL_SUCCESS ? numInstances : 0 ) #define GET_QUEUE_INSTANCE_COUNT(p) numInstances = ( (err = clGetCommandQueueInfo(p, CL_QUEUE_REFERENCE_COUNT, sizeof( numInstances ), &numInstances, NULL)) == CL_SUCCESS ? numInstances : 0 )
#define GET_MEM_INSTANCE_COUNT(p) numInstances = ( (err = clGetMemObjectInfo(p, CL_MEM_REFERENCE_COUNT, sizeof( numInstances ), &numInstances, NULL)) == CL_SUCCESS ? numInstances : 0 ) #define GET_MEM_INSTANCE_COUNT(p) numInstances = ( (err = clGetMemObjectInfo(p, CL_MEM_REFERENCE_COUNT, sizeof( numInstances ), &numInstances, NULL)) == CL_SUCCESS ? numInstances : 0 )
#define VERIFY_INSTANCE_COUNT(c,rightValue) if( c != rightValue ) { \ #define VERIFY_INSTANCE_COUNT(c,rightValue) if( c != rightValue ) { \
log_error( "ERROR: Instance count for test object is not valid! (should be %d, really is %d)\n", rightValue, c ); \ log_error( "ERROR: Instance count for test object is not valid! (should be %d, really is %d)\n", rightValue, c ); \
return -1; } return -1; }
int test_retain_queue_single(cl_device_id deviceID, cl_context context, cl_command_queue queueNotUsed, int num_elements) int test_retain_queue_single(cl_device_id deviceID, cl_context context, cl_command_queue queueNotUsed, int num_elements)
{ {
cl_command_queue queue; cl_command_queue queue;
cl_uint numInstances; cl_uint numInstances;
int err; int err;
/* Create a test queue */ /* Create a test queue */
queue = clCreateCommandQueue( context, deviceID, 0, &err ); queue = clCreateCommandQueue( context, deviceID, 0, &err );
test_error( err, "Unable to create command queue to test with" ); test_error( err, "Unable to create command queue to test with" );
/* Test the instance count */ /* Test the instance count */
GET_QUEUE_INSTANCE_COUNT( queue ); GET_QUEUE_INSTANCE_COUNT( queue );
test_error( err, "Unable to get queue instance count" ); test_error( err, "Unable to get queue instance count" );
VERIFY_INSTANCE_COUNT( numInstances, 1 ); VERIFY_INSTANCE_COUNT( numInstances, 1 );
/* Now release the queue */ /* Now release the queue */
clReleaseCommandQueue( queue ); clReleaseCommandQueue( queue );
#ifdef VERIFY_AFTER_RELEASE #ifdef VERIFY_AFTER_RELEASE
/* We're not allowed to get the instance count after the object has been completely released. But that's /* We're not allowed to get the instance count after the object has been completely released. But that's
exactly how we can tell the release worked--by making sure getting the instance count fails! */ exactly how we can tell the release worked--by making sure getting the instance count fails! */
GET_QUEUE_INSTANCE_COUNT( queue ); GET_QUEUE_INSTANCE_COUNT( queue );
if( err != CL_INVALID_COMMAND_QUEUE ) if( err != CL_INVALID_COMMAND_QUEUE )
{ {
print_error( err, "Command queue was not properly released" ); print_error( err, "Command queue was not properly released" );
return -1; return -1;
} }
#endif #endif
return 0; return 0;
} }
int test_retain_queue_multiple(cl_device_id deviceID, cl_context context, cl_command_queue queueNotUsed, int num_elements) int test_retain_queue_multiple(cl_device_id deviceID, cl_context context, cl_command_queue queueNotUsed, int num_elements)
{ {
cl_command_queue queue; cl_command_queue queue;
unsigned int numInstances, i; unsigned int numInstances, i;
int err; int err;
/* Create a test queue */ /* Create a test queue */
queue = clCreateCommandQueue( context, deviceID, 0, &err ); queue = clCreateCommandQueue( context, deviceID, 0, &err );
test_error( err, "Unable to create command queue to test with" ); test_error( err, "Unable to create command queue to test with" );
/* Increment 9 times, which should bring the count to 10 */ /* Increment 9 times, which should bring the count to 10 */
for( i = 0; i < 9; i++ ) for( i = 0; i < 9; i++ )
{ {
clRetainCommandQueue( queue ); clRetainCommandQueue( queue );
} }
/* Test the instance count */ /* Test the instance count */
GET_QUEUE_INSTANCE_COUNT( queue ); GET_QUEUE_INSTANCE_COUNT( queue );
test_error( err, "Unable to get queue instance count" ); test_error( err, "Unable to get queue instance count" );
VERIFY_INSTANCE_COUNT( numInstances, 10 ); VERIFY_INSTANCE_COUNT( numInstances, 10 );
/* Now release 5 times, which should take us to 5 */ /* Now release 5 times, which should take us to 5 */
for( i = 0; i < 5; i++ ) for( i = 0; i < 5; i++ )
{ {
clReleaseCommandQueue( queue ); clReleaseCommandQueue( queue );
} }
GET_QUEUE_INSTANCE_COUNT( queue ); GET_QUEUE_INSTANCE_COUNT( queue );
test_error( err, "Unable to get queue instance count" ); test_error( err, "Unable to get queue instance count" );
VERIFY_INSTANCE_COUNT( numInstances, 5 ); VERIFY_INSTANCE_COUNT( numInstances, 5 );
/* Retain again three times, which should take us to 8 */ /* Retain again three times, which should take us to 8 */
for( i = 0; i < 3; i++ ) for( i = 0; i < 3; i++ )
{ {
clRetainCommandQueue( queue ); clRetainCommandQueue( queue );
} }
GET_QUEUE_INSTANCE_COUNT( queue ); GET_QUEUE_INSTANCE_COUNT( queue );
test_error( err, "Unable to get queue instance count" ); test_error( err, "Unable to get queue instance count" );
VERIFY_INSTANCE_COUNT( numInstances, 8 ); VERIFY_INSTANCE_COUNT( numInstances, 8 );
/* Release 7 times, which should take it to 1 */ /* Release 7 times, which should take it to 1 */
for( i = 0; i < 7; i++ ) for( i = 0; i < 7; i++ )
{ {
clReleaseCommandQueue( queue ); clReleaseCommandQueue( queue );
} }
GET_QUEUE_INSTANCE_COUNT( queue ); GET_QUEUE_INSTANCE_COUNT( queue );
test_error( err, "Unable to get queue instance count" ); test_error( err, "Unable to get queue instance count" );
VERIFY_INSTANCE_COUNT( numInstances, 1 ); VERIFY_INSTANCE_COUNT( numInstances, 1 );
/* And one last one */ /* And one last one */
clReleaseCommandQueue( queue ); clReleaseCommandQueue( queue );
#ifdef VERIFY_AFTER_RELEASE #ifdef VERIFY_AFTER_RELEASE
/* We're not allowed to get the instance count after the object has been completely released. But that's /* We're not allowed to get the instance count after the object has been completely released. But that's
exactly how we can tell the release worked--by making sure getting the instance count fails! */ exactly how we can tell the release worked--by making sure getting the instance count fails! */
GET_QUEUE_INSTANCE_COUNT( queue ); GET_QUEUE_INSTANCE_COUNT( queue );
if( err != CL_INVALID_COMMAND_QUEUE ) if( err != CL_INVALID_COMMAND_QUEUE )
{ {
print_error( err, "Command queue was not properly released" ); print_error( err, "Command queue was not properly released" );
return -1; return -1;
} }
#endif #endif
return 0; return 0;
} }
int test_retain_mem_object_single(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) int test_retain_mem_object_single(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{ {
cl_mem object; cl_mem object;
cl_uint numInstances; cl_uint numInstances;
int err; int err;
/* Create a test object */ /* Create a test object */
object = clCreateBuffer( context, CL_MEM_READ_ONLY, 32, NULL, &err ); object = clCreateBuffer( context, CL_MEM_READ_ONLY, 32, NULL, &err );
test_error( err, "Unable to create buffer to test with" ); test_error( err, "Unable to create buffer to test with" );
/* Test the instance count */ /* Test the instance count */
GET_MEM_INSTANCE_COUNT( object ); GET_MEM_INSTANCE_COUNT( object );
test_error( err, "Unable to get mem object count" ); test_error( err, "Unable to get mem object count" );
VERIFY_INSTANCE_COUNT( numInstances, 1 ); VERIFY_INSTANCE_COUNT( numInstances, 1 );
/* Now release the object */ /* Now release the object */
clReleaseMemObject( object ); clReleaseMemObject( object );
#ifdef VERIFY_AFTER_RELEASE #ifdef VERIFY_AFTER_RELEASE
/* We're not allowed to get the instance count after the object has been completely released. But that's /* We're not allowed to get the instance count after the object has been completely released. But that's
exactly how we can tell the release worked--by making sure getting the instance count fails! */ exactly how we can tell the release worked--by making sure getting the instance count fails! */
GET_MEM_INSTANCE_COUNT( object ); GET_MEM_INSTANCE_COUNT( object );
if( err != CL_INVALID_MEM_OBJECT ) if( err != CL_INVALID_MEM_OBJECT )
{ {
print_error( err, "Mem object was not properly released" ); print_error( err, "Mem object was not properly released" );
return -1; return -1;
} }
#endif #endif
return 0; return 0;
} }
int test_retain_mem_object_multiple(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) int test_retain_mem_object_multiple(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{ {
cl_mem object; cl_mem object;
unsigned int numInstances, i; unsigned int numInstances, i;
int err; int err;
/* Create a test object */ /* Create a test object */
object = clCreateBuffer( context, CL_MEM_READ_ONLY, 32, NULL, &err ); object = clCreateBuffer( context, CL_MEM_READ_ONLY, 32, NULL, &err );
test_error( err, "Unable to create buffer to test with" ); test_error( err, "Unable to create buffer to test with" );
/* Increment 9 times, which should bring the count to 10 */ /* Increment 9 times, which should bring the count to 10 */
for( i = 0; i < 9; i++ ) for( i = 0; i < 9; i++ )
{ {
clRetainMemObject( object ); clRetainMemObject( object );
} }
/* Test the instance count */ /* Test the instance count */
GET_MEM_INSTANCE_COUNT( object ); GET_MEM_INSTANCE_COUNT( object );
test_error( err, "Unable to get mem object count" ); test_error( err, "Unable to get mem object count" );
VERIFY_INSTANCE_COUNT( numInstances, 10 ); VERIFY_INSTANCE_COUNT( numInstances, 10 );
/* Now release 5 times, which should take us to 5 */ /* Now release 5 times, which should take us to 5 */
for( i = 0; i < 5; i++ ) for( i = 0; i < 5; i++ )
{ {
clReleaseMemObject( object ); clReleaseMemObject( object );
} }
GET_MEM_INSTANCE_COUNT( object ); GET_MEM_INSTANCE_COUNT( object );
test_error( err, "Unable to get mem object count" ); test_error( err, "Unable to get mem object count" );
VERIFY_INSTANCE_COUNT( numInstances, 5 ); VERIFY_INSTANCE_COUNT( numInstances, 5 );
/* Retain again three times, which should take us to 8 */ /* Retain again three times, which should take us to 8 */
for( i = 0; i < 3; i++ ) for( i = 0; i < 3; i++ )
{ {
clRetainMemObject( object ); clRetainMemObject( object );
} }
GET_MEM_INSTANCE_COUNT( object ); GET_MEM_INSTANCE_COUNT( object );
test_error( err, "Unable to get mem object count" ); test_error( err, "Unable to get mem object count" );
VERIFY_INSTANCE_COUNT( numInstances, 8 ); VERIFY_INSTANCE_COUNT( numInstances, 8 );
/* Release 7 times, which should take it to 1 */ /* Release 7 times, which should take it to 1 */
for( i = 0; i < 7; i++ ) for( i = 0; i < 7; i++ )
{ {
clReleaseMemObject( object ); clReleaseMemObject( object );
} }
GET_MEM_INSTANCE_COUNT( object ); GET_MEM_INSTANCE_COUNT( object );
test_error( err, "Unable to get mem object count" ); test_error( err, "Unable to get mem object count" );
VERIFY_INSTANCE_COUNT( numInstances, 1 ); VERIFY_INSTANCE_COUNT( numInstances, 1 );
/* And one last one */ /* And one last one */
clReleaseMemObject( object ); clReleaseMemObject( object );
#ifdef VERIFY_AFTER_RELEASE #ifdef VERIFY_AFTER_RELEASE
/* We're not allowed to get the instance count after the object has been completely released. But that's /* We're not allowed to get the instance count after the object has been completely released. But that's
exactly how we can tell the release worked--by making sure getting the instance count fails! */ exactly how we can tell the release worked--by making sure getting the instance count fails! */
GET_MEM_INSTANCE_COUNT( object ); GET_MEM_INSTANCE_COUNT( object );
if( err != CL_INVALID_MEM_OBJECT ) if( err != CL_INVALID_MEM_OBJECT )
{ {
print_error( err, "Mem object was not properly released" ); print_error( err, "Mem object was not properly released" );
return -1; return -1;
} }
#endif #endif
return 0; return 0;
} }
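The expected counts in the retain/release tests (10, 5, 8, 1) follow from simple arithmetic on a reference count that starts at 1. The sketch below replays the same sequence on a mock object; `mock_object`, `mock_create`, `mock_retain`, `mock_release`, and `replay_sequence` are hypothetical names used only for illustration and do not call into OpenCL.

```c
#include <assert.h>

/* Hypothetical stand-in for an OpenCL object: only the count is modelled. */
typedef struct { unsigned int refcount; } mock_object;

/* Creating an object implies one reference held by the creator. */
mock_object mock_create(void) { mock_object o = { 1 }; return o; }

void mock_retain(mock_object *o) { o->refcount++; }

void mock_release(mock_object *o)
{
    assert(o->refcount > 0);   /* releasing a dead object is a caller bug */
    o->refcount--;
}

/* Replays the sequence used by the tests; returns the final count,
   or 0 if any intermediate count differs from the expected value. */
unsigned int replay_sequence(void)
{
    mock_object o = mock_create();
    unsigned int i;

    for (i = 0; i < 9; i++) mock_retain(&o);    /* 1 + 9 = 10 */
    if (o.refcount != 10) return 0;
    for (i = 0; i < 5; i++) mock_release(&o);   /* 10 - 5 = 5 */
    if (o.refcount != 5) return 0;
    for (i = 0; i < 3; i++) mock_retain(&o);    /* 5 + 3 = 8 */
    if (o.refcount != 8) return 0;
    for (i = 0; i < 7; i++) mock_release(&o);   /* 8 - 7 = 1 */
    return o.refcount;
}
```

One more release after this point would take the count to zero, which is the state the disabled VERIFY_AFTER_RELEASE branches were meant to probe.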


@@ -1,109 +1,109 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#if !defined(_WIN32) #if !defined(_WIN32)
#include <unistd.h> #include <unistd.h>
#endif #endif
#include "../../test_common/harness/compat.h" #include "../../test_common/harness/compat.h"
int test_release_kernel_order(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) int test_release_kernel_order(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{ {
cl_program program; cl_program program;
cl_kernel kernel; cl_kernel kernel;
int error; int error;
const char *testProgram[] = { "__kernel void sample_test(__global int *data){}" }; const char *testProgram[] = { "__kernel void sample_test(__global int *data){}" };
/* Create a test program */ /* Create a test program */
program = clCreateProgramWithSource( context, 1, testProgram, NULL, &error); program = clCreateProgramWithSource( context, 1, testProgram, NULL, &error);
test_error( error, "Unable to create program to test with" ); test_error( error, "Unable to create program to test with" );
/* Compile the program */ /* Compile the program */
error = clBuildProgram( program, 1, &deviceID, NULL, NULL, NULL ); error = clBuildProgram( program, 1, &deviceID, NULL, NULL, NULL );
test_error( error, "Unable to build sample program to test with" ); test_error( error, "Unable to build sample program to test with" );
/* And create a kernel from it */ /* And create a kernel from it */
kernel = clCreateKernel( program, "sample_test", &error ); kernel = clCreateKernel( program, "sample_test", &error );
test_error( error, "Unable to create kernel" ); test_error( error, "Unable to create kernel" );
/* Now try freeing the program first, then the kernel. If refcounts are right, this should work just fine */ /* Now try freeing the program first, then the kernel. If refcounts are right, this should work just fine */
clReleaseProgram( program ); clReleaseProgram( program );
clReleaseKernel( kernel ); clReleaseKernel( kernel );
/* If we got here fine, we succeeded. If not, well, we won't be able to return an error :) */ /* If we got here fine, we succeeded. If not, well, we won't be able to return an error :) */
return 0; return 0;
} }
const char *sample_delay_kernel[] = { const char *sample_delay_kernel[] = {
"__kernel void sample_test(__global float *src, __global int *dst)\n" "__kernel void sample_test(__global float *src, __global int *dst)\n"
"{\n" "{\n"
" int tid = get_global_id(0);\n" " int tid = get_global_id(0);\n"
" for( int i = 0; i < 1000000; i++ ); \n" " for( int i = 0; i < 1000000; i++ ); \n"
" dst[tid] = (int)src[tid];\n" " dst[tid] = (int)src[tid];\n"
"\n" "\n"
"}\n" }; "}\n" };
int test_release_during_execute( cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) int test_release_during_execute( cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{ {
int error; int error;
cl_program program; cl_program program;
cl_kernel kernel; cl_kernel kernel;
cl_mem streams[2]; cl_mem streams[2];
size_t threads[1] = { 10 }, localThreadSize; size_t threads[1] = { 10 }, localThreadSize;
/* We now need an event to test. So we'll execute a kernel to get one */ /* We now need an event to test. So we'll execute a kernel to get one */
if( create_single_kernel_helper( context, &program, &kernel, 1, sample_delay_kernel, "sample_test" ) ) if( create_single_kernel_helper( context, &program, &kernel, 1, sample_delay_kernel, "sample_test" ) )
{ {
return -1; return -1;
} }
streams[0] = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE), sizeof(cl_float) * 10, NULL, &error); streams[0] = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE), sizeof(cl_float) * 10, NULL, &error);
test_error( error, "Creating test array failed" ); test_error( error, "Creating test array failed" );
streams[1] = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE), sizeof(cl_int) * 10, NULL, &error); streams[1] = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE), sizeof(cl_int) * 10, NULL, &error);
test_error( error, "Creating test array failed" ); test_error( error, "Creating test array failed" );
/* Set the arguments */ /* Set the arguments */
error = clSetKernelArg(kernel, 0, sizeof( streams[0] ), &streams[ 0 ]); error = clSetKernelArg(kernel, 0, sizeof( streams[0] ), &streams[ 0 ]);
test_error( error, "Unable to set indexed kernel arguments" ); test_error( error, "Unable to set indexed kernel arguments" );
error = clSetKernelArg(kernel, 1, sizeof( streams[1] ), &streams[ 1 ]); error = clSetKernelArg(kernel, 1, sizeof( streams[1] ), &streams[ 1 ]);
test_error( error, "Unable to set indexed kernel arguments" ); test_error( error, "Unable to set indexed kernel arguments" );
error = get_max_common_work_group_size( context, kernel, threads[0], &localThreadSize ); error = get_max_common_work_group_size( context, kernel, threads[0], &localThreadSize );
test_error( error, "Unable to calc local thread size" ); test_error( error, "Unable to calc local thread size" );
/* Execute the kernel */ /* Execute the kernel */
error = clEnqueueNDRangeKernel( queue, kernel, 1, NULL, threads, &localThreadSize, 0, NULL, NULL ); error = clEnqueueNDRangeKernel( queue, kernel, 1, NULL, threads, &localThreadSize, 0, NULL, NULL );
test_error( error, "Unable to execute test kernel" ); test_error( error, "Unable to execute test kernel" );
/* The kernel should still be executing, but we should still be able to release it. It's not terribly /* The kernel should still be executing, but we should still be able to release it. It's not terribly
useful, but we should be able to do it, if the internal refcounting is indeed correct. */ useful, but we should be able to do it, if the internal refcounting is indeed correct. */
clReleaseMemObject( streams[ 1 ] ); clReleaseMemObject( streams[ 1 ] );
clReleaseMemObject( streams[ 0 ] ); clReleaseMemObject( streams[ 0 ] );
clReleaseKernel( kernel ); clReleaseKernel( kernel );
clReleaseProgram( program ); clReleaseProgram( program );
/* Now make sure we're really finished before we go on. */ /* Now make sure we're really finished before we go on. */
error = clFinish(queue); error = clFinish(queue);
test_error( error, "Unable to finish context."); test_error( error, "Unable to finish context.");
return 0; return 0;
} }
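test_release_during_execute relies on the runtime keeping an object alive while enqueued work still uses it, even after the application has dropped its own reference. The sketch below models that idea under an assumed design (a user count plus a pending-command count); real implementations may track this differently, and every name here (`mock_kernel`, `mock_enqueue`, `mock_release`, `mock_finish`, `release_during_execute_demo`) is hypothetical.

```c
#include <assert.h>

/* Assumed model: an object dies only when no user references remain
   and no enqueued command still depends on it. */
typedef struct {
    unsigned int user_refs;     /* references held by the application */
    unsigned int pending_cmds;  /* enqueued commands still using the object */
    int destroyed;              /* set once the object has been torn down */
} mock_kernel;

void maybe_destroy(mock_kernel *k)
{
    if (k->user_refs == 0 && k->pending_cmds == 0)
        k->destroyed = 1;
}

void mock_enqueue(mock_kernel *k) { k->pending_cmds++; }

void mock_release(mock_kernel *k)
{
    assert(k->user_refs > 0);
    k->user_refs--;
    maybe_destroy(k);
}

/* Models a blocking finish: all pending commands complete. */
void mock_finish(mock_kernel *k)
{
    k->pending_cmds = 0;
    maybe_destroy(k);
}

/* Returns 0 when the object survives the early release and is
   destroyed only after the queue drains, -1 otherwise. */
int release_during_execute_demo(void)
{
    mock_kernel k = { 1, 0, 0 };
    mock_enqueue(&k);
    mock_release(&k);              /* application lets go while work is pending */
    if (k.destroyed) return -1;    /* must still be alive here */
    mock_finish(&k);
    return k.destroyed ? 0 : -1;
}
```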


@@ -1,17 +1,17 @@
project project
: requirements : requirements
<toolset>gcc:<cflags>-xc++ <toolset>gcc:<cflags>-xc++
<toolset>msvc:<cflags>"/TP" <toolset>msvc:<cflags>"/TP"
; ;
exe test_atomics exe test_atomics
: main.c : main.c
test_atomics.c test_atomics.c
test_indexed_cases.c test_indexed_cases.c
; ;
install dist install dist
: test_atomics : test_atomics
: <variant>debug:<location>$(DIST)/debug/tests/test_conformance/atomics : <variant>debug:<location>$(DIST)/debug/tests/test_conformance/atomics
<variant>release:<location>$(DIST)/release/tests/test_conformance/atomics <variant>release:<location>$(DIST)/release/tests/test_conformance/atomics
; ;


@@ -1,44 +1,44 @@
ifdef BUILD_WITH_ATF ifdef BUILD_WITH_ATF
ATF = -framework ATF ATF = -framework ATF
USE_ATF = -DUSE_ATF USE_ATF = -DUSE_ATF
endif endif
SRCS = main.c \ SRCS = main.c \
test_atomics.cpp \ test_atomics.cpp \
test_indexed_cases.c \ test_indexed_cases.c \
../../test_common/harness/errorHelpers.c \ ../../test_common/harness/errorHelpers.c \
../../test_common/harness/threadTesting.c \ ../../test_common/harness/threadTesting.c \
../../test_common/harness/testHarness.c \ ../../test_common/harness/testHarness.c \
../../test_common/harness/mt19937.c \ ../../test_common/harness/mt19937.c \
../../test_common/harness/conversions.c \ ../../test_common/harness/conversions.c \
../../test_common/harness/kernelHelpers.c ../../test_common/harness/kernelHelpers.c
DEFINES = DEFINES =
SOURCES = $(abspath $(SRCS)) SOURCES = $(abspath $(SRCS))
LIBPATH += -L/System/Library/Frameworks/OpenCL.framework/Libraries LIBPATH += -L/System/Library/Frameworks/OpenCL.framework/Libraries
LIBPATH += -L. LIBPATH += -L.
FRAMEWORK = $(SOURCES) FRAMEWORK = $(SOURCES)
HEADERS = HEADERS =
TARGET = test_atomics TARGET = test_atomics
INCLUDE = INCLUDE =
COMPILERFLAGS = -c -Wall -g -Wshorten-64-to-32 COMPILERFLAGS = -c -Wall -g -Wshorten-64-to-32
CC = c++ CC = c++
CFLAGS = $(COMPILERFLAGS) $(RC_CFLAGS) ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE) CFLAGS = $(COMPILERFLAGS) $(RC_CFLAGS) ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE)
CXXFLAGS = $(COMPILERFLAGS) $(RC_CFLAGS) ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE) CXXFLAGS = $(COMPILERFLAGS) $(RC_CFLAGS) ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE)
LIBRARIES = -framework OpenCL -framework OpenGL -framework GLUT -framework AppKit ${ATF} LIBRARIES = -framework OpenCL -framework OpenGL -framework GLUT -framework AppKit ${ATF}
OBJECTS := ${SOURCES:.c=.o} OBJECTS := ${SOURCES:.c=.o}
OBJECTS := ${OBJECTS:.cpp=.o} OBJECTS := ${OBJECTS:.cpp=.o}
TARGETOBJECT = TARGETOBJECT =
all: $(TARGET) all: $(TARGET)
$(TARGET): $(OBJECTS) $(TARGET): $(OBJECTS)
$(CC) $(RC_CFLAGS) $(OBJECTS) -o $@ $(LIBPATH) $(LIBRARIES) $(CC) $(RC_CFLAGS) $(OBJECTS) -o $@ $(LIBPATH) $(LIBRARIES)
clean: clean:
rm -f $(TARGET) $(OBJECTS) rm -f $(TARGET) $(OBJECTS)
.DEFAULT: .DEFAULT:
@echo The target \"$@\" does not exist in Makefile. @echo The target \"$@\" does not exist in Makefile.


@@ -1,78 +1,78 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
#if !defined(_WIN32) #if !defined(_WIN32)
#include <stdbool.h> #include <stdbool.h>
#endif #endif
#include <math.h> #include <math.h>
#include <string.h> #include <string.h>
#include "procs.h" #include "procs.h"
#include "../../test_common/harness/testHarness.h" #include "../../test_common/harness/testHarness.h"
#if !defined(_WIN32) #if !defined(_WIN32)
#include <unistd.h> #include <unistd.h>
#endif #endif
basefn basefn_list[] = { basefn basefn_list[] = {
test_atomic_add, test_atomic_add,
test_atomic_sub, test_atomic_sub,
test_atomic_xchg, test_atomic_xchg,
test_atomic_min, test_atomic_min,
test_atomic_max, test_atomic_max,
test_atomic_inc, test_atomic_inc,
test_atomic_dec, test_atomic_dec,
test_atomic_cmpxchg, test_atomic_cmpxchg,
test_atomic_and, test_atomic_and,
test_atomic_or, test_atomic_or,
test_atomic_xor, test_atomic_xor,
test_atomic_add_index, test_atomic_add_index,
test_atomic_add_index_bin test_atomic_add_index_bin
}; };
const char *basefn_names[] = { const char *basefn_names[] = {
"atomic_add", "atomic_add",
"atomic_sub", "atomic_sub",
"atomic_xchg", "atomic_xchg",
"atomic_min", "atomic_min",
"atomic_max", "atomic_max",
"atomic_inc", "atomic_inc",
"atomic_dec", "atomic_dec",
"atomic_cmpxchg", "atomic_cmpxchg",
"atomic_and", "atomic_and",
"atomic_or", "atomic_or",
"atomic_xor", "atomic_xor",
"atomic_add_index", "atomic_add_index",
"atomic_add_index_bin", "atomic_add_index_bin",
"all", "all",
}; };
ct_assert((sizeof(basefn_names) / sizeof(basefn_names[0]) - 1) == (sizeof(basefn_list) / sizeof(basefn_list[0]))); ct_assert((sizeof(basefn_names) / sizeof(basefn_names[0]) - 1) == (sizeof(basefn_list) / sizeof(basefn_list[0])));
int num_fns = sizeof(basefn_names) / sizeof(char *); int num_fns = sizeof(basefn_names) / sizeof(char *);
int main(int argc, const char *argv[]) int main(int argc, const char *argv[])
{ {
return runTestHarness( argc, argv, num_fns, basefn_list, basefn_names, false, false, 0 ); return runTestHarness( argc, argv, num_fns, basefn_list, basefn_names, false, false, 0 );
} }
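In main.c the name table carries one extra trailing entry ("all") compared with the function table, and ct_assert checks that relationship at compile time; num_fns is computed from the name table, so it includes that extra entry. The sketch below reproduces the pattern with two placeholder tests. It uses a negative-array-size typedef as a portable compile-time check, which is an assumption about one way such a macro can work, not the harness's actual ct_assert definition; `test_fn`, `test_a`, `test_b`, `fn_list`, `fn_names`, and the counting helpers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*test_fn)(void);

/* Placeholder tests, assumed for illustration. */
int test_a(void) { return 0; }
int test_b(void) { return 0; }

test_fn fn_list[] = { test_a, test_b };

/* One name per function, plus a trailing "all" selector. */
const char *fn_names[] = { "a", "b", "all" };

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Compile-time check: the array size becomes -1 (a compile error)
   if the two tables drift out of sync. */
typedef char names_match_fns[(ARRAY_LEN(fn_names) - 1 == ARRAY_LEN(fn_list)) ? 1 : -1];

size_t count_tests(void) { return ARRAY_LEN(fn_list); }
size_t count_names(void) { return ARRAY_LEN(fn_names); }
```

Adding a function to `fn_list` without a matching name (or the reverse) then fails at build time rather than at run time.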


@@ -1,39 +1,39 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "../../test_common/harness/errorHelpers.h" #include "../../test_common/harness/errorHelpers.h"
#include "../../test_common/harness/kernelHelpers.h" #include "../../test_common/harness/kernelHelpers.h"
#include "../../test_common/harness/threadTesting.h" #include "../../test_common/harness/threadTesting.h"
#include "../../test_common/harness/typeWrappers.h" #include "../../test_common/harness/typeWrappers.h"
extern int create_program_and_kernel(const char *source, const char *kernel_name, cl_program *program_ret, cl_kernel *kernel_ret); extern int create_program_and_kernel(const char *source, const char *kernel_name, cl_program *program_ret, cl_kernel *kernel_ret);
extern int test_atomic_add(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_add(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_sub(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_sub(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_xchg(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_xchg(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_min(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_min(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_max(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_max(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_inc(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_inc(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_dec(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_dec(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_cmpxchg(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_cmpxchg(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_and(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_and(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_or(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_or(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_xor(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_xor(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_add_index(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_add_index(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_atomic_add_index_bin(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_atomic_add_index_bin(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);


@@ -1,36 +1,36 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#ifndef _testBase_h #ifndef _testBase_h
#define _testBase_h #define _testBase_h
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
#include <math.h> #include <math.h>
#include <string.h> #include <string.h>
#if !defined(_WIN32) #if !defined(_WIN32)
#include <stdbool.h> #include <stdbool.h>
#endif #endif
#include <sys/types.h> #include <sys/types.h>
#include <sys/stat.h> #include <sys/stat.h>
#include "procs.h" #include "procs.h"
#endif // _testBase_h #endif // _testBase_h

File diff suppressed because it is too large


@@ -1,380 +1,380 @@
// //
// Copyright (c) 2017 The Khronos Group Inc. // Copyright (c) 2017 The Khronos Group Inc.
// //
// Licensed under the Apache License, Version 2.0 (the "License"); // Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. // you may not use this file except in compliance with the License.
// You may obtain a copy of the License at // You may obtain a copy of the License at
// //
// http://www.apache.org/licenses/LICENSE-2.0 // http://www.apache.org/licenses/LICENSE-2.0
// //
// Unless required by applicable law or agreed to in writing, software // Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, // distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
#include "testBase.h" #include "testBase.h"
#include "../../test_common/harness/conversions.h" #include "../../test_common/harness/conversions.h"
extern cl_uint gRandomSeed; extern cl_uint gRandomSeed;
const char * atomic_index_source = const char * atomic_index_source =
"#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable\n" "#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable\n"
"// Counter keeps track of which index in counts we are using.\n" "// Counter keeps track of which index in counts we are using.\n"
"// We get that value, increment it, and then set that index in counts to our thread ID.\n" "// We get that value, increment it, and then set that index in counts to our thread ID.\n"
"// At the end of this we should have all thread IDs in some random location in counts\n" "// At the end of this we should have all thread IDs in some random location in counts\n"
"// exactly once. If atom_add failed then we will write over various thread IDs and we\n" "// exactly once. If atom_add failed then we will write over various thread IDs and we\n"
"// will be missing some.\n" "// will be missing some.\n"
"\n" "\n"
"__kernel void add_index_test(__global int *counter, __global int *counts) {\n" "__kernel void add_index_test(__global int *counter, __global int *counts) {\n"
" int tid = get_global_id(0);\n" " int tid = get_global_id(0);\n"
" \n" " \n"
" int counter_to_use = atom_add(counter, 1);\n" " int counter_to_use = atom_add(counter, 1);\n"
" counts[counter_to_use] = tid;\n" " counts[counter_to_use] = tid;\n"
"}"; "}";
int test_atomic_add_index(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) int test_atomic_add_index(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{ {
clProgramWrapper program; clProgramWrapper program;
clKernelWrapper kernel; clKernelWrapper kernel;
clMemWrapper counter, counters; clMemWrapper counter, counters;
size_t numGlobalThreads, numLocalThreads; size_t numGlobalThreads, numLocalThreads;
int fail = 0, succeed = 0, err; int fail = 0, succeed = 0, err;
/* Check if atomics are supported. */ /* Check if atomics are supported. */
if (!is_extension_available(deviceID, "cl_khr_global_int32_base_atomics")) { if (!is_extension_available(deviceID, "cl_khr_global_int32_base_atomics")) {
log_info("Base atomics not supported (cl_khr_global_int32_base_atomics). Skipping test.\n"); log_info("Base atomics not supported (cl_khr_global_int32_base_atomics). Skipping test.\n");
return 0; return 0;
} }
//===== add_index test //===== add_index test
// The index test replicates what the particles demo does. // The index test replicates what the particles demo does.
// It uses one memory location to keep track of the current index and then each thread // It uses one memory location to keep track of the current index and then each thread
// does an atomic add to it to get its new location. The threads then write to their // does an atomic add to it to get its new location. The threads then write to their
// assigned location. At the end we check to make sure that each thread's ID shows up // assigned location. At the end we check to make sure that each thread's ID shows up
// exactly once in the output. // exactly once in the output.
numGlobalThreads = 2048; numGlobalThreads = 2048;
if( create_single_kernel_helper( context, &program, &kernel, 1, &atomic_index_source, "add_index_test" ) ) if( create_single_kernel_helper( context, &program, &kernel, 1, &atomic_index_source, "add_index_test" ) )
return -1; return -1;
if( get_max_common_work_group_size( context, kernel, numGlobalThreads, &numLocalThreads ) ) if( get_max_common_work_group_size( context, kernel, numGlobalThreads, &numLocalThreads ) )
return -1; return -1;
log_info("Execute global_threads:%d local_threads:%d\n", log_info("Execute global_threads:%d local_threads:%d\n",
(int)numGlobalThreads, (int)numLocalThreads); (int)numGlobalThreads, (int)numLocalThreads);
// Create the counter that will keep track of where each thread writes. // Create the counter that will keep track of where each thread writes.
counter = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE), counter = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE),
sizeof(cl_int) * 1, NULL, NULL); sizeof(cl_int) * 1, NULL, NULL);
// Create the counters that will hold the results of each thread writing // Create the counters that will hold the results of each thread writing
// its ID into a (hopefully) unique location. // its ID into a (hopefully) unique location.
counters = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE), counters = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE),
sizeof(cl_int) * numGlobalThreads, NULL, NULL); sizeof(cl_int) * numGlobalThreads, NULL, NULL);
// Reset all those locations to -1 to indicate they have not been used. // Reset all those locations to -1 to indicate they have not been used.
cl_int *values = (cl_int*) malloc(sizeof(cl_int)*numGlobalThreads); cl_int *values = (cl_int*) malloc(sizeof(cl_int)*numGlobalThreads);
if (values == NULL) { if (values == NULL) {
log_error("add_index_test FAILED to allocate memory for initial values.\n"); log_error("add_index_test FAILED to allocate memory for initial values.\n");
fail = 1; succeed = -1; fail = 1; succeed = -1;
} else { } else {
memset(values, -1, sizeof(cl_int)*numGlobalThreads); memset(values, -1, sizeof(cl_int)*numGlobalThreads);
unsigned int i=0; unsigned int i=0;
for (i=0; i<numGlobalThreads; i++) for (i=0; i<numGlobalThreads; i++)
values[i] = -1; values[i] = -1;
int init=0; int init=0;
err = clEnqueueWriteBuffer(queue, counters, true, 0, numGlobalThreads*sizeof(cl_int), values, 0, NULL, NULL); err = clEnqueueWriteBuffer(queue, counters, true, 0, numGlobalThreads*sizeof(cl_int), values, 0, NULL, NULL);
err |= clEnqueueWriteBuffer(queue, counter, true, 0,1*sizeof(cl_int), &init, 0, NULL, NULL); err |= clEnqueueWriteBuffer(queue, counter, true, 0,1*sizeof(cl_int), &init, 0, NULL, NULL);
if (err) { if (err) {
log_error("add_index_test FAILED to write initial values to arrays: %d\n", err); log_error("add_index_test FAILED to write initial values to arrays: %d\n", err);
fail=1; succeed=-1; fail=1; succeed=-1;
} else { } else {
err = clSetKernelArg(kernel, 0, sizeof(counter), &counter); err = clSetKernelArg(kernel, 0, sizeof(counter), &counter);
err |= clSetKernelArg(kernel, 1, sizeof(counters), &counters); err |= clSetKernelArg(kernel, 1, sizeof(counters), &counters);
if (err) { if (err) {
log_error("add_index_test FAILED to set kernel arguments: %d\n", err); log_error("add_index_test FAILED to set kernel arguments: %d\n", err);
fail=1; succeed=-1; fail=1; succeed=-1;
} else { } else {
err = clEnqueueNDRangeKernel( queue, kernel, 1, NULL, &numGlobalThreads, &numLocalThreads, 0, NULL, NULL ); err = clEnqueueNDRangeKernel( queue, kernel, 1, NULL, &numGlobalThreads, &numLocalThreads, 0, NULL, NULL );
if (err) { if (err) {
log_error("add_index_test FAILED to execute kernel: %d\n", err); log_error("add_index_test FAILED to execute kernel: %d\n", err);
fail=1; succeed=-1; fail=1; succeed=-1;
} else { } else {
err = clEnqueueReadBuffer( queue, counters, true, 0, sizeof(cl_int)*numGlobalThreads, values, 0, NULL, NULL ); err = clEnqueueReadBuffer( queue, counters, true, 0, sizeof(cl_int)*numGlobalThreads, values, 0, NULL, NULL );
if (err) { if (err) {
log_error("add_index_test FAILED to read back results: %d\n", err); log_error("add_index_test FAILED to read back results: %d\n", err);
fail = 1; succeed=-1; fail = 1; succeed=-1;
} else { } else {
unsigned int looking_for, index; unsigned int looking_for, index;
for (looking_for=0; looking_for<numGlobalThreads; looking_for++) { for (looking_for=0; looking_for<numGlobalThreads; looking_for++) {
int instances_found=0; int instances_found=0;
for (index=0; index<numGlobalThreads; index++) { for (index=0; index<numGlobalThreads; index++) {
if (values[index]==(int)looking_for) if (values[index]==(int)looking_for)
instances_found++; instances_found++;
} }
if (instances_found != 1) { if (instances_found != 1) {
log_error("add_index_test FAILED: wrong number of instances (%d!=1) for counter %u.\n", instances_found, looking_for); log_error("add_index_test FAILED: wrong number of instances (%d!=1) for counter %u.\n", instances_found, looking_for);
fail = 1; succeed=-1; fail = 1; succeed=-1;
} }
} }
} }
} }
} }
} }
if (!fail) { if (!fail) {
log_info("add_index_test passed. Each thread used exactly one index.\n"); log_info("add_index_test passed. Each thread used exactly one index.\n");
} }
free(values); free(values);
} }
return fail; return fail;
} }
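The exactly-once check in test_atomic_add_index scans the whole output once per thread ID, which is quadratic in the number of work-items. The same property can be checked in a single pass with a per-ID occurrence counter. What follows is only a minimal host-side sketch: the helper name count_id_errors is hypothetical and is not part of the conformance suite or its harness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (an assumption for illustration, not harness code).
 * Returns how many IDs in [0, n) do NOT appear exactly once in
 * values[0..n-1], or -1 if the scratch buffer cannot be allocated.
 * Out-of-range entries (for example the -1 "unused" marker) are ignored;
 * they still show up indirectly because some ID must then be missing. */
int count_id_errors(const int *values, size_t n)
{
    assert(values != NULL || n == 0);

    /* One saturating occurrence counter per expected ID. */
    unsigned char *seen = (unsigned char *)calloc(n ? n : 1, sizeof *seen);
    if (seen == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {
        int v = values[i];
        if (v >= 0 && (size_t)v < n && seen[v] < 2)
            seen[v]++;              /* saturate at 2: "more than once" */
    }

    int errors = 0;
    for (size_t id = 0; id < n; id++) {
        if (seen[id] != 1)
            errors++;               /* missing or duplicated ID */
    }

    free(seen);
    return errors;
}
```

A result of 0 corresponds to the "each thread used exactly one index" outcome; any positive value counts the IDs that were lost or duplicated.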
const char *add_index_bin_kernel[] = { const char *add_index_bin_kernel[] = {
"#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable\n" "#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable\n"
"// This test assigns a bunch of values to bins and then tries to put them in the bins in parallel\n" "// This test assigns a bunch of values to bins and then tries to put them in the bins in parallel\n"
"// using an atomic add to keep track of the current location to write into in each bin.\n" "// using an atomic add to keep track of the current location to write into in each bin.\n"
"// This is the same as the memory update for the particles demo.\n" "// This is the same as the memory update for the particles demo.\n"
"\n" "\n"
"__kernel void add_index_bin_test(__global int *bin_counters, __global int *bins, __global int *bin_assignments, int max_counts_per_bin) {\n" "__kernel void add_index_bin_test(__global int *bin_counters, __global int *bins, __global int *bin_assignments, int max_counts_per_bin) {\n"
" int tid = get_global_id(0);\n" " int tid = get_global_id(0);\n"
"\n" "\n"
" int location = bin_assignments[tid];\n" " int location = bin_assignments[tid];\n"
" int counter = atom_add(&bin_counters[location], 1);\n" " int counter = atom_add(&bin_counters[location], 1);\n"
" bins[location*max_counts_per_bin + counter] = tid;\n" " bins[location*max_counts_per_bin + counter] = tid;\n"
"}" }; "}" };
// This test assigns a bunch of values to bins and then tries to put them in the bins in parallel // This test assigns a bunch of values to bins and then tries to put them in the bins in parallel
// using an atomic add to keep track of the current location to write into in each bin. // using an atomic add to keep track of the current location to write into in each bin.
// This is the same as the memory update for the particles demo. // This is the same as the memory update for the particles demo.
int add_index_bin_test(size_t *global_threads, cl_command_queue queue, cl_context context, MTdata d) int add_index_bin_test(size_t *global_threads, cl_command_queue queue, cl_context context, MTdata d)
{ {
int number_of_items = (int)global_threads[0]; int number_of_items = (int)global_threads[0];
size_t local_threads[1]; size_t local_threads[1];
int divisor = 12; int divisor = 12;
int number_of_bins = number_of_items/divisor; int number_of_bins = number_of_items/divisor;
int max_counts_per_bin = divisor*2; int max_counts_per_bin = divisor*2;
int fail = 0; int fail = 0;
int succeed = 0; int succeed = 0;
int err; int err;
clProgramWrapper program; clProgramWrapper program;
clKernelWrapper kernel; clKernelWrapper kernel;
// log_info("add_index_bin_test: %d items, into %d bins, with a max of %d items per bin (bins is %d long).\n", // log_info("add_index_bin_test: %d items, into %d bins, with a max of %d items per bin (bins is %d long).\n",
// number_of_items, number_of_bins, max_counts_per_bin, number_of_bins*max_counts_per_bin); // number_of_items, number_of_bins, max_counts_per_bin, number_of_bins*max_counts_per_bin);
//===== add_index_bin test //===== add_index_bin test
// The index test replicates what particles does. // The index test replicates what particles does.
err = create_single_kernel_helper(context, &program, &kernel, 1, add_index_bin_kernel, "add_index_bin_test" ); err = create_single_kernel_helper(context, &program, &kernel, 1, add_index_bin_kernel, "add_index_bin_test" );
test_error( err, "Unable to create testing kernel" ); test_error( err, "Unable to create testing kernel" );
if( get_max_common_work_group_size( context, kernel, global_threads[0], &local_threads[0] ) ) if( get_max_common_work_group_size( context, kernel, global_threads[0], &local_threads[0] ) )
return -1; return -1;
log_info("Execute global_threads:%d local_threads:%d\n", log_info("Execute global_threads:%d local_threads:%d\n",
(int)global_threads[0], (int)local_threads[0]); (int)global_threads[0], (int)local_threads[0]);
// Allocate our storage // Allocate our storage
cl_mem bin_counters = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE), cl_mem bin_counters = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE),
sizeof(cl_int) * number_of_bins, NULL, NULL); sizeof(cl_int) * number_of_bins, NULL, NULL);
cl_mem bins = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE), cl_mem bins = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_WRITE),
sizeof(cl_int) * number_of_bins*max_counts_per_bin, NULL, NULL); sizeof(cl_int) * number_of_bins*max_counts_per_bin, NULL, NULL);
cl_mem bin_assignments = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_ONLY), cl_mem bin_assignments = clCreateBuffer(context, (cl_mem_flags)(CL_MEM_READ_ONLY),
sizeof(cl_int) * number_of_items, NULL, NULL); sizeof(cl_int) * number_of_items, NULL, NULL);
if (bin_counters == NULL) { if (bin_counters == NULL) {
log_error("add_index_bin_test FAILED to allocate bin_counters.\n"); log_error("add_index_bin_test FAILED to allocate bin_counters.\n");
return -1; return -1;
} }
if (bins == NULL) { if (bins == NULL) {
log_error("add_index_bin_test FAILED to allocate bins.\n"); log_error("add_index_bin_test FAILED to allocate bins.\n");
return -1; return -1;
} }
if (bin_assignments == NULL) { if (bin_assignments == NULL) {
log_error("add_index_bin_test FAILED to allocate bin_assignments.\n"); log_error("add_index_bin_test FAILED to allocate bin_assignments.\n");
return -1; return -1;
} }
// Initialize our storage // Initialize our storage
cl_int *l_bin_counts = (cl_int*)malloc(sizeof(cl_int)*number_of_bins); cl_int *l_bin_counts = (cl_int*)malloc(sizeof(cl_int)*number_of_bins);
if (!l_bin_counts) { if (!l_bin_counts) {
log_error("add_index_bin_test FAILED to allocate initial values for bin_counters.\n"); log_error("add_index_bin_test FAILED to allocate initial values for bin_counters.\n");
return -1; return -1;
} }
int i; int i;
for (i=0; i<number_of_bins; i++) for (i=0; i<number_of_bins; i++)
l_bin_counts[i] = 0; l_bin_counts[i] = 0;
err = clEnqueueWriteBuffer(queue, bin_counters, true, 0, sizeof(cl_int)*number_of_bins, l_bin_counts, 0, NULL, NULL); err = clEnqueueWriteBuffer(queue, bin_counters, true, 0, sizeof(cl_int)*number_of_bins, l_bin_counts, 0, NULL, NULL);
if (err) { if (err) {
log_error("add_index_bin_test FAILED to set initial values for bin_counters: %d\n", err); log_error("add_index_bin_test FAILED to set initial values for bin_counters: %d\n", err);
return -1; return -1;
} }
cl_int *values = (cl_int*)malloc(sizeof(cl_int)*number_of_bins*max_counts_per_bin); cl_int *values = (cl_int*)malloc(sizeof(cl_int)*number_of_bins*max_counts_per_bin);
if (!values) { if (!values) {
log_error("add_index_bin_test FAILED to allocate initial values for bins.\n"); log_error("add_index_bin_test FAILED to allocate initial values for bins.\n");
return -1; return -1;
} }
for (i=0; i<number_of_bins*max_counts_per_bin; i++) for (i=0; i<number_of_bins*max_counts_per_bin; i++)
values[i] = -1; values[i] = -1;
err = clEnqueueWriteBuffer(queue, bins, true, 0, sizeof(cl_int)*number_of_bins*max_counts_per_bin, values, 0, NULL, NULL); err = clEnqueueWriteBuffer(queue, bins, true, 0, sizeof(cl_int)*number_of_bins*max_counts_per_bin, values, 0, NULL, NULL);
if (err) { if (err) {
log_error("add_index_bin_test FAILED to set initial values for bins: %d\n", err); log_error("add_index_bin_test FAILED to set initial values for bins: %d\n", err);
return -1; return -1;
} }
free(values); free(values);
cl_int *l_bin_assignments = (cl_int*)malloc(sizeof(cl_int)*number_of_items); cl_int *l_bin_assignments = (cl_int*)malloc(sizeof(cl_int)*number_of_items);
if (!l_bin_assignments) { if (!l_bin_assignments) {
log_error("add_index_bin_test FAILED to allocate initial values for l_bin_assignments.\n"); log_error("add_index_bin_test FAILED to allocate initial values for l_bin_assignments.\n");
return -1; return -1;
} }
for (i=0; i<number_of_items; i++) { for (i=0; i<number_of_items; i++) {
int bin = random_in_range(0, number_of_bins-1, d); int bin = random_in_range(0, number_of_bins-1, d);
while (l_bin_counts[bin] >= max_counts_per_bin) { while (l_bin_counts[bin] >= max_counts_per_bin) {
bin = random_in_range(0, number_of_bins-1, d); bin = random_in_range(0, number_of_bins-1, d);
} }
if (bin >= number_of_bins) if (bin >= number_of_bins)
log_error("add_index_bin_test internal error generating bin assignments: bin %d >= number_of_bins %d.\n", bin, number_of_bins); log_error("add_index_bin_test internal error generating bin assignments: bin %d >= number_of_bins %d.\n", bin, number_of_bins);
if (l_bin_counts[bin]+1 > max_counts_per_bin) if (l_bin_counts[bin]+1 > max_counts_per_bin)
log_error("add_index_bin_test internal error generating bin assignments: bin %d has more entries (%d) than max_counts_per_bin (%d).\n", bin, l_bin_counts[bin], max_counts_per_bin); log_error("add_index_bin_test internal error generating bin assignments: bin %d has more entries (%d) than max_counts_per_bin (%d).\n", bin, l_bin_counts[bin], max_counts_per_bin);
l_bin_counts[bin]++; l_bin_counts[bin]++;
l_bin_assignments[i] = bin; l_bin_assignments[i] = bin;
// log_info("item %d assigned to bin %d (%d items)\n", i, bin, l_bin_counts[bin]); // log_info("item %d assigned to bin %d (%d items)\n", i, bin, l_bin_counts[bin]);
} }
err = clEnqueueWriteBuffer(queue, bin_assignments, true, 0, sizeof(cl_int)*number_of_items, l_bin_assignments, 0, NULL, NULL); err = clEnqueueWriteBuffer(queue, bin_assignments, true, 0, sizeof(cl_int)*number_of_items, l_bin_assignments, 0, NULL, NULL);
if (err) { if (err) {
log_error("add_index_bin_test FAILED to set initial values for bin_assignments: %d\n", err); log_error("add_index_bin_test FAILED to set initial values for bin_assignments: %d\n", err);
return -1; return -1;
} }
// Setup the kernel // Setup the kernel
err = clSetKernelArg(kernel, 0, sizeof(bin_counters), &bin_counters); err = clSetKernelArg(kernel, 0, sizeof(bin_counters), &bin_counters);
err |= clSetKernelArg(kernel, 1, sizeof(bins), &bins); err |= clSetKernelArg(kernel, 1, sizeof(bins), &bins);
err |= clSetKernelArg(kernel, 2, sizeof(bin_assignments), &bin_assignments); err |= clSetKernelArg(kernel, 2, sizeof(bin_assignments), &bin_assignments);
err |= clSetKernelArg(kernel, 3, sizeof(max_counts_per_bin), &max_counts_per_bin); err |= clSetKernelArg(kernel, 3, sizeof(max_counts_per_bin), &max_counts_per_bin);
if (err) { if (err) {
log_error("add_index_bin_test FAILED to set kernel arguments: %d\n", err); log_error("add_index_bin_test FAILED to set kernel arguments: %d\n", err);
fail=1; succeed=-1; fail=1; succeed=-1;
return -1; return -1;
} }
err = clEnqueueNDRangeKernel( queue, kernel, 1, NULL, global_threads, local_threads, 0, NULL, NULL ); err = clEnqueueNDRangeKernel( queue, kernel, 1, NULL, global_threads, local_threads, 0, NULL, NULL );
if (err) { if (err) {
log_error("add_index_bin_test FAILED to execute kernel: %d\n", err); log_error("add_index_bin_test FAILED to execute kernel: %d\n", err);
fail=1; succeed=-1; fail=1; succeed=-1;
} }
cl_int *final_bin_assignments = (cl_int*)malloc(sizeof(cl_int)*number_of_bins*max_counts_per_bin); cl_int *final_bin_assignments = (cl_int*)malloc(sizeof(cl_int)*number_of_bins*max_counts_per_bin);
if (!final_bin_assignments) { if (!final_bin_assignments) {
log_error("add_index_bin_test FAILED to allocate initial values for final_bin_assignments.\n"); log_error("add_index_bin_test FAILED to allocate initial values for final_bin_assignments.\n");
return -1; return -1;
} }
err = clEnqueueReadBuffer( queue, bins, true, 0, sizeof(cl_int)*number_of_bins*max_counts_per_bin, final_bin_assignments, 0, NULL, NULL ); err = clEnqueueReadBuffer( queue, bins, true, 0, sizeof(cl_int)*number_of_bins*max_counts_per_bin, final_bin_assignments, 0, NULL, NULL );
if (err) { if (err) {
log_error("add_index_bin_test FAILED to read back bins: %d\n", err); log_error("add_index_bin_test FAILED to read back bins: %d\n", err);
fail = 1; succeed=-1; fail = 1; succeed=-1;
} }
cl_int *final_bin_counts = (cl_int*)malloc(sizeof(cl_int)*number_of_bins); cl_int *final_bin_counts = (cl_int*)malloc(sizeof(cl_int)*number_of_bins);
if (!final_bin_counts) { if (!final_bin_counts) {
log_error("add_index_bin_test FAILED to allocate initial values for final_bin_counts.\n"); log_error("add_index_bin_test FAILED to allocate initial values for final_bin_counts.\n");
return -1; return -1;
} }
err = clEnqueueReadBuffer( queue, bin_counters, true, 0, sizeof(cl_int)*number_of_bins, final_bin_counts, 0, NULL, NULL ); err = clEnqueueReadBuffer( queue, bin_counters, true, 0, sizeof(cl_int)*number_of_bins, final_bin_counts, 0, NULL, NULL );
if (err) { if (err) {
log_error("add_index_bin_test FAILED to read back bin_counters: %d\n", err); log_error("add_index_bin_test FAILED to read back bin_counters: %d\n", err);
fail = 1; succeed=-1; fail = 1; succeed=-1;
} }
// Verification. // Verification.
int errors=0; int errors=0;
int current_bin; int current_bin;
int search; int search;
// Print out all the contents of the bins. // Print out all the contents of the bins.
// for (current_bin=0; current_bin<number_of_bins; current_bin++) // for (current_bin=0; current_bin<number_of_bins; current_bin++)
// for (search=0; search<max_counts_per_bin; search++) // for (search=0; search<max_counts_per_bin; search++)
// log_info("[bin %d, entry %d] = %d\n", current_bin, search, final_bin_assignments[current_bin*max_counts_per_bin+search]); // log_info("[bin %d, entry %d] = %d\n", current_bin, search, final_bin_assignments[current_bin*max_counts_per_bin+search]);
// First verify that each bin holds the correct number of entries. // First verify that each bin holds the correct number of entries.
for (current_bin=0; current_bin<number_of_bins; current_bin++) { for (current_bin=0; current_bin<number_of_bins; current_bin++) {
int expected_number = l_bin_counts[current_bin]; int expected_number = l_bin_counts[current_bin];
int actual_number = final_bin_counts[current_bin]; int actual_number = final_bin_counts[current_bin];
if (expected_number != actual_number) { if (expected_number != actual_number) {
log_error("add_index_bin_test FAILED: bin %d reported %d entries when %d were expected.\n", current_bin, actual_number, expected_number); log_error("add_index_bin_test FAILED: bin %d reported %d entries when %d were expected.\n", current_bin, actual_number, expected_number);
errors++; errors++;
} }
for (search=0; search<expected_number; search++) { for (search=0; search<expected_number; search++) {
if (final_bin_assignments[current_bin*max_counts_per_bin+search] == -1) { if (final_bin_assignments[current_bin*max_counts_per_bin+search] == -1) {
log_error("add_index_bin_test FAILED: bin %d had no entry at position %d when it should have had %d entries.\n", current_bin, search, expected_number); log_error("add_index_bin_test FAILED: bin %d had no entry at position %d when it should have had %d entries.\n", current_bin, search, expected_number);
errors++; errors++;
} }
} }
for (search=expected_number; search<max_counts_per_bin; search++) { for (search=expected_number; search<max_counts_per_bin; search++) {
if (final_bin_assignments[current_bin*max_counts_per_bin+search] != -1) { if (final_bin_assignments[current_bin*max_counts_per_bin+search] != -1) {
log_error("add_index_bin_test FAILED: bin %d had an extra entry at position %d when it should have had only %d entries.\n", current_bin, search, expected_number); log_error("add_index_bin_test FAILED: bin %d had an extra entry at position %d when it should have had only %d entries.\n", current_bin, search, expected_number);
errors++; errors++;
} }
} }
} }
// Now verify that the correct ones are in each bin // Now verify that the correct ones are in each bin
int index; int index;
for (index=0; index<number_of_items; index++) { for (index=0; index<number_of_items; index++) {
int expected_bin = l_bin_assignments[index]; int expected_bin = l_bin_assignments[index];
int found_it = 0; int found_it = 0;
for (search=0; search<l_bin_counts[expected_bin]; search++) { for (search=0; search<l_bin_counts[expected_bin]; search++) {
if (final_bin_assignments[expected_bin*max_counts_per_bin+search] == index) { if (final_bin_assignments[expected_bin*max_counts_per_bin+search] == index) {
found_it = 1; found_it = 1;
} }
} }
if (found_it == 0) { if (found_it == 0) {
log_error("add_index_bin_test FAILED: did not find item %d in bin %d.\n", index, expected_bin); log_error("add_index_bin_test FAILED: did not find item %d in bin %d.\n", index, expected_bin);
errors++; errors++;
} }
} }
free(l_bin_counts); free(l_bin_counts);
free(l_bin_assignments); free(l_bin_assignments);
free(final_bin_assignments); free(final_bin_assignments);
free(final_bin_counts); free(final_bin_counts);
clReleaseMemObject(bin_counters); clReleaseMemObject(bin_counters);
clReleaseMemObject(bins); clReleaseMemObject(bins);
clReleaseMemObject(bin_assignments); clReleaseMemObject(bin_assignments);
if (errors == 0 && !fail) { if (errors == 0 && !fail) {
log_info("add_index_bin_test passed. Each item was put in the correct bin in parallel.\n"); log_info("add_index_bin_test passed. Each item was put in the correct bin in parallel.\n");
return 0; return 0;
} else { } else {
log_error("add_index_bin_test FAILED: %d errors.\n", errors); log_error("add_index_bin_test FAILED: %d errors.\n", errors);
return -1; return -1;
} }
} }
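The add_index_bin_test kernel reserves a slot in a bin by atomically incrementing that bin's counter and then writing the work-item ID at the returned offset. A host-side analogue using C11 atomics can make the pattern easier to reason about. This is a sketch under stated assumptions: place_item is a hypothetical name, atomic_fetch_add from <stdatomic.h> stands in for the OpenCL atom_add built-in, and the capacity assertion is an addition that the kernel itself does not perform.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical host-side model of one work-item of add_index_bin_test.
 * bin_counters[b] holds the next free slot of bin b;
 * bins is laid out as number_of_bins * max_counts_per_bin entries. */
void place_item(atomic_int *bin_counters, int *bins,
                const int *bin_assignments, int max_counts_per_bin, int tid)
{
    int location = bin_assignments[tid];

    /* Reserve a unique slot: fetch-add returns the value before the add. */
    int slot = atomic_fetch_add(&bin_counters[location], 1);

    /* Added guard (an assumption): the host test generates assignments so
     * that no bin receives more than max_counts_per_bin items. */
    assert(slot < max_counts_per_bin);

    bins[location * max_counts_per_bin + slot] = tid;
}
```

Because each fetch-add returns a distinct previous value per bin, concurrent callers never write to the same slot, which is the property the verification loops of the test then check on the device results.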
int test_atomic_add_index_bin(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements) int test_atomic_add_index_bin(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements)
{ {
//===== add_index_bin test //===== add_index_bin test
size_t numGlobalThreads = 2048; size_t numGlobalThreads = 2048;
int iteration=0; int iteration=0;
int err, failed = 0; int err, failed = 0;
MTdata d = init_genrand( gRandomSeed ); MTdata d = init_genrand( gRandomSeed );
/* Check if atomics are supported. */ /* Check if atomics are supported. */
if (!is_extension_available(deviceID, "cl_khr_global_int32_base_atomics")) { if (!is_extension_available(deviceID, "cl_khr_global_int32_base_atomics")) {
log_info("Base atomics not supported (cl_khr_global_int32_base_atomics). Skipping test.\n"); log_info("Base atomics not supported (cl_khr_global_int32_base_atomics). Skipping test.\n");
free_mtdata( d ); free_mtdata( d );
return 0; return 0;
} }
for(iteration=0; iteration<10; iteration++) { for(iteration=0; iteration<10; iteration++) {
log_info("add_index_bin_test with %d elements:\n", (int)numGlobalThreads); log_info("add_index_bin_test with %d elements:\n", (int)numGlobalThreads);
err = add_index_bin_test(&numGlobalThreads, queue, context, d); err = add_index_bin_test(&numGlobalThreads, queue, context, d);
if (err) { if (err) {
failed++; failed++;
break; break;
} }
numGlobalThreads*=2; numGlobalThreads*=2;
} }
free_mtdata( d ); free_mtdata( d );
return failed; return failed;
} }


@@ -1,75 +1,75 @@
project
    : requirements
      <toolset>gcc:<cflags>-xc++
      <toolset>msvc:<cflags>"/TP"
    ;
exe test_basic
    : main.c
      test_arraycopy.c
      test_arrayimagecopy3d.c
      test_arrayimagecopy.c
      test_arrayreadwrite.c
      test_astype.cpp
      test_async_copy.cpp
      test_barrier.c
      test_basic_parameter_types.c
      test_constant.c
      test_createkernelsinprogram.c
      test_enqueue_map.cpp
      test_explicit_s2v.cpp
      test_float2int.c
      test_fpmath_float2.c
      test_fpmath_float4.c
      test_fpmath_float.c
      test_hiloeo.c
      test_hostptr.c
      test_if.c
      test_imagearraycopy3d.c
      test_imagearraycopy.c
      test_imagecopy3d.c
      test_imagecopy.c
      test_imagedim.c
      test_image_multipass.c
      test_imagenpot.c
      test_image_param.c
      test_image_r8.c
      test_imagerandomcopy.c
      test_imagereadwrite3d.c
      test_imagereadwrite.c
      test_int2float.c
      test_intmath_int2.c
      test_intmath_int4.c
      test_intmath_int.c
      test_intmath_long2.c
      test_intmath_long4.c
      test_intmath_long.c
      test_local.c
      test_loop.c
      test_multireadimagemultifmt.c
      test_multireadimageonefmt.c
      test_pointercast.c
      test_readimage3d.c
      test_readimage3d_fp32.c
      test_readimage3d_int16.c
      test_readimage.c
      test_readimage_fp32.c
      test_readimage_int16.c
      test_sizeof.c
      test_vec_type_hint.c
      test_vector_creation.cpp
      test_vloadstore.c
      test_work_item_functions.cpp
      test_writeimage.c
      test_writeimage_fp32.c
      test_writeimage_int16.c
      test_numeric_constants.cpp
      test_kernel_call_kernel_function.cpp
    ;
install dist
    : test_basic
    : <variant>debug:<location>$(DIST)/debug/tests/test_conformance/basic
      <variant>release:<location>$(DIST)/release/tests/test_conformance/basic
    ;


@@ -1,94 +1,94 @@
ifdef BUILD_WITH_ATF
ATF = -framework ATF
USE_ATF = -DUSE_ATF
endif
SRCS = main.c \
	test_fpmath_float.c test_fpmath_float2.c test_fpmath_float4.c \
	test_intmath_int.c test_intmath_int2.c test_intmath_int4.c \
	test_intmath_long.c test_intmath_long2.c test_intmath_long4.c \
	test_hiloeo.c test_local.c test_local_kernel_scope.cpp test_pointercast.c \
	test_if.c test_sizeof.c test_loop.c \
	test_readimage.c test_readimage_int16.c test_readimage_fp32.c \
	test_readimage3d.c test_readimage3d_int16.c test_readimage3d_fp32.c \
	test_writeimage.c test_writeimage_int16.c test_writeimage_fp32.c \
	test_multireadimageonefmt.c test_multireadimagemultifmt.c \
	test_imagedim.c \
	test_vloadstore.c \
	test_int2float.c test_float2int.c \
	test_createkernelsinprogram.c \
	test_hostptr.c \
	test_explicit_s2v.cpp \
	test_constant.c \
	test_constant_source.cpp \
	test_image_multipass.c \
	test_imagereadwrite.c test_imagereadwrite3d.c \
	test_bufferreadwriterect.c \
	test_image_param.c \
	test_imagenpot.c \
	test_image_r8.c \
	test_barrier.c \
	test_arrayreadwrite.c \
	test_arraycopy.c \
	test_imagearraycopy.c \
	test_imagearraycopy3d.c \
	test_imagecopy.c \
	test_imagerandomcopy.c \
	test_arrayimagecopy.c \
	test_arrayimagecopy3d.c \
	test_imagecopy3d.c \
	test_enqueue_map.cpp \
	test_work_item_functions.cpp \
	test_astype.cpp \
	test_async_copy.cpp \
	test_async_strided_copy.cpp \
	test_numeric_constants.cpp \
	test_kernel_call_kernel_function.cpp \
	test_basic_parameter_types.c \
	test_vector_creation.cpp \
	test_vec_type_hint.c \
	test_preprocessors.cpp \
	test_kernel_memory_alignment.cpp \
	test_global_work_offsets.cpp \
	../../test_common/harness/errorHelpers.c \
	../../test_common/harness/threadTesting.c \
	../../test_common/harness/testHarness.c \
	../../test_common/harness/rounding_mode.c \
	../../test_common/harness/kernelHelpers.c \
	../../test_common/harness/typeWrappers.cpp \
	../../test_common/harness/imageHelpers.cpp \
	../../test_common/harness/mt19937.c \
	../../test_common/harness/conversions.c
DEFINES =
SOURCES = $(abspath $(SRCS))
LIBPATH += -L/System/Library/Frameworks/OpenCL.framework/Libraries
LIBPATH += -L.
FRAMEWORK = $(SOURCES)
HEADERS =
TARGET = test_basic
INCLUDE =
COMPILERFLAGS = -c -Wall -g -O0 -Wshorten-64-to-32
CC = c++
CFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE)
CXXFLAGS = $(COMPILERFLAGS) ${RC_CFLAGS} ${USE_ATF} $(DEFINES:%=-D%) $(INCLUDE)
LIBRARIES = -framework OpenCL -framework OpenGL -framework GLUT -framework AppKit ${ATF}
OBJECTS := ${SOURCES:.c=.o}
OBJECTS := ${OBJECTS:.cpp=.o}
TARGETOBJECT =
all: $(TARGET)
$(TARGET): $(OBJECTS)
	$(CC) $(RC_CFLAGS) $(OBJECTS) -o $@ $(LIBPATH) $(LIBRARIES)
clean:
	rm -f $(TARGET) $(OBJECTS)
.DEFAULT:
	@echo The target \"$@\" does not exist in Makefile.


@@ -1,264 +1,264 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#if !defined(_WIN32)
#include <unistd.h>
#endif
#include <stdio.h>
#include <stdlib.h>
#if !defined(_WIN32)
#include <stdbool.h>
#endif
#include <math.h>
#include <string.h>
#include "../../test_common/harness/testHarness.h"
#include "procs.h"
basefn basefn_list[] = {
    test_hostptr,
    test_fpmath_float,
    test_fpmath_float2,
    test_fpmath_float4,
    test_intmath_int,
    test_intmath_int2,
    test_intmath_int4,
    test_intmath_long,
    test_intmath_long2,
    test_intmath_long4,
    test_hiloeo,
    test_if,
    test_sizeof,
    test_loop,
    test_pointer_cast,
    test_local_arg_def,
    test_local_kernel_def,
    test_local_kernel_scope,
    test_constant,
    test_constant_source,
    test_readimage,
    test_readimage_int16,
    test_readimage_fp32,
    test_writeimage,
    test_writeimage_int16,
    test_writeimage_fp32,
    test_multireadimageonefmt,
    test_multireadimagemultifmt,
    test_image_r8,
    test_barrier,
    test_int2float,
    test_float2int,
    test_imagereadwrite,
    test_imagereadwrite3d,
    test_readimage3d,
    test_readimage3d_int16,
    test_readimage3d_fp32,
    test_bufferreadwriterect,
    test_arrayreadwrite,
    test_arraycopy,
    test_imagearraycopy,
    test_imagearraycopy3d,
    test_imagecopy,
    test_imagecopy3d,
    test_imagerandomcopy,
    test_arrayimagecopy,
    test_arrayimagecopy3d,
    test_imagenpot,
    test_vload_global,
    test_vload_local,
    test_vload_constant,
    test_vload_private,
    test_vstore_global,
    test_vstore_local,
    test_vstore_private,
    test_createkernelsinprogram,
    test_imagedim_pow2,
    test_imagedim_non_pow2,
    test_image_param,
    test_image_multipass_integer_coord,
    test_image_multipass_float_coord,
    test_explicit_s2v_bool,
    test_explicit_s2v_char,
    test_explicit_s2v_uchar,
    test_explicit_s2v_short,
    test_explicit_s2v_ushort,
    test_explicit_s2v_int,
    test_explicit_s2v_uint,
    test_explicit_s2v_long,
    test_explicit_s2v_ulong,
    test_explicit_s2v_float,
    test_explicit_s2v_double,
    test_enqueue_map_buffer,
    test_enqueue_map_image,
    test_work_item_functions,
    test_astype,
    test_async_copy_global_to_local,
    test_async_copy_local_to_global,
    test_async_strided_copy_global_to_local,
    test_async_strided_copy_local_to_global,
    test_prefetch,
    test_kernel_call_kernel_function,
    test_host_numeric_constants,
    test_kernel_numeric_constants,
    test_kernel_limit_constants,
    test_kernel_preprocessor_macros,
    test_basic_parameter_types,
    test_vector_creation,
    test_vec_type_hint,
    test_kernel_memory_alignment_local,
    test_kernel_memory_alignment_global,
    test_kernel_memory_alignment_constant,
    test_kernel_memory_alignment_private,
    test_global_work_offsets,
    test_get_global_offset
};
const char *basefn_names[] = {
    "hostptr",
    "fpmath_float",
    "fpmath_float2",
    "fpmath_float4",
    "intmath_int",
    "intmath_int2",
    "intmath_int4",
    "intmath_long",
    "intmath_long2",
    "intmath_long4",
    "hiloeo",
    "if",
    "sizeof",
    "loop",
    "pointer_cast",
    "local_arg_def",
    "local_kernel_def",
    "local_kernel_scope",
    "constant",
    "constant_source",
    "readimage",
    "readimage_int16",
    "readimage_fp32",
    "writeimage",
    "writeimage_int16",
    "writeimage_fp32",
    "mri_one",
    "mri_multiple",
    "image_r8",
    "barrier",
    "int2float",
    "float2int",
    "imagereadwrite",
    "imagereadwrite3d",
    "readimage3d",
    "readimage3d_int16",
    "readimage3d_fp32",
    "bufferreadwriterect",
    "arrayreadwrite",
    "arraycopy",
    "imagearraycopy",
    "imagearraycopy3d",
    "imagecopy",
    "imagecopy3d",
    "imagerandomcopy",
    "arrayimagecopy",
    "arrayimagecopy3d",
    "imagenpot",
    "vload_global",
    "vload_local",
    "vload_constant",
    "vload_private",
    "vstore_global",
    "vstore_local",
    "vstore_private",
    "createkernelsinprogram",
    "imagedim_pow2",
    "imagedim_non_pow2",
    "image_param",
    "image_multipass_integer_coord",
    "image_multipass_float_coord",
    "explicit_s2v_bool",
    "explicit_s2v_char",
    "explicit_s2v_uchar",
    "explicit_s2v_short",
    "explicit_s2v_ushort",
    "explicit_s2v_int",
    "explicit_s2v_uint",
    "explicit_s2v_long",
    "explicit_s2v_ulong",
    "explicit_s2v_float",
    "explicit_s2v_double",
    "enqueue_map_buffer",
    "enqueue_map_image",
    "work_item_functions",
    "astype",
    "async_copy_global_to_local",
    "async_copy_local_to_global",
    "async_strided_copy_global_to_local",
    "async_strided_copy_local_to_global",
    "prefetch",
    "kernel_call_kernel_function",
    "host_numeric_constants",
    "kernel_numeric_constants",
    "kernel_limit_constants",
    "kernel_preprocessor_macros",
    "parameter_types",
    "vector_creation",
    "vec_type_hint",
    "kernel_memory_alignment_local",
    "kernel_memory_alignment_global",
    "kernel_memory_alignment_constant",
    "kernel_memory_alignment_private",
    "global_work_offsets",
    "get_global_offset",
    "all",
};
ct_assert((sizeof(basefn_names) / sizeof(basefn_names[0]) - 1) == (sizeof(basefn_list) / sizeof(basefn_list[0])));
int num_fns = sizeof(basefn_names) / sizeof(char *);
int main(int argc, const char *argv[])
{
    int err = runTestHarness( argc, argv, num_fns, basefn_list, basefn_names, false, false, 0 );
    return err;
}
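
The `ct_assert` line above ties the two tables together at compile time: the name table must be exactly one entry longer than the function table, the extra entry being `"all"`. The sketch below illustrates that idea on a two-entry table. `STATIC_CHECK`, `ARRAY_LEN`, `fn_list`, `fn_names`, and `find_test` are hypothetical names made up for this example, and the negative-array-size trick is one common way to get a compile-time check, not necessarily how the harness defines `ct_assert`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*test_fn)(void);

static int test_a(void) { return 0; }
static int test_b(void) { return 0; }

/* Parallel tables: fn_names[i] names fn_list[i]; the trailing "all"
   entry has no matching function. */
static test_fn fn_list[] = { test_a, test_b };
static const char *fn_names[] = { "a", "b", "all" };

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical compile-time check: a false condition yields an array
   of negative size, which the compiler rejects. */
#define STATIC_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

STATIC_CHECK(names_match_functions,
             ARRAY_LEN(fn_names) - 1 == ARRAY_LEN(fn_list));

/* Returns the index of the named test, or -1 if the name has no
   matching function (including the "all" entry). */
static int find_test(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < ARRAY_LEN(fn_list); i++) {
        if (strcmp(fn_names[i], name) == 0)
            return (int)i;
    }
    return -1;
}
```

Adding a function to one table without adding its name to the other makes the check fail at build time rather than producing a mismatched lookup at run time.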


@@ -1,142 +1,142 @@
//
// Copyright (c) 2017 The Khronos Group Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#include "../../test_common/harness/kernelHelpers.h"
#include "../../test_common/harness/testHarness.h"
#include "../../test_common/harness/errorHelpers.h"
#include "../../test_common/harness/typeWrappers.h"
#include "../../test_common/harness/conversions.h"
#include "../../test_common/harness/rounding_mode.h"
extern void memset_pattern4(void *dest, const void *src_pattern, size_t bytes );
extern int test_hostptr(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_fpmath_float(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_fpmath_float2(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_fpmath_float4(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_intmath_int(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_intmath_int2(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_intmath_int4(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_intmath_long(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_intmath_long2(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_intmath_long4(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_hiloeo(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_if(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_sizeof(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_loop(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_pointer_cast(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_local_arg_def(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_local_kernel_def(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_local_kernel_scope(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_constant(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_constant_source(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_readimage(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_readimage_int16(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_readimage_fp32(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_writeimage(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_writeimage_int16(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_writeimage_fp32(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_multireadimageonefmt(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_multireadimagemultifmt(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_image_r8(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_simplebarrier(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_barrier(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_int2float(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_float2int(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_imagearraycopy(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_imagearraycopy3d(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_imagereadwrite(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_imagereadwrite3d(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_readimage3d(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_readimage3d_int16(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_readimage3d_fp32(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_bufferreadwriterect(cl_device_id device, cl_context context, cl_command_queue queue_, int num_elements);
extern int test_imagecopy(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_imagecopy3d(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_imagerandomcopy(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_arraycopy(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems);
extern int test_arrayimagecopy(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_arrayimagecopy3d(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_imagenpot(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_sampler_float(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_sampler_int(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_createkernelsinprogram(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_single_large_allocation(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_multiple_max_allocation(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_arrayreadwrite(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_imagedim_pow2(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_imagedim_non_pow2(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_image_param(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_image_multipass_integer_coord(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_image_multipass_float_coord(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_vload_global(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_vload_local(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_vload_constant(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_vload_constant(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_vload_private(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_vload_private(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_vstore_global(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_vstore_global(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_vstore_local(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_vstore_local(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_vstore_private(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_vstore_private(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_bool(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_bool(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_char(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_char(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_uchar(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_uchar(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_short(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_short(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_ushort(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_ushort(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_int(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_int(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_uint(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_uint(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_long(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_long(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_ulong(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_ulong(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_float(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_float(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_explicit_s2v_double(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_explicit_s2v_double(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_enqueue_map_buffer(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_enqueue_map_buffer(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_enqueue_map_image(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_enqueue_map_image(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_work_item_functions(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_work_item_functions(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_astype(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_astype(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_native_kernel(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_native_kernel(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_async_copy_global_to_local(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_async_copy_global_to_local(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_async_copy_local_to_global(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_async_copy_local_to_global(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_async_strided_copy_global_to_local(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_async_strided_copy_global_to_local(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_async_strided_copy_local_to_global(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_async_strided_copy_local_to_global(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_prefetch(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_prefetch(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_host_numeric_constants(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_host_numeric_constants(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_kernel_numeric_constants(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_kernel_numeric_constants(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_kernel_limit_constants(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_kernel_limit_constants(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_kernel_preprocessor_macros(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_kernel_preprocessor_macros(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_kernel_call_kernel_function(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_kernel_call_kernel_function(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_basic_parameter_types(cl_device_id device, cl_context context, cl_command_queue queue, int num_elements); extern int test_basic_parameter_types(cl_device_id device, cl_context context, cl_command_queue queue, int num_elements);
extern int test_vector_creation(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_vector_creation(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_vec_type_hint(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements); extern int test_vec_type_hint(cl_device_id deviceID, cl_context context, cl_command_queue queue, int num_elements);
extern int test_kernel_memory_alignment_local(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems ); extern int test_kernel_memory_alignment_local(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems );
extern int test_kernel_memory_alignment_global(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems ); extern int test_kernel_memory_alignment_global(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems );
extern int test_kernel_memory_alignment_constant(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems ); extern int test_kernel_memory_alignment_constant(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems );
extern int test_kernel_memory_alignment_private(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems ); extern int test_kernel_memory_alignment_private(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems );
extern int test_global_work_offsets(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems ); extern int test_global_work_offsets(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems );
extern int test_get_global_offset(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems ); extern int test_get_global_offset(cl_device_id device, cl_context context, cl_command_queue queue, int n_elems );
