Hi, I recently upgraded to a newer card, mainly so that I could develop with OpenCL 2.0. I installed the latest Catalyst driver, the SDK 3.0 Beta, and everything else I could find that is needed for OpenCL 2.0. When I run the samples, such as BinarySearch_DeviceSideEnqueue, I get the following error message:
Warning!!! Device-Side Kernel Enqueue Feature not supported in 1.x platform , fallback to openCL 1.2 Features
Change the flags in the BinarySearchDeviceSideEnqueue_oclflag.txt to cl-std=CL1.2 -g
When I run clinfo, I get the output below, which reports the platform as OpenCL 2.0 but both devices (the Hawaii GPU and the CPU) as OpenCL 1.2. I am at a loss: I have tried everything I could think of and read every FAQ and install page.
Thanks!
Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.0 AMD-APP (1729.3)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

  Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
  Device Type: CL_DEVICE_TYPE_GPU
  Vendor ID: 1002h
  Board name: DIR:67B1 RID:80
  Device Topology: PCI[ B#1, D#0, F#0 ]
  Max compute units: 40
  Max work items dimensions: 3
    Max work items[0]: 256
    Max work items[1]: 256
    Max work items[2]: 256
  Max work group size: 256
  Preferred vector width char: 4
  Preferred vector width short: 2
  Preferred vector width int: 1
  Preferred vector width long: 1
  Preferred vector width float: 1
  Preferred vector width double: 1
  Native vector width char: 4
  Native vector width short: 2
  Native vector width int: 1
  Native vector width long: 1
  Native vector width float: 1
  Native vector width double: 1
  Max clock frequency: 1010Mhz
  Address bits: 32
  Max memory allocation: 2558263296
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 8
  Max image 2D width: 16384
  Max image 2D height: 16384
  Max image 3D width: 2048
  Max image 3D height: 2048
  Max image 3D depth: 2048
  Max samplers within kernel: 16
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 2048
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: No
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 16384
  Global memory size: 3221225472
  Constant buffer size: 65536
  Max number of constant args: 8
  Local memory type: Scratchpad
  Local memory size: 32768
  Kernel Preferred work group size multiple: 64
  Error correction support: 0
  Unified memory for Host and Device: 0
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue properties:
    Out-of-Order: No
    Profiling : Yes
  Platform ID: 0xb736e048
  Name: Hawaii
  Vendor: Advanced Micro Devices, Inc.
  Device OpenCL C version: OpenCL C 1.2
  Driver version: 1729.3 (VM)
  Profile: FULL_PROFILE
  Version: OpenCL 1.2 AMD-APP (1729.3)
  Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event

  Device Type: CL_DEVICE_TYPE_CPU
  Vendor ID: 1002h
  Board name:
  Max compute units: 4
  Max work items dimensions: 3
    Max work items[0]: 1024
    Max work items[1]: 1024
    Max work items[2]: 1024
  Max work group size: 1024
  Preferred vector width char: 16
  Preferred vector width short: 8
  Preferred vector width int: 4
  Preferred vector width long: 2
  Preferred vector width float: 4
  Preferred vector width double: 2
  Native vector width char: 16
  Native vector width short: 8
  Native vector width int: 4
  Native vector width long: 2
  Native vector width float: 4
  Native vector width double: 2
  Max clock frequency: 2666Mhz
  Address bits: 32
  Max memory allocation: 1073741824
  Image support: Yes
  Max number of images read arguments: 128
  Max number of images write arguments: 64
  Max image 2D width: 8192
  Max image 2D height: 8192
  Max image 3D width: 2048
  Max image 3D height: 2048
  Max image 3D depth: 2048
  Max samplers within kernel: 16
  Max size of kernel argument: 4096
  Alignment (bits) of base address: 1024
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: Yes
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 32768
  Global memory size: 3758096384
  Constant buffer size: 65536
  Max number of constant args: 8
  Local memory type: Global
  Local memory size: 32768
  Kernel Preferred work group size multiple: 1
  Error correction support: 0
  Unified memory for Host and Device: 1
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: Yes
  Queue properties:
    Out-of-Order: No
    Profiling : Yes
  Platform ID: 0xb736e048
  Name: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
  Vendor: GenuineIntel
  Device OpenCL C version: OpenCL C 1.2
  Driver version: 1729.3 (sse2)
  Profile: FULL_PROFILE
  Version: OpenCL 1.2 AMD-APP (1729.3)
  Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_spir cl_khr_gl_event
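
To make the mismatch in the listing above explicit, here is a small sketch that compares the version strings. It assumes only the "OpenCL <major>.<minor> <vendor-specific information>" layout that the OpenCL specification requires for platform and device version strings. The helper names `parse_cl_version` and `supports` are hypothetical (they are not part of any SDK), and the strings are copied from the clinfo output. The sample's warning is consistent with a per-device check of this kind, but how the sample actually decides is an assumption on my part.

```python
import re

# Layout required by the OpenCL specification for version strings:
# "OpenCL <major>.<minor> <vendor-specific information>"
_VERSION_RE = re.compile(r"^OpenCL (\d+)\.(\d+)")


def parse_cl_version(version_string):
    """Return (major, minor) parsed from an OpenCL version string."""
    match = _VERSION_RE.match(version_string.strip())
    if match is None:
        raise ValueError(f"not an OpenCL version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports(version_string, required=(2, 0)):
    """True if the reported version is at least `required` (hypothetical helper)."""
    return parse_cl_version(version_string) >= required


# Values copied from the clinfo listing.
platform_version = "OpenCL 2.0 AMD-APP (1729.3)"
device_versions = {
    "Hawaii": "OpenCL 1.2 AMD-APP (1729.3)",
    "Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz": "OpenCL 1.2 AMD-APP (1729.3)",
}

print("platform reports 2.0 or later:", supports(platform_version))
for name, version in device_versions.items():
    print(f"{name} reports 2.0 or later:", supports(version))
```

With these strings the platform passes the check while both devices fail it, which matches what I am seeing: the platform version alone does not mean a given device exposes OpenCL 2.0 features.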