Hi!
I'm curious which features of the ATI 5870 (or similar) cards are exposed in the current version of ATI's OpenCL driver. Could anybody who has such a device and the most recent ATI Stream SDK installed please post the output of the SDK sample program "CLInfo"?
Thanks & kind regards,
Markus
For 2 x 5870 in CrossFire I get this:
Number of platforms: 1
  Plaform Profile: FULL_PROFILE
  Plaform Version: OpenCL 1.0 ATI-Stream-v2.0.1
  Plaform Name: ATI Stream
  Plaform Vendor: Advanced Micro Devices, Inc.
  Plaform Extensions: cl_khr_icd

  Plaform Name: ATI Stream
Number of devices: 3
  Device Type: CL_DEVICE_TYPE_CPU
  Device ID: 4098
  Max compute units: 4
  Max work items dimensions: 3
    Max work items[0]: 1024
    Max work items[1]: 1024
    Max work items[2]: 1024
  Max work group size: 1024
  Preferred vector width char: 16
  Preferred vector width short: 8
  Preferred vector width int: 4
  Preferred vector width long: 2
  Preferred vector width float: 4
  Preferred vector width double: 0
  Max clock frequency: 3984Mhz
  Address bits: 64
  Max memeory allocation: 1073741824
  Image support: No
  Max size of kernel argument: 4096
  Alignment (bits) of base address: 32768
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: No
    Round to +ve and infinity: No
    IEEE754-2008 fused multiply-add: No
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 65536
  Global memory size: 3221225472
  Constant buffer size: 65536
  Max number of constant args: 8
  Local memory type: Global
  Local memory size: 32768
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue properties:
    Out-of-Order: No
    Profiling : Yes
  Platform ID: 000000000100F598
  Name: Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz
  Vendor: GenuineIntel
  Driver version: 1.0
  Profile: FULL_PROFILE
  Version: OpenCL 1.0 ATI-Stream-v2.0.1
  Extensions: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store

  Device Type: CL_DEVICE_TYPE_GPU
  Device ID: 4098
  Max compute units: 20
  Max work items dimensions: 3
    Max work items[0]: 256
    Max work items[1]: 256
    Max work items[2]: 256
  Max work group size: 256
  Preferred vector width char: 16
  Preferred vector width short: 8
  Preferred vector width int: 4
  Preferred vector width long: 2
  Preferred vector width float: 4
  Preferred vector width double: 0
  Max clock frequency: 875Mhz
  Address bits: 32
  Max memeory allocation: 268435456
  Image support: No
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 4096
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: No
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: No
    Round to +ve and infinity: No
    IEEE754-2008 fused multiply-add: No
  Cache type: None
  Cache line size: 0
  Cache size: 0
  Global memory size: 268435456
  Constant buffer size: 65536
  Max number of constant args: 8
  Local memory type: Scratchpad
  Local memory size: 32768
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue properties:
    Out-of-Order: No
    Profiling : Yes
  Platform ID: 000000000100F598
  Name: Cypress
  Vendor: Advanced Micro Devices, Inc.
  Driver version: CAL 1.4.556
  Profile: FULL_PROFILE
  Version: OpenCL 1.0 ATI-Stream-v2.0.1
  Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics

  Device Type: CL_DEVICE_TYPE_GPU
  Device ID: 4098
  Max compute units: 20
  Max work items dimensions: 3
    Max work items[0]: 256
    Max work items[1]: 256
    Max work items[2]: 256
  Max work group size: 256
  Preferred vector width char: 16
  Preferred vector width short: 8
  Preferred vector width int: 4
  Preferred vector width long: 2
  Preferred vector width float: 4
  Preferred vector width double: 0
  Max clock frequency: 875Mhz
  Address bits: 32
  Max memeory allocation: 268435456
  Image support: No
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 4096
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: No
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: No
    Round to +ve and infinity: No
    IEEE754-2008 fused multiply-add: No
  Cache type: None
  Cache line size: 0
  Cache size: 0
  Global memory size: 268435456
  Constant buffer size: 65536
  Max number of constant args: 8
  Local memory type: Scratchpad
  Local memory size: 32768
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue properties:
    Out-of-Order: No
    Profiling : Yes
  Platform ID: 000000000100F598
  Name: Cypress
  Vendor: Advanced Micro Devices, Inc.
  Driver version: CAL 1.4.556
  Profile: FULL_PROFILE
  Version: OpenCL 1.0 ATI-Stream-v2.0.1
  Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
And for 2 cards connected by CrossFire bridges but with CrossFire disabled in CCC (Catalyst Control Center), the output is:
Number of platforms: 1
  Plaform Profile: FULL_PROFILE
  Plaform Version: OpenCL 1.0 ATI-Stream-v2.0.1
  Plaform Name: ATI Stream
  Plaform Vendor: Advanced Micro Devices, Inc.
  Plaform Extensions: cl_khr_icd

  Plaform Name: ATI Stream
Number of devices: 2
  Device Type: CL_DEVICE_TYPE_CPU
  Device ID: 4098
  Max compute units: 4
  Max work items dimensions: 3
    Max work items[0]: 1024
    Max work items[1]: 1024
    Max work items[2]: 1024
  Max work group size: 1024
  Preferred vector width char: 16
  Preferred vector width short: 8
  Preferred vector width int: 4
  Preferred vector width long: 2
  Preferred vector width float: 4
  Preferred vector width double: 0
  Max clock frequency: 3984Mhz
  Address bits: 64
  Max memeory allocation: 1073741824
  Image support: No
  Max size of kernel argument: 4096
  Alignment (bits) of base address: 32768
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: No
    Round to +ve and infinity: No
    IEEE754-2008 fused multiply-add: No
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 65536
  Global memory size: 3221225472
  Constant buffer size: 65536
  Max number of constant args: 8
  Local memory type: Global
  Local memory size: 32768
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue properties:
    Out-of-Order: No
    Profiling : Yes
  Platform ID: 000000000106F598
  Name: Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz
  Vendor: GenuineIntel
  Driver version: 1.0
  Profile: FULL_PROFILE
  Version: OpenCL 1.0 ATI-Stream-v2.0.1
  Extensions: cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store

  Device Type: CL_DEVICE_TYPE_GPU
  Device ID: 4098
  Max compute units: 20
  Max work items dimensions: 3
    Max work items[0]: 256
    Max work items[1]: 256
    Max work items[2]: 256
  Max work group size: 256
  Preferred vector width char: 16
  Preferred vector width short: 8
  Preferred vector width int: 4
  Preferred vector width long: 2
  Preferred vector width float: 4
  Preferred vector width double: 0
  Max clock frequency: 875Mhz
  Address bits: 32
  Max memeory allocation: 268435456
  Image support: No
  Max size of kernel argument: 1024
  Alignment (bits) of base address: 4096
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: No
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: No
    Round to +ve and infinity: No
    IEEE754-2008 fused multiply-add: No
  Cache type: None
  Cache line size: 0
  Cache size: 0
  Global memory size: 268435456
  Constant buffer size: 65536
  Max number of constant args: 8
  Local memory type: Scratchpad
  Local memory size: 32768
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: No
  Queue properties:
    Out-of-Order: No
    Profiling : Yes
  Platform ID: 000000000106F598
  Name: Cypress
  Vendor: Advanced Micro Devices, Inc.
  Driver version: CAL 1.4.556
  Profile: FULL_PROFILE
  Version: OpenCL 1.0 ATI-Stream-v2.0.1
  Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
Thanks a lot, your feedback is very helpful for me! I'm surprised, however, that it says "Image support: No". Even without owning one, I am very sure that the HD5870 has at least one texture unit :-), so why is it not exposed to OpenCL? Is this an issue of the current driver implementation which is going to be fixed soon? I would be glad if somebody (maybe from ATI) could comment on that!
Thanks & kind regards,
Markus
Thanks for this information! Will it work with the current ATI Stream SDK release (2.01), or will there be a software update, too? Is there an approximate timeline for these things?
Kind regards,
Markus
SDKs are released roughly every 2 months, so I think a new SDK is close.
In the meantime you can try setting GPU_IMAGES_SUPPORT=1 to get partially supported images. By partially I mean that only the *_imagef() functions work.
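As a rough sketch of the hint above, the variable can be set for a single program launch instead of system-wide. This assumes the OpenCL runtime reads GPU_IMAGES_SUPPORT from the process environment when it starts; the helper names and the "CLInfo" command on PATH are hypothetical, and it is not confirmed here that CLInfo will then report "Image support: Yes".

```python
import os
import subprocess


def env_with_partial_images(base_env=None):
    """Return a copy of the environment with GPU_IMAGES_SUPPORT=1 added.

    The original mapping is left untouched, so the setting only affects
    processes that are launched with the returned environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GPU_IMAGES_SUPPORT"] = "1"
    return env


def run_with_partial_images(command):
    """Launch `command` (a list of strings) with the variable set."""
    return subprocess.run(command, env=env_with_partial_images(), check=False)


if __name__ == "__main__":
    # Hypothetical usage, assuming the SDK sample is on PATH:
    # run_with_partial_images(["CLInfo"])
    print(env_with_partial_images({})["GPU_IMAGES_SUPPORT"])
```

Setting the variable per launch keeps the experimental behaviour away from other OpenCL programs on the same machine.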
Very interesting hint! Is there a similar hack to get partial image support on an ATI FirePro V7750? Will future versions of ATI's OpenCL driver fully support images on the V7750, or is a more recent card required for this?
Thanks & kind regards,
Markus
Hi,
If the 5870 has either 1 or 2 GB, why does CLInfo claim that the global memory size is 268435456 bytes, which is about 268 MB (256 MiB)?
Mine is 2 GB and I get 536870912 bytes (about 537 MB, i.e. 512 MiB) from CLInfo.
Also, I tried running a calculation that requires about 800 MB of memory and it failed with CL_MEM_OBJECT_ALLOCATION_FAILURE. Does anyone have any explanation for this?
Thanks!
This is a limitation which will be removed in the near future.
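To make the numbers in the question concrete, here is a small arithmetic sketch. It only converts the sizes quoted in this thread and compares the 800 MB request against the reported global memory size. That the runtime rejects a request for exactly this reason is an assumption, and the function names are made up for illustration.

```python
MB = 10**6   # decimal megabyte
MIB = 2**20  # binary mebibyte


def to_mb(n_bytes):
    """Size in decimal megabytes."""
    return n_bytes / MB


def to_mib(n_bytes):
    """Size in binary mebibytes."""
    return n_bytes / MIB


def fits(request_bytes, reported_global_bytes):
    """True if the request is no larger than what the driver reports."""
    return request_bytes <= reported_global_bytes


# Values quoted in this thread.
reported = {
    "first CLInfo listing": 268435456,
    "2 GB card": 536870912,
}
request = 800 * MB  # the calculation that failed

for label, size in reported.items():
    print(f"{label}: {to_mb(size):.1f} MB = {to_mib(size):.0f} MiB, "
          f"800 MB request fits: {fits(request, size)}")
```

Both reported sizes are exact powers of two (256 MiB and 512 MiB), and the 800 MB request exceeds either one, which is consistent with the allocation failure described above.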