
clAmdBlasTune errors

Question asked by rahulgarg on Apr 3, 2012
Latest reply on Apr 13, 2012 by solver

I tried running clAmdBlasTune, but got strange errors.

Platform: Linux (Ubuntu 11.10) 64-bit

clAmdBlas version: 1.6.236

 

The errors look something like this:

/tmp/OCLAlZ3aw.cl(89): warning: variable "ax" was declared but never referenced

              const uint ax = k / 0;

                         ^

 

3 errors detected in the compilation of "/tmp/OCLAlZ3aw.cl".

 

Internal error: clc compiler invocation failed.

 

========================================================

 

An internal kernel build error occurred!
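
For anyone triaging a longer log of this kind, the diagnostics can be pulled out with a short script. This is only a sketch: it assumes every diagnostic follows the `file(line): level: message` shape seen in the excerpt above, and `parse_diagnostics` is a hypothetical helper name, not part of clAmdBlas or the AMD APP SDK.

```python
import re

# Assumed diagnostic shape, taken from the excerpt above:
#   /tmp/OCLAlZ3aw.cl(89): warning: <message>
DIAG = re.compile(
    r'^(?P<file>\S+)\((?P<line>\d+)\):\s*'
    r'(?P<level>warning|error):\s*(?P<msg>.*)$'
)

def parse_diagnostics(log):
    """Return (file, line, level, message) tuples for each diagnostic line in a build log."""
    found = []
    for raw in log.splitlines():
        m = DIAG.match(raw.strip())
        if m:
            found.append(
                (m.group('file'), int(m.group('line')), m.group('level'), m.group('msg'))
            )
    return found

# Sample text copied from the log excerpt in this post.
sample = '''/tmp/OCLAlZ3aw.cl(89): warning: variable "ax" was declared but never referenced
              const uint ax = k / 0;
3 errors detected in the compilation of "/tmp/OCLAlZ3aw.cl".'''

for entry in parse_diagnostics(sample):
    print(entry)
```

Source lines and the summary line are skipped because they do not match the assumed pattern; only the warning on line 89 is reported for the sample.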

 

 

 

 

Output of clinfo:

Number of platforms:                         1
  Platform Profile:                          FULL_PROFILE
  Platform Version:                          OpenCL 1.1 AMD-APP (851.4)
  Platform Name:                             AMD Accelerated Parallel Processing
  Platform Vendor:                           Advanced Micro Devices, Inc.
  Platform Extensions:                       cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

 

 

  Platform Name:                             AMD Accelerated Parallel Processing
Number of devices:                           2
  Device Type:                               CL_DEVICE_TYPE_GPU
  Device ID:                                 4098
  Board name:                                ATI Radeon HD 5800 Series
  Device Topology:                           PCI[ B#2, D#0, F#0 ]
  Max compute units:                         18
  Max work items dimensions:                 3
    Max work items[0]:                       256
    Max work items[1]:                       256
    Max work items[2]:                       256
  Max work group size:                       256
  Preferred vector width char:               16
  Preferred vector width short:              8
  Preferred vector width int:                4
  Preferred vector width long:               2
  Preferred vector width float:              4
  Preferred vector width double:             2
  Native vector width char:                  16
  Native vector width short:                 8
  Native vector width int:                   4
  Native vector width long:                  2
  Native vector width float:                 4
  Native vector width double:                2
  Max clock frequency:                       599Mhz
  Address bits:                              32
  Max memory allocation:                     134217728
  Image support:                             Yes
  Max number of images read arguments:       128
  Max number of images write arguments:      8
  Max image 2D width:                        8192
  Max image 2D height:                       8192
  Max image 3D width:                        2048
  Max image 3D height:                       2048
  Max image 3D depth:                        2048
  Max samplers within kernel:                16
  Max size of kernel argument:               1024
  Alignment (bits) of base address:          2048
  Minimum alignment (bytes) for any datatype:128

  Single precision floating point capability

    Denorms:                                 No
    Quiet NaNs:                              Yes
    Round to nearest even:                   Yes
    Round to zero:                           Yes
    Round to +ve and infinity:               Yes
    IEEE754-2008 fused multiply-add:         Yes
  Cache type:                                None
  Cache line size:                           0
  Cache size:                                0
  Global memory size:                        536870912
  Constant buffer size:                      65536
  Max number of constant args:               8
  Local memory type:                         Scratchpad
  Local memory size:                         32768
  Kernel Preferred work group size multiple: 64
  Error correction support:                  0
  Unified memory for Host and Device:        0
  Profiling timer resolution:                1
  Device endianess:                          Little
  Available:                                 Yes
  Compiler available:                        Yes
  Execution capabilities:                          
    Execute OpenCL kernels:                  Yes
    Execute native function:                 No
  Queue properties:                        
    Out-of-Order:                            No
    Profiling :                              Yes
  Platform ID:                               0x7faa4267a100
  Name:                                      Cypress
  Vendor:                                    Advanced Micro Devices, Inc.
  Device OpenCL C version:                   OpenCL C 1.1
  Driver version:                            CAL 1.4.1664
  Profile:                                   FULL_PROFILE
  Version:                                   OpenCL 1.1 AMD-APP (851.4)
  Extensions:                                cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt

 

 

  Device Type:                               CL_DEVICE_TYPE_CPU
  Device ID:                                 4098
  Board name:                              
  Max compute units:                         4
  Max work items dimensions:                 3
    Max work items[0]:                       1024
    Max work items[1]:                       1024
    Max work items[2]:                       1024
  Max work group size:                       1024
  Preferred vector width char:               16
  Preferred vector width short:              8
  Preferred vector width int:                4
  Preferred vector width long:               2
  Preferred vector width float:              4
  Preferred vector width double:             0
  Native vector width char:                  16
  Native vector width short:                 8
  Native vector width int:                   4
  Native vector width long:                  2
  Native vector width float:                 4
  Native vector width double:                0
  Max clock frequency:                       2800Mhz
  Address bits:                              64
  Max memory allocation:                     2147483648
  Image support:                             Yes
  Max number of images read arguments:       128
  Max number of images write arguments:      8
  Max image 2D width:                        8192
  Max image 2D height:                       8192
  Max image 3D width:                        2048
  Max image 3D height:                       2048
  Max image 3D depth:                        2048
  Max samplers within kernel:                16
  Max size of kernel argument:               4096
  Alignment (bits) of base address:          1024
  Minimum alignment (bytes) for any datatype:128

  Single precision floating point capability

    Denorms:                                 Yes
    Quiet NaNs:                              Yes
    Round to nearest even:                   Yes
    Round to zero:                           Yes
    Round to +ve and infinity:               Yes
    IEEE754-2008 fused multiply-add:         Yes
  Cache type:                                Read/Write
  Cache line size:                           64
  Cache size:                                65536
  Global memory size:                        7860842496
  Constant buffer size:                      65536
  Max number of constant args:               8
  Local memory type:                         Global
  Local memory size:                         32768
  Kernel Preferred work group size multiple: 1
  Error correction support:                  0
  Unified memory for Host and Device:        1
  Profiling timer resolution:                1
  Device endianess:                          Little
  Available:                                 Yes
  Compiler available:                        Yes
  Execution capabilities:                          
    Execute OpenCL kernels:                  Yes
    Execute native function:                 Yes
  Queue properties:                        
    Out-of-Order:                            No
    Profiling :                              Yes
  Platform ID:                               0x7faa4267a100
  Name:                                      AMD Phenom(tm) II X4 925 Processor
  Vendor:                                    AuthenticAMD
  Device OpenCL C version:                   OpenCL C 1.1
  Driver version:                            2.0
  Profile:                                   FULL_PROFILE
  Version:                                   OpenCL 1.1 AMD-APP (851.4)
  Extensions:                                cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt
