rwcrosby
Journeyman III

Getting "Failed to generate profile result" from CodeXL 1.6

I've just started trying to profile my application using CodeXL on Ubuntu 14.10. Regardless of whether I run the GUI or the command-line version, I get "Failed to generate profile result /home/rwc/somefile.atp". From what I've read, there was an issue with the version of libstdc++, and the suggestion was to try sprofileRun. sprofileRun gives the same result as sprofile (and the GUI), and I verified that my libstdc++ is current. Running a CL info program, I get:

rwc@sudoku:~/Dropbox/GPU/OpenCL/CLInfo $ /opt/AMD/CodeXL_1.6-7247/x86_64/sprofileRun -t CLInfo.posix

/opt/AMD/CodeXL_1.6-7247/x86_64/sprofileRun

AMD CodeXL GPU Profiler V3.1.7247 is Enabled

Found 1 platform(s).

platform[0x7fd6df55fe00]: profile: FULL_PROFILE

platform[0x7fd6df55fe00]: version: OpenCL 1.2 AMD-APP (1526.3)

platform[0x7fd6df55fe00]: name: AMD Accelerated Parallel Processing

platform[0x7fd6df55fe00]: vendor: Advanced Micro Devices, Inc.

platform[0x7fd6df55fe00]: Found 2 device(s).

Device number  1

  device[0xed5810]: NAME: Capeverde

  device[0xed5810]: VENDOR: Advanced Micro Devices, Inc.

  device[0xed5810]: PROFILE: FULL_PROFILE

  device[0xed5810]: VERSION: OpenCL 1.2 AMD-APP (1526.3)

  device[0xed5810]: DRIVER_VERSION: 1526.3 (VM)

  device[0xed5810]: Type: GPU

  device[0xed5810]: EXECUTION_CAPABILITIES: Kernel

  device[0xed5810]: GLOBAL_MEM_CACHE_TYPE: Read-Write (2)

  device[0xed5810]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1)

  device[0xed5810]: SINGLE_FP_CONFIG: 0xbe

  device[0xed5810]: DOUBLE_FP_CONFIG: 0x3f

  device[0xed5810]: QUEUE_PROPERTIES: 0x2

  device[0xed5810]: VENDOR_ID: 4098

  device[0xed5810]: MAX_COMPUTE_UNITS: 10

  device[0xed5810]: MAX_WORK_ITEM_DIMENSIONS: 3

  device[0xed5810]: MAX_WORK_GROUP_SIZE: 256

  device[0xed5810]: PREFERRED_VECTOR_WIDTH_CHAR: 4

  device[0xed5810]: PREFERRED_VECTOR_WIDTH_SHORT: 2

  device[0xed5810]: PREFERRED_VECTOR_WIDTH_INT: 1

  device[0xed5810]: PREFERRED_VECTOR_WIDTH_LONG: 1

  device[0xed5810]: PREFERRED_VECTOR_WIDTH_FLOAT: 1

  device[0xed5810]: PREFERRED_VECTOR_WIDTH_DOUBLE: 1

  device[0xed5810]: MAX_CLOCK_FREQUENCY: 1000

  device[0xed5810]: ADDRESS_BITS: 32

  device[0xed5810]: MAX_MEM_ALLOC_SIZE: 900464640

  device[0xed5810]: IMAGE_SUPPORT: 1

  device[0xed5810]: MAX_READ_IMAGE_ARGS: 128

  device[0xed5810]: MAX_WRITE_IMAGE_ARGS: 8

  device[0xed5810]: IMAGE2D_MAX_WIDTH: 16384

  device[0xed5810]: IMAGE2D_MAX_HEIGHT: 16384

  device[0xed5810]: IMAGE3D_MAX_WIDTH: 2048

  device[0xed5810]: IMAGE3D_MAX_HEIGHT: 2048

  device[0xed5810]: IMAGE3D_MAX_DEPTH: 2048

  device[0xed5810]: MAX_SAMPLERS: 16

  device[0xed5810]: MAX_PARAMETER_SIZE: 1024

  device[0xed5810]: MEM_BASE_ADDR_ALIGN: 2048

  device[0xed5810]: MIN_DATA_TYPE_ALIGN_SIZE: 128

  device[0xed5810]: GLOBAL_MEM_CACHELINE_SIZE: 64

  device[0xed5810]: GLOBAL_MEM_CACHE_SIZE: 16384

  device[0xed5810]: GLOBAL_MEM_SIZE: 1727004672

  device[0xed5810]: MAX_CONSTANT_BUFFER_SIZE: 65536

  device[0xed5810]: MAX_CONSTANT_ARGS: 8

  device[0xed5810]: LOCAL_MEM_SIZE: 32768

  device[0xed5810]: ERROR_CORRECTION_SUPPORT: 0

  device[0xed5810]: PROFILING_TIMER_RESOLUTION: 1

  device[0xed5810]: ENDIAN_LITTLE: 1

  device[0xed5810]: AVAILABLE: 1

  device[0xed5810]: COMPILER_AVAILABLE: 1

Device number  2

  device[0x1643e00]: NAME: Intel(R) Core(TM)2 Quad CPU    Q8300  @ 2.50GHz

  device[0x1643e00]: VENDOR: GenuineIntel

  device[0x1643e00]: PROFILE: FULL_PROFILE

  device[0x1643e00]: VERSION: OpenCL 1.2 AMD-APP (1526.3)

  device[0x1643e00]: DRIVER_VERSION: 1526.3 (sse2)

  device[0x1643e00]: Type: CPU

  device[0x1643e00]: EXECUTION_CAPABILITIES: Kernel Native

  device[0x1643e00]: GLOBAL_MEM_CACHE_TYPE: Read-Write (2)

  device[0x1643e00]: CL_DEVICE_LOCAL_MEM_TYPE: Global (2)

  device[0x1643e00]: SINGLE_FP_CONFIG: 0xbf

  device[0x1643e00]: DOUBLE_FP_CONFIG: 0x3f

  device[0x1643e00]: QUEUE_PROPERTIES: 0x2

  device[0x1643e00]: VENDOR_ID: 4098

  device[0x1643e00]: MAX_COMPUTE_UNITS: 4

  device[0x1643e00]: MAX_WORK_ITEM_DIMENSIONS: 3

  device[0x1643e00]: MAX_WORK_GROUP_SIZE: 1024

  device[0x1643e00]: PREFERRED_VECTOR_WIDTH_CHAR: 16

  device[0x1643e00]: PREFERRED_VECTOR_WIDTH_SHORT: 8

  device[0x1643e00]: PREFERRED_VECTOR_WIDTH_INT: 4

  device[0x1643e00]: PREFERRED_VECTOR_WIDTH_LONG: 2

  device[0x1643e00]: PREFERRED_VECTOR_WIDTH_FLOAT: 4

  device[0x1643e00]: PREFERRED_VECTOR_WIDTH_DOUBLE: 2

  device[0x1643e00]: MAX_CLOCK_FREQUENCY: 1998

  device[0x1643e00]: ADDRESS_BITS: 64

  device[0x1643e00]: MAX_MEM_ALLOC_SIZE: 2147483648

  device[0x1643e00]: IMAGE_SUPPORT: 1

  device[0x1643e00]: MAX_READ_IMAGE_ARGS: 128

  device[0x1643e00]: MAX_WRITE_IMAGE_ARGS: 8

  device[0x1643e00]: IMAGE2D_MAX_WIDTH: 8192

  device[0x1643e00]: IMAGE2D_MAX_HEIGHT: 8192

  device[0x1643e00]: IMAGE3D_MAX_WIDTH: 2048

  device[0x1643e00]: IMAGE3D_MAX_HEIGHT: 2048

  device[0x1643e00]: IMAGE3D_MAX_DEPTH: 2048

  device[0x1643e00]: MAX_SAMPLERS: 16

  device[0x1643e00]: MAX_PARAMETER_SIZE: 4096

  device[0x1643e00]: MEM_BASE_ADDR_ALIGN: 1024

  device[0x1643e00]: MIN_DATA_TYPE_ALIGN_SIZE: 128

  device[0x1643e00]: GLOBAL_MEM_CACHELINE_SIZE: 64

  device[0x1643e00]: GLOBAL_MEM_CACHE_SIZE: 32768

  device[0x1643e00]: GLOBAL_MEM_SIZE: 4143226880

  device[0x1643e00]: MAX_CONSTANT_BUFFER_SIZE: 65536

  device[0x1643e00]: MAX_CONSTANT_ARGS: 8

  device[0x1643e00]: LOCAL_MEM_SIZE: 32768

  device[0x1643e00]: ERROR_CORRECTION_SUPPORT: 0

  device[0x1643e00]: PROFILING_TIMER_RESOLUTION: 1

  device[0x1643e00]: ENDIAN_LITTLE: 1

  device[0x1643e00]: AVAILABLE: 1

  device[0x1643e00]: COMPILER_AVAILABLE: 1

Failed to generate profile result /home/rwc/cltrace.atp.

Anybody out there, anybody from AMD for instance??? I'm still looking for help here....

Message was edited by: Ralph Crosby

chesik
Staff

Hi Ralph,

Are you able to get profile results for any of the APP SDK sample applications?  Something simple like the MatrixMultiplication sample?  What about one that runs a bit longer like NBody?

Also, are you able to collect GPU perf counters or is that failing as well?  Try ./sprofile -p YourApplication (or ./sprofileRun -p YourApplication).

One other thing to try is the -i switch with a really low number (./sprofile -t -i 1 YourApplication), to see if that makes any difference.

Thanks,
Chris
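
The checks suggested in the reply above can be run back to back with a small script. This is only a sketch under assumptions: the profiler path and the CLInfo.posix program are taken from the original post, the helper name run_check and the PROFILER/APP variables are made up for illustration, and the only switches used are the ones already quoted in this thread (-t, -p, -i 1). Because it is not known here whether the profiler's exit status reflects the failure, the script also looks for the "Failed to generate profile result" message in the output.

```shell
#!/bin/sh
# Assumed locations, copied from the original post; override via the environment.
PROFILER="${PROFILER:-/opt/AMD/CodeXL_1.6-7247/x86_64/sprofileRun}"
APP="${APP:-./CLInfo.posix}"

# run_check LABEL COMMAND [ARGS...]
# Hypothetical helper: runs the command, echoes its output, and reports the
# exit status plus whether the failure message from this thread appeared.
run_check() {
    label="$1"
    shift
    if output=$("$@" 2>&1); then
        status=0
    else
        status=$?
    fi
    printf '%s\n' "$output"
    case "$output" in
        *"Failed to generate profile result"*)
            result="FAILED (profile result not generated)" ;;
        *)
            result="no failure message seen" ;;
    esac
    echo "[$label] exit status $status, $result"
}

# The three variants mentioned in the thread.
run_check "trace (-t)"         "$PROFILER" -t "$APP"
run_check "perf counters (-p)" "$PROFILER" -p "$APP"
run_check "trace with -i 1"    "$PROFILER" -t -i 1 "$APP"
```

To try it against one of the APP SDK samples instead, set APP to that sample's binary before running the script, for example APP=./MatrixMultiplication sh check_profiler.sh (the script file name here is arbitrary). Comparing the three summary lines shows whether only the -t run fails or whether -p fails as well.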
