1 Reply Latest reply on Apr 9, 2015 10:06 AM by chesik

    Getting "Failed to generate profile result" from codeXL 1.6

    rwcrosby

      I've just started trying to profile my application using CodeXL on Ubuntu 14.10. Whether I run the GUI or the command-line version, I get "Failed to generate profile result /home/rwc/somefile.atp". From what I've read, there was an issue with the version of libstdc++ and I should try sprofileRun instead. sprofileRun gives the same result as sprofile (and the GUI), and I verified that my libstdc++ is current. Running a CL info program under the profiler, I get:

       

      rwc@sudoku:~/Dropbox/GPU/OpenCL/CLInfo $ /opt/AMD/CodeXL_1.6-7247/x86_64/sprofileRun -t CLInfo.posix

      /opt/AMD/CodeXL_1.6-7247/x86_64/sprofileRun

      AMD CodeXL GPU Profiler V3.1.7247 is Enabled

      Found 1 platform(s).

      platform[0x7fd6df55fe00]: profile: FULL_PROFILE

      platform[0x7fd6df55fe00]: version: OpenCL 1.2 AMD-APP (1526.3)

      platform[0x7fd6df55fe00]: name: AMD Accelerated Parallel Processing

      platform[0x7fd6df55fe00]: vendor: Advanced Micro Devices, Inc.

      platform[0x7fd6df55fe00]: Found 2 device(s).

      Device number  1

        device[0xed5810]: NAME: Capeverde

        device[0xed5810]: VENDOR: Advanced Micro Devices, Inc.

        device[0xed5810]: PROFILE: FULL_PROFILE

        device[0xed5810]: VERSION: OpenCL 1.2 AMD-APP (1526.3)

        device[0xed5810]: DRIVER_VERSION: 1526.3 (VM)

       

       

        device[0xed5810]: Type: GPU

        device[0xed5810]: EXECUTION_CAPABILITIES: Kernel

        device[0xed5810]: GLOBAL_MEM_CACHE_TYPE: Read-Write (2)

        device[0xed5810]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1)

        device[0xed5810]: SINGLE_FP_CONFIG: 0xbe

        device[0xed5810]: DOUBLE_FP_CONFIG: 0x3f

        device[0xed5810]: QUEUE_PROPERTIES: 0x2

       

       

        device[0xed5810]: VENDOR_ID: 4098

        device[0xed5810]: MAX_COMPUTE_UNITS: 10

        device[0xed5810]: MAX_WORK_ITEM_DIMENSIONS: 3

        device[0xed5810]: MAX_WORK_GROUP_SIZE: 256

        device[0xed5810]: PREFERRED_VECTOR_WIDTH_CHAR: 4

        device[0xed5810]: PREFERRED_VECTOR_WIDTH_SHORT: 2

        device[0xed5810]: PREFERRED_VECTOR_WIDTH_INT: 1

        device[0xed5810]: PREFERRED_VECTOR_WIDTH_LONG: 1

        device[0xed5810]: PREFERRED_VECTOR_WIDTH_FLOAT: 1

        device[0xed5810]: PREFERRED_VECTOR_WIDTH_DOUBLE: 1

        device[0xed5810]: MAX_CLOCK_FREQUENCY: 1000

        device[0xed5810]: ADDRESS_BITS: 32

        device[0xed5810]: MAX_MEM_ALLOC_SIZE: 900464640

        device[0xed5810]: IMAGE_SUPPORT: 1

        device[0xed5810]: MAX_READ_IMAGE_ARGS: 128

        device[0xed5810]: MAX_WRITE_IMAGE_ARGS: 8

        device[0xed5810]: IMAGE2D_MAX_WIDTH: 16384

        device[0xed5810]: IMAGE2D_MAX_HEIGHT: 16384

        device[0xed5810]: IMAGE3D_MAX_WIDTH: 2048

        device[0xed5810]: IMAGE3D_MAX_HEIGHT: 2048

        device[0xed5810]: IMAGE3D_MAX_DEPTH: 2048

        device[0xed5810]: MAX_SAMPLERS: 16

        device[0xed5810]: MAX_PARAMETER_SIZE: 1024

        device[0xed5810]: MEM_BASE_ADDR_ALIGN: 2048

        device[0xed5810]: MIN_DATA_TYPE_ALIGN_SIZE: 128

        device[0xed5810]: GLOBAL_MEM_CACHELINE_SIZE: 64

        device[0xed5810]: GLOBAL_MEM_CACHE_SIZE: 16384

        device[0xed5810]: GLOBAL_MEM_SIZE: 1727004672

        device[0xed5810]: MAX_CONSTANT_BUFFER_SIZE: 65536

        device[0xed5810]: MAX_CONSTANT_ARGS: 8

        device[0xed5810]: LOCAL_MEM_SIZE: 32768

        device[0xed5810]: ERROR_CORRECTION_SUPPORT: 0

        device[0xed5810]: PROFILING_TIMER_RESOLUTION: 1

        device[0xed5810]: ENDIAN_LITTLE: 1

        device[0xed5810]: AVAILABLE: 1

        device[0xed5810]: COMPILER_AVAILABLE: 1

      Device number  2

        device[0x1643e00]: NAME: Intel(R) Core(TM)2 Quad CPU    Q8300  @ 2.50GHz

        device[0x1643e00]: VENDOR: GenuineIntel

        device[0x1643e00]: PROFILE: FULL_PROFILE

        device[0x1643e00]: VERSION: OpenCL 1.2 AMD-APP (1526.3)

        device[0x1643e00]: DRIVER_VERSION: 1526.3 (sse2)

       

       

        device[0x1643e00]: Type: CPU

        device[0x1643e00]: EXECUTION_CAPABILITIES: Kernel Native

        device[0x1643e00]: GLOBAL_MEM_CACHE_TYPE: Read-Write (2)

        device[0x1643e00]: CL_DEVICE_LOCAL_MEM_TYPE: Global (2)

        device[0x1643e00]: SINGLE_FP_CONFIG: 0xbf

        device[0x1643e00]: DOUBLE_FP_CONFIG: 0x3f

        device[0x1643e00]: QUEUE_PROPERTIES: 0x2

       

       

        device[0x1643e00]: VENDOR_ID: 4098

        device[0x1643e00]: MAX_COMPUTE_UNITS: 4

        device[0x1643e00]: MAX_WORK_ITEM_DIMENSIONS: 3

        device[0x1643e00]: MAX_WORK_GROUP_SIZE: 1024

        device[0x1643e00]: PREFERRED_VECTOR_WIDTH_CHAR: 16

        device[0x1643e00]: PREFERRED_VECTOR_WIDTH_SHORT: 8

        device[0x1643e00]: PREFERRED_VECTOR_WIDTH_INT: 4

        device[0x1643e00]: PREFERRED_VECTOR_WIDTH_LONG: 2

        device[0x1643e00]: PREFERRED_VECTOR_WIDTH_FLOAT: 4

        device[0x1643e00]: PREFERRED_VECTOR_WIDTH_DOUBLE: 2

        device[0x1643e00]: MAX_CLOCK_FREQUENCY: 1998

        device[0x1643e00]: ADDRESS_BITS: 64

        device[0x1643e00]: MAX_MEM_ALLOC_SIZE: 2147483648

        device[0x1643e00]: IMAGE_SUPPORT: 1

        device[0x1643e00]: MAX_READ_IMAGE_ARGS: 128

        device[0x1643e00]: MAX_WRITE_IMAGE_ARGS: 8

        device[0x1643e00]: IMAGE2D_MAX_WIDTH: 8192

        device[0x1643e00]: IMAGE2D_MAX_HEIGHT: 8192

        device[0x1643e00]: IMAGE3D_MAX_WIDTH: 2048

        device[0x1643e00]: IMAGE3D_MAX_HEIGHT: 2048

        device[0x1643e00]: IMAGE3D_MAX_DEPTH: 2048

        device[0x1643e00]: MAX_SAMPLERS: 16

        device[0x1643e00]: MAX_PARAMETER_SIZE: 4096

        device[0x1643e00]: MEM_BASE_ADDR_ALIGN: 1024

        device[0x1643e00]: MIN_DATA_TYPE_ALIGN_SIZE: 128

        device[0x1643e00]: GLOBAL_MEM_CACHELINE_SIZE: 64

        device[0x1643e00]: GLOBAL_MEM_CACHE_SIZE: 32768

        device[0x1643e00]: GLOBAL_MEM_SIZE: 4143226880

        device[0x1643e00]: MAX_CONSTANT_BUFFER_SIZE: 65536

        device[0x1643e00]: MAX_CONSTANT_ARGS: 8

        device[0x1643e00]: LOCAL_MEM_SIZE: 32768

        device[0x1643e00]: ERROR_CORRECTION_SUPPORT: 0

        device[0x1643e00]: PROFILING_TIMER_RESOLUTION: 1

        device[0x1643e00]: ENDIAN_LITTLE: 1

        device[0x1643e00]: AVAILABLE: 1

        device[0x1643e00]: COMPILER_AVAILABLE: 1

      Failed to generate profile result /home/rwc/cltrace.atp.

       

       

      Anybody out there, anybody from AMD for instance??? I'm still looking for help here....

       

      Message was edited by: Ralph Crosby

        • Re: Getting "Failed to generate profile result" from codeXL 1.6
          chesik

          Hi Ralph,

           

          Are you able to get profile results for any of the APP SDK sample applications?  Something simple like the MatrixMultiplication sample?  What about one that runs a bit longer like NBody?

           

          Also, are you able to collect GPU perf counters or is that failing as well?  Try ./sprofile -p YourApplication (or ./sprofileRun -p YourApplication).

           

          One other thing to try is to use the -i switch with a really low number (./sprofile -t -i 1 YourApplication) to see if that makes any difference.

           

          Thanks,
          Chris
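
The three runs suggested in the reply above (-t, -p, and -t -i 1) can be compared side by side with a small wrapper. This is only a rough sketch: the default profiler path and application name are copied from the original post, while `run_variant`, `PROFILER`, `APP`, and `LOGDIR` are hypothetical names introduced for illustration. It checks each log for the "Failed to generate profile result" message rather than relying on the exit status, since the thread does not say what status the profiler returns on failure.

```shell
#!/bin/sh
# Sketch: run the profiler variants mentioned in this thread and summarize each.
# Defaults are taken from the original post; override them for your own setup.
PROFILER="${PROFILER:-/opt/AMD/CodeXL_1.6-7247/x86_64/sprofileRun}"
APP="${APP:-./CLInfo.posix}"
LOGDIR="${LOGDIR:-/tmp}"

# run_variant LABEL ARGS...
# Runs "$PROFILER ARGS... $APP", saves all output to a log file, and prints
# one summary line saying whether the failure message appeared in the log.
run_variant() {
    label="$1"
    shift
    log="$LOGDIR/sprofile_$label.log"
    if [ ! -x "$PROFILER" ]; then
        echo "$label: skipped (no executable at $PROFILER)"
        return 0
    fi
    status=0
    "$PROFILER" "$@" "$APP" >"$log" 2>&1 || status=$?
    if grep -q "Failed to generate profile result" "$log"; then
        echo "$label: FAILED to generate a result (exit $status, see $log)"
    else
        echo "$label: no failure message (exit $status, see $log)"
    fi
}

run_variant trace -t          # the mode that fails in the original post
run_variant counters -p       # GPU perf counters, as asked in the reply
run_variant trace_i1 -t -i 1  # trace with a very low -i value
```

To try the same three runs against an APP SDK sample instead, set `APP` to that sample's binary before running the script. If only the trace line reports FAILED while the counters run is clean, that helps narrow the problem to trace collection.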