Hi,
A program that runs in 1s from the shell is still running after 3 hours under CodeXL (GPU Application Trace profiling). The executable is running at 150% CPU, and no trace data has been written to the trace file in the ~/.amd/AMD/CodeXL... directory.
AMD CodeXL GPU Profiler Kernel occupancy module is enabled
** hangs **
^CFailed to merge temp files.
Failed to generate profile result /home/lionel/.amd/AMD/CodeXL/gnrg_rtm2D_ProfilerOutput/Mar-14-2013_15-55-46/Mar-14-2013_15-55-46.atp.
Failed to generate profile result /home/lionel/.amd/AMD/CodeXL/gnrg_rtm2D_ProfilerOutput/Mar-14-2013_15-55-46/Mar-14-2013_15-55-46.occupancy.
Any ideas?
Thanks,
Lionel
Hi Lionel,
Can you please provide more information on the hardware and software that you are running on?
What is the OS that you are running CodeXL on?
Please do the following:
1. Run CodeXL. Click on "Tools -> System Information".
2. Click the "Save" button in the "System Information" dialog and give the file an appropriate name.
Attach the file to the report. It will contain the hardware information and the AMD Catalyst information.
It looks like you are using the "GPU: Application Trace" profile configuration.
Can you check the "Write trace data in intervals during program execution" option in the "Profile -> Project Settings -> GPU Profile: Application Trace" tab, click OK, and run the profile again to see whether the issue still occurs?
Attaching the ".atp" and ".occupancy" files to the report will also help us diagnose the problem. In your case, they are in the "/home/lionel/.amd/AMD/CodeXL/gnrg_rtm2D_ProfilerOutput/" folder.
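If it helps to locate those files, here is a minimal Python sketch that walks the output folder and lists any ".atp" and ".occupancy" files with their sizes. It is not part of CodeXL; the helper name find_profile_outputs is hypothetical, and the folder is simply the one quoted above. A size of 0 bytes, or no files at all, would suggest the profiler never wrote the result.

```python
import os


def find_profile_outputs(root, suffixes=(".atp", ".occupancy")):
    """Return a sorted list of (path, size_in_bytes) for matching files under root.

    Hypothetical helper for illustration; it only inspects the file system
    and does not call any CodeXL API.
    """
    root = os.path.expanduser(root)
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # str.endswith accepts a tuple of suffixes
            if name.endswith(suffixes):
                path = os.path.join(dirpath, name)
                found.append((path, os.path.getsize(path)))
    return sorted(found)


if __name__ == "__main__":
    # Folder taken from the post above; adjust for your own project.
    folder = "/home/lionel/.amd/AMD/CodeXL/gnrg_rtm2D_ProfilerOutput/"
    for path, size in find_profile_outputs(folder):
        print(f"{size:>12d}  {path}")
```

If the folder does not exist, the function returns an empty list rather than raising an error.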
Thanks,
Kalyan P
Hi,
The attached file seems to be corrupted.
What I'm looking for is:
the platform that you are running CodeXL on, the AMD Catalyst driver version installed, and the GPU that you are using.
Thanks,
Kalyan P
4x HD 7970
Catalyst 13.1 driver on CentOS 6.3
Operating System Version (name), Linux version 2.6.32-279.19.1.el6.centos.plus.x86_64 (mockbuild@c6b7.bsys.dev.centos.org) (gcc version 4.4.6 20120305 (Red Hat 4.4.6-4) (GCC) ) #1 SMP Wed Dec 19 06:20:23 UTC 2012
Operating System Version (number), 2.6.32
Number Of Processors, 32
System Type, Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
Total Physical Memory, 64392 MB
Available Physical Memory, 62184 MB
Total Virtual Memory, 33554431 MB
Available Virtual Memory, 33519322 MB
Total Page Files, 8191 MB
Available Page Files, 8191 MB
Platform ID, 1, 1, 1, 1, 1
Device Type, GPU, GPU, GPU, GPU, CPU
Device Name, Tahiti, Tahiti, Tahiti, Tahiti, Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
Vendor, Advanced Micro Devices, Inc., Advanced Micro Devices, Inc., Advanced Micro Devices, Inc., Advanced Micro Devices, Inc., GenuineIntel
Command Queue Properties, Queue profiling, Queue profiling, Queue profiling, Queue profiling, Queue profiling
Is Available, Yes, Yes, Yes, Yes, Yes
Is Compiler Available, Yes, Yes, Yes, Yes, Yes
Is Little Endian, Yes, Yes, Yes, Yes, Yes
Error Correction Support, No, No, No, No, No
Execution Capabilities, Kernel Execution, Kernel Execution, Kernel Execution, Kernel Execution, Kernel Execution, Native Kernel Execution
Global Memory Cache Size, 16 KB, 16 KB, 16 KB, 16 KB, 32 KB
Memory Cache Type, Read Write, Read Write, Read Write, Read Write, Read Write
Global Memory Cache Line Size, 64 bytes, 64 bytes, 64 bytes, 64 bytes, 64 bytes
Global Memory Size, 2,048 MB, 2,048 MB, 2,048 MB, 2,048 MB, 64,393 MB
Host Unified Memory, No, No, No, No, Yes
Are Images Supported, Yes, Yes, Yes, Yes, Yes
Max Image 2D Dimensions, (256w, 256h), (256w, 256h), (256w, 256h), (256w, 256h), (1024w, 1024h)
Max Image 3D Dimensions, (256w, 256h, 256d), (256w, 256h, 256d), (256w, 256h, 256d), (256w, 256h, 256d), (1024w, 1024h, 1024d)
Local Memory Size, 32 KB, 32 KB, 32 KB, 32 KB, 32 KB
Local Memory Type, Local, Local, Local, Local, Global
Max Clock Frequency, 1050, 1050, 1050, 1050, 1200
Max Compute Units, 32, 32, 32, 32, 32
Max Constant Arguments, 8, 8, 8, 8, 8
Max Constant Buffer Size, 64 KB, 64 KB, 64 KB, 64 KB, 64 KB
Max Memory Allocation Size, 512 MB, 512 MB, 512 MB, 512 MB, 16,099 MB
Max Parameter Size, 1,024 bytes, 1,024 bytes, 1,024 bytes, 1,024 bytes, 4 KB
Read Image Arguments, 128, 128, 128, 128, 128
Max Samplers, 16, 16, 16, 16, 16
Max Workgroup Size, 256, 256, 256, 256, 1024
Max Work Item Dimensions, 3, 3, 3, 3, 3
Max Work Item Sizes, (256,256,256), (256,256,256), (256,256,256), (256,256,256), (1024,1024,1024)
Max Write Image Arguments, 8, 8, 8, 8, 8
Memory Base Address Alignment, 2048, 2048, 2048, 2048, 1024
Minimal Data Type Alignment Size, 128 bytes, 128 bytes, 128 bytes, 128 bytes, 128 bytes
OpenCL C Version, OpenCL C 1.2 , OpenCL C 1.2 , OpenCL C 1.2 , OpenCL C 1.2 , OpenCL C 1.2
Native Char Vector Width, 4, 4, 4, 4, 16
Native Short Vector Width, 2, 2, 2, 2, 8
Native Int Vector Width, 1, 1, 1, 1, 4
Native Long Vector Width, 1, 1, 1, 1, 2
Native Float Vector Width, 1, 1, 1, 1, 8
Native Double Vector Width, 1, 1, 1, 1, 4
Native Half Vector Width, 1, 1, 1, 1, 4
Preferred Char Vector Width, 4, 4, 4, 4, 16
Preferred Short Vector Width, 2, 2, 2, 2, 8
Preferred Int Vector Width, 1, 1, 1, 1, 4
Preferred Long Vector Width, 1, 1, 1, 1, 2
Preferred Float Vector Width, 1, 1, 1, 1, 8
Preferred Double Vector Width, 1, 1, 1, 1, 4
Preferred Half Vector Width, 1, 1, 1, 1, 4
Profile, FULL_PROFILE, FULL_PROFILE, FULL_PROFILE, FULL_PROFILE, FULL_PROFILE
Profiling Timer Resolution, 1, 1, 1, 1, 1
Vendor ID, OpenCL 1.2 AMD-APP (1113.2), OpenCL 1.2 AMD-APP (1113.2), OpenCL 1.2 AMD-APP (1113.2), OpenCL 1.2 AMD-APP (1113.2), OpenCL 1.2 AMD-APP (1113.2)
Extensions, cl_khr_fp64, cl_khr_fp64, cl_khr_fp64, cl_khr_fp64, cl_khr_fp64
, cl_amd_fp64, cl_amd_fp64, cl_amd_fp64, cl_amd_fp64, cl_amd_fp64
, cl_khr_global_int32_base_atomics, cl_khr_global_int32_base_atomics, cl_khr_global_int32_base_atomics, cl_khr_global_int32_base_atomics, cl_khr_global_int32_base_atomics
, cl_khr_global_int32_extended_atomics, cl_khr_global_int32_extended_atomics, cl_khr_global_int32_extended_atomics, cl_khr_global_int32_extended_atomics, cl_khr_global_int32_extended_atomics
, cl_khr_local_int32_base_atomics, cl_khr_local_int32_base_atomics, cl_khr_local_int32_base_atomics, cl_khr_local_int32_base_atomics, cl_khr_local_int32_base_atomics
, cl_khr_local_int32_extended_atomics, cl_khr_local_int32_extended_atomics, cl_khr_local_int32_extended_atomics, cl_khr_local_int32_extended_atomics, cl_khr_local_int32_extended_atomics
, cl_khr_int64_base_atomics, cl_khr_int64_base_atomics, cl_khr_int64_base_atomics, cl_khr_int64_base_atomics, cl_khr_int64_base_atomics
, cl_khr_int64_extended_atomics, cl_khr_int64_extended_atomics, cl_khr_int64_extended_atomics, cl_khr_int64_extended_atomics, cl_khr_int64_extended_atomics
, cl_khr_3d_image_writes, cl_khr_3d_image_writes, cl_khr_3d_image_writes, cl_khr_3d_image_writes, cl_khr_3d_image_writes
, cl_khr_byte_addressable_store, cl_khr_byte_addressable_store, cl_khr_byte_addressable_store, cl_khr_byte_addressable_store, cl_khr_byte_addressable_store
, cl_khr_gl_sharing, cl_khr_gl_sharing, cl_khr_gl_sharing, cl_khr_gl_sharing, cl_khr_gl_sharing
, cl_ext_atomic_counters_32, cl_ext_atomic_counters_32, cl_ext_atomic_counters_32, cl_ext_atomic_counters_32, cl_ext_device_fission
, cl_amd_device_attribute_query, cl_amd_device_attribute_query, cl_amd_device_attribute_query, cl_amd_device_attribute_query, cl_amd_device_attribute_query
, cl_amd_vec3, cl_amd_vec3, cl_amd_vec3, cl_amd_vec3, cl_amd_vec3
, cl_amd_printf, cl_amd_printf, cl_amd_printf, cl_amd_printf, cl_amd_printf
, cl_amd_media_ops, cl_amd_media_ops, cl_amd_media_ops, cl_amd_media_ops, cl_amd_media_ops
, cl_amd_popcnt, cl_amd_popcnt, cl_amd_popcnt, cl_amd_popcnt, cl_amd_popcnt
, cl_amd_c1x_atomics, cl_amd_c1x_atomics, cl_amd_c1x_atomics, cl_amd_c1x_atomics,
I have a similar issue, but with another app and on Windows instead of Linux. The symptoms are similar. I would like to see any advice or comments from AMD.