First of all, I would like to thank the CodeXL developer team for the improvements made in CodeXL 1.4. It is now a much more convenient tool.
Now the question:
According to a clinfo-like enumeration, the Loveland device has no cache at all:
OpenCL Platform Name: AMD Accelerated Parallel Processing
Number of devices: 1
Max compute units: 2
Max work group size: 256
Max clock frequency: 275Mhz
Max memory allocation: 175374336
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 701497344
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Queue properties:
  Out-of-Order: No
Name: Loveland
Vendor: Advanced Micro Devices, Inc.
Driver version: 1268.1 (VM)
Version: OpenCL 1.2 AMD-APP (1268.1)
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_dx9_media_sharing cl_amd_image2d_from_buffer_read_only
But CodeXL still reports a CacheHit (%) value, for example:
Method, ExecutionOrder, ThreadID, CallIndex, GlobalWorkSize, WorkGroupSize, Time, LocalMemSize, VGPRs, SGPRs, ScratchRegs, FCStacks, KernelOccupancy, Wavefronts, LDSFetchInsts, LDSWriteInsts, FetchSize, CacheHit (%), FetchUnitBusy (%), FetchUnitStalled (%), WriteUnitStalled (%), FastPath, PathUtilization (%), LDSBankConflict (%)
GPU_fetch_array_kernel_twin_1D_persistent_cl__k12_Loveland1, 1540, 548, 9609, { 512 1 1}, { 256 1 1}, 436.32020, 32768, 30, NA, 0, 4, 12.5, 16, 87245.63, 32, 193527.63, 13.31, 30.87, 0.01, 1.75, 0, 0, 0
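Since the row is hard to read by eye, here is a minimal Python sketch that pairs each column name with its value by position. It assumes the header and the row are plain comma-separated text exactly as pasted above; `parse_counters` is just a name made up for illustration, not anything from CodeXL.

```python
import csv
import io

# Header and row copied verbatim from the CodeXL output above.
HEADER = ("Method, ExecutionOrder, ThreadID, CallIndex, GlobalWorkSize, "
          "WorkGroupSize, Time, LocalMemSize, VGPRs, SGPRs, ScratchRegs, "
          "FCStacks, KernelOccupancy, Wavefronts, LDSFetchInsts, LDSWriteInsts, "
          "FetchSize, CacheHit (%), FetchUnitBusy (%), FetchUnitStalled (%), "
          "WriteUnitStalled (%), FastPath, PathUtilization (%), LDSBankConflict (%)")
ROW = ("GPU_fetch_array_kernel_twin_1D_persistent_cl__k12_Loveland1, 1540, 548, "
       "9609, { 512 1 1}, { 256 1 1}, 436.32020, 32768, 30, NA, 0, 4, 12.5, 16, "
       "87245.63, 32, 193527.63, 13.31, 30.87, 0.01, 1.75, 0, 0, 0")


def parse_counters(header_line, row_line):
    """Pair column names with values by position (hypothetical helper)."""
    text = header_line + "\n" + row_line
    # skipinitialspace drops the blank that follows each comma.
    names, values = list(csv.reader(io.StringIO(text), skipinitialspace=True))
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


counters = parse_counters(HEADER, ROW)
# Print only the fetch-related counters relevant to the question.
for name in ("FetchSize", "CacheHit (%)", "FetchUnitBusy (%)", "FetchUnitStalled (%)"):
    print(f"{name}: {counters[name]}")
```

Values are kept as strings because some columns are not numeric (for example `NA` for SGPRs and the `{ 512 1 1}` work sizes).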
What does the CacheHit (%) value mean, then? Or does the OpenCL device capability query return wrong data, and the Loveland (C-60) device actually has a cache? Please comment on this.