Using device.getInfo<A>() one can query device properties. For an HD4770 on a Linux machine I get the following information:
1) A=CL_DEVICE_NAME : ATI RV770
Shouldn't it be RV740?
2) A=CL_DEVICE_GLOBAL_MEM_SIZE: 128 MB
Shouldn't it be 512 MB?
3) A=CL_DEVICE_LOCAL_MEM_TYPE : 2,
where 2==CL_GLOBAL.
Shouldn't it be CL_LOCAL==1?
Doesn't __local correspond to the shared memory of each SIMD, which is 16 KB in size?
4) A=CL_DEVICE_GLOBAL_MEM_CACHE_TYPE | CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE | CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 0
Should those values vanish for my device?
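For anyone reading the raw numbers above, here is a minimal sketch of how they could be decoded into readable form. It is not part of the OpenCL C++ bindings: the helper names (describeLocalMemType, describeCacheType, bytesToMiB) are hypothetical, and the constants are copied by hand from the OpenCL 1.0 header (CL/cl.h) so that the snippet compiles without an SDK. In real code you would pass in the values returned by device.getInfo<A>() and use the header's own macros.

```cpp
#include <cstdint>
#include <string>

// Values mirrored from CL/cl.h (OpenCL 1.0) so this sketch builds without
// an OpenCL SDK. With the SDK installed, use the header's macros instead.
constexpr std::uint32_t kClLocal          = 0x1;  // CL_LOCAL
constexpr std::uint32_t kClGlobal         = 0x2;  // CL_GLOBAL
constexpr std::uint32_t kClNone           = 0x0;  // CL_NONE
constexpr std::uint32_t kClReadOnlyCache  = 0x1;  // CL_READ_ONLY_CACHE
constexpr std::uint32_t kClReadWriteCache = 0x2;  // CL_READ_WRITE_CACHE

// Hypothetical helper: names the value reported for CL_DEVICE_LOCAL_MEM_TYPE.
std::string describeLocalMemType(std::uint32_t value) {
    switch (value) {
        case kClLocal:  return "CL_LOCAL";   // dedicated on-chip local memory
        case kClGlobal: return "CL_GLOBAL";  // local memory backed by global memory
        default:        return "unknown";
    }
}

// Hypothetical helper: names the value reported for
// CL_DEVICE_GLOBAL_MEM_CACHE_TYPE.
std::string describeCacheType(std::uint32_t value) {
    switch (value) {
        case kClNone:           return "CL_NONE";
        case kClReadOnlyCache:  return "CL_READ_ONLY_CACHE";
        case kClReadWriteCache: return "CL_READ_WRITE_CACHE";
        default:                return "unknown";
    }
}

// CL_DEVICE_GLOBAL_MEM_SIZE is reported in bytes; convert to whole MiB.
std::uint64_t bytesToMiB(std::uint64_t bytes) {
    return bytes / (1024ULL * 1024ULL);
}
```

With the values from this post, describeLocalMemType(2) gives "CL_GLOBAL", describeCacheType(0) gives "CL_NONE", and a reported size of 134217728 bytes corresponds to 128 MiB.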
Micah, thanks for the quick reply!
It raises a few other questions, though:
1) Is the emulation of __local by global memory on the 7xx architecture permanent, or is it just an artefact of the beta version of your OpenCL implementation?
2) If this emulation is permanent, does it mean that OpenCL on the 7xx architecture is seriously crippled?
3) If I need more than 128 MB, and I do, will the space be allocated dynamically, or do I need to adjust some environment variable?
4) In point 4) of the previous post, I meant that each of the mentioned device properties returns zero, i.e. global memory is not cached. Are there devices where global memory gets cached?
Originally posted by: MicahVillmow
3) on the 7XX series of cards, __local is emulated in global memory since the 7XX does not have fully read/write local memory like 8XX does
Would it be possible for the compiler to detect the special case where each work-item writes only to the element indexed by its own get_local_id(0), i.e.
__kernel void mykernel(__local float4* shmem, ...) {
    shmem[get_local_id(0)] = val;
}
(which is supported by RV770) and use shared memory rather than global memory in that case? For predictable performance, it would be nice if the user were warned when a kernel violates such a hardware restriction and ends up with reduced performance, for example in Stream Kernel Analyzer.
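To make the access pattern concrete, here is a rough host-side C++ emulation of one work-group. It is only a sketch, not OpenCL code and not anything the compiler does today. The function name reverseWithinGroup is hypothetical, and it assumes the restriction works the way I understand it from this thread: a work-item may write only its own slot, and other slots are read only after a barrier.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical emulation of one work-group on the host. Each loop iteration
// stands in for one work-item; `lid` plays the role of get_local_id(0).
std::vector<float> reverseWithinGroup(const std::vector<float>& input) {
    const std::size_t groupSize = input.size();
    std::vector<float> shmem(groupSize);   // stands in for __local memory
    std::vector<float> output(groupSize);

    // Phase 1: every work-item writes ONLY its own slot, shmem[lid].
    for (std::size_t lid = 0; lid < groupSize; ++lid) {
        shmem[lid] = input[lid];
    }

    // In a real kernel, barrier(CLK_LOCAL_MEM_FENCE) would separate the phases.

    // Phase 2: work-items may read slots written by other work-items.
    for (std::size_t lid = 0; lid < groupSize; ++lid) {
        output[lid] = shmem[groupSize - 1 - lid];
    }
    return output;
}
```

A kernel that sticks to this shape never writes to shmem at any index other than its own local id, which is the case a compiler check or an analyzer warning would need to recognise.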
@ Micah:
I'm waiting for a 2 GB 5870 to come out ... in the meantime, are the 5750 and 5770 equivalent to a 5870 in terms of hardware support for OpenCL? (hardware shmem support, etc.)
One difference I know of: the 5700 series doesn't have DP (double-precision) support.
Shared memory is 32 KB for both 58xx and 57xx series cards.