According to the programming guide, "OpenCL read-only images always use FastPath", but it also says you cannot use the FastPath unless you are using 32-bit data types. This seems to be a conflict for images that hold shorts. I would think a test would tell me what happens, but reading and writing a single-channel CL_R signed short OpenCL image gives me 25% utilization, which doesn't make sense to me: I would have expected 50% if reads were using the FastPath and writes were not, or 0% if none were using it.
Can anyone clarify this (hopefully someone from AMD)?
Thanks.
Update: Looking into the ISA produced by even the simplest code seems to indicate that ANY call to image_writeX uses the CompletePath, via the "MEM_RAT_STORE_TYPED" instruction. So the guide appears to be incorrect in saying that 32-bit operations will use the FastPath.
I don't remember this always being the case. Am I wrong? Is there any way to force the FastPath?
"OpenCL read-only images always use FastPath"
ANY call to image_writeX calls the CompletePath by using the "MEM_RAT_STORE_TYPED" instruction
So ... how can you write to a read-only image?
@notzed,
I was not trying to write to a read-only image. I was saying that if reads always used the FastPath and writes did not, then a kernel with one read and one write should show 50% utilization of the FastPath. Instead I get 25%, which makes little sense. Now that I think of it, the profiler may be reporting total bytes, and maybe it is counting the reads as 16-bit reads and the writes as 32-bit writes (even though I am reading and writing 16-bit textures). That might explain the ratio, but I don't know if that is the case.
Can you pack 2 shorts into an int, load them, and then convert them with as_short2()? I know you have to do dirty tricks like this to store double-precision data in images, and I don't see why it wouldn't work for shorts.
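To make the packing idea concrete, here is a rough host-side sketch in plain C, not tested on any AMD hardware. The type `short2_t` and the helpers `pack_short2` and `unpack_short2` are made-up names for illustration. It assumes the first short sits in the low 16 bits, which is what I would expect as_short2() to see on a little-endian device.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for OpenCL's short2: two 16-bit samples. */
typedef struct { int16_t x, y; } short2_t;

/* Pack two shorts into one 32-bit texel.
   Assumption: x goes in the low 16 bits, y in the high 16 bits. */
static int32_t pack_short2(short2_t v) {
    uint32_t bits = (uint32_t)(uint16_t)v.x | ((uint32_t)(uint16_t)v.y << 16);
    int32_t out;
    memcpy(&out, &bits, sizeof out);   /* bit-exact copy, no value conversion */
    return out;
}

/* Split a 32-bit texel back into its two shorts (inverse of pack_short2). */
static short2_t unpack_short2(int32_t packed) {
    uint32_t bits;
    memcpy(&bits, &packed, sizeof bits);
    uint16_t lo = (uint16_t)(bits & 0xFFFFu);
    uint16_t hi = (uint16_t)(bits >> 16);
    short2_t v;
    memcpy(&v.x, &lo, sizeof lo);
    memcpy(&v.y, &hi, sizeof hi);
    return v;
}
```

On the device side the unpack step would presumably be something like as_short2(read_imagei(img, sampler, coord).x) on an image created as CL_R / CL_SIGNED_INT32 at half the width, but I have not checked which path the compiler picks for that.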
@rick,
That is a good idea. I didn't think the "as_" functions worked with the vector types. I don't know if it solves this problem, since it seems that all writes use the CompletePath regardless of their data size, but I have a couple of places where I will be using that.