
Hill_Groove
Journeyman III

Fast Writes To Global Memory

any hints

Hello,

I am interested in writing big arrays to global memory from kernels. Is there any way I can accelerate this process (e.g. by using local memory)?

LeeHowes
Staff

That's a rather vague question... Make sure each work item does a 128-bit write, and that the writes from one work item to the next land on consecutive, aligned addresses. That should help you achieve peak bandwidth.

 

More than that would depend on what you're doing.
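
To make "align from one work item to the next" concrete, here is a minimal OpenCL C sketch. The kernel name `fill_vec4` and its arguments are hypothetical and not from the thread, and it assumes the output buffer holds one `float4` per work item. Work item `gid` writes the `gid`-th `float4`, so neighbouring work items touch neighbouring 16-byte slots.

```c
// Hypothetical kernel (illustration only): each work item stores one
// float4 (128 bits), and consecutive work items write consecutive addresses.
__kernel void fill_vec4(__global float4 *out, const float value)
{
    const size_t gid = get_global_id(0);  // one float4 per work item
    out[gid] = (float4)(value);           // single 128-bit store, all lanes = value
}
```

By contrast, an index such as `out[gid * stride]` with a stride greater than one would scatter the writes, which is the pattern this advice is steering away from.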


LeeHowes

Thank You for your reply.

My kernel consists of 3 parts: reading data, computing, and writing to global memory. The last step takes half of the total time. The problem maps very well onto OpenCL, so I can write to global memory any way I want. How should it be done correctly (through local memory, directly from private memory, and so on)?


Oh, well try to arrange it as 128-bit writes from registers, then. Preferably a vector register:

 

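// result held in a private (register) vector
float4 stuffinhere = somethingorother;

// the offset counts float4 elements, not floats; the cast keeps the __global address space
((__global float4*)outputPointer)[offsetFromOutputPointer] = stuffinhere;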
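
For the three-part kernel described earlier in the thread (read, compute, write), the same idea might look like the sketch below. Everything here is an assumption made for illustration: the kernel name `scale_vec4`, the `factor` argument, and the multiply that stands in for the real computation. It also assumes both buffers are declared as `float4` arrays with one element per work item.

```c
// Hypothetical three-phase kernel (read, compute, write); names are illustrative.
__kernel void scale_vec4(__global const float4 *in,
                         __global float4 *out,
                         const float factor)
{
    const size_t gid = get_global_id(0);

    const float4 x = in[gid];          // 1. read: one 128-bit load
    const float4 result = x * factor;  // 2. compute in private registers
    out[gid] = result;                 // 3. write: one 128-bit store
}
```

Declaring the buffer arguments as `__global float4 *` avoids the pointer cast altogether. In this sketch the result goes straight from registers to global memory, with no staging in local memory.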

 


Thanks a lot, LeeHowes
