Discussion created by matze_de on Apr 9, 2011
Latest reply on Apr 12, 2011 by genaganna
Hello everyone,

I have a problem with async_work_group_copy: if I use it twice in a kernel, it somehow does not work.

The task is to create an MxN matrix where every field is calculated from 2 x 45 float3 values (the 48 in the copy is because of alignment). Without using local memory it works fine.

The fields are calculated independently, so I chose a 2-dimensional 16x16 NDRange.

So I thought I could copy the 2x16x48(45) float3 values to local memory, because they are accessed multiple times.

My caching works if I only cache 16x48(45)