I have some questions about compute shaders (CS).
1. To execute a CS I have to call calCtxRunProgramGrid(), and for that I have to fill in the CALprogramGrid structure.
What are gridBlock and gridSize in this structure, and how does dcl_num_thread_per_group (used in gridBlock) relate to them?
For example, if I declare dcl_num_thread_per_group 64 and my domain of execution is a 256x256 matrix as input, which settings do I have to set in CALprogramGrid?
2. If dcl_num_thread_per_group = 64, does that mean that only one SIMD is used? And with 1024, are all of them used?
3. Is it correct that g[vaTid0.x] has four elements, i.e. that every g entry has x, y, z and w components? And is it possible to have a 1D array of single floats instead?
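
To make question 1 concrete, here is the arithmetic I would guess at. It is only a sketch: it assumes that gridBlock holds the number of threads per group (matching dcl_num_thread_per_group) and that gridSize holds the number of groups, which is exactly the part I am unsure about. The Domain3D and GridGuess types and the guess_grid() function are hypothetical stand-ins I made up for illustration, not the real CAL definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for a 3D extent; NOT the real CAL type. */
typedef struct {
    unsigned int width, height, depth;
} Domain3D;

/* Hypothetical mirror of the two fields asked about in question 1. */
typedef struct {
    Domain3D gridBlock; /* assumed: threads per group */
    Domain3D gridSize;  /* assumed: number of groups  */
} GridGuess;

/* Round-up division, so a partially filled last group is still counted. */
unsigned int div_round_up(unsigned int n, unsigned int d)
{
    assert(d > 0);
    return (n + d - 1) / d;
}

/* Treat a width x height domain as a flat 1D list of threads and split it
   into groups of threads_per_group threads each. */
GridGuess guess_grid(unsigned int width, unsigned int height,
                     unsigned int threads_per_group)
{
    GridGuess g;
    unsigned int total_threads = width * height;

    g.gridBlock.width  = threads_per_group;
    g.gridBlock.height = 1;
    g.gridBlock.depth  = 1;

    g.gridSize.width  = div_round_up(total_threads, threads_per_group);
    g.gridSize.height = 1;
    g.gridSize.depth  = 1;

    return g;
}
```

Under these assumptions, guess_grid(256, 256, 64) would give a gridBlock width of 64 and a gridSize width of 65536 / 64 = 1024. Is that the right way to think about it?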