I want to work on rectangular subsets of a two-dimensional (CPU) matrix in kernels. Are the one- and two-dimensional .domain operators efficient in the sense that no elements are copied? Does the efficiency of the .domain operators depend on the address-virtualization compiler switch (all vector dimensions are below the 8192-element limit in my application)?
// simple vector dot product
reduce void dot(double v1<>, double v2<>, reduce double result<>)
{
    result += v1 * v2;
}

int main(int argc, char** argv)
{
    int i;
    int2 start, end;
    double a<100, 100>;
    double ret<100>;
    for (i = 0; i < 100; ++i)
    {
        start = int2(3,0);
        end = int2(3,50);
        // ... call dot on the .domain-restricted sub-streams of a and v here ...
        // now ret = dot product of row i of a with v
        // restricted to the first 50 columns of a / elements of v
    }
}
The example is for demonstration only (I know that the loop over i
could be performed by the kernel in this simple example).
My question is: do both domain operators perform only address calculations, such as computing offsets and step sizes, which take constant time and are negligible on large matrices, or do they allocate new arrays in memory and fill them with the selected elements?
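To make the distinction concrete, here is a rough sketch in plain C (not Brook+, and not a claim about how .domain is actually implemented) of the two behaviours I have in mind for a row-major matrix. The names view2d, make_view, view_at and copy_block are made up for this illustration only.

```c
#include <stdlib.h>

/* Hypothetical view type: a base pointer plus offset/stride bookkeeping. */
typedef struct {
    double *base;   /* first element of the selected rectangle */
    int rows, cols; /* extent of the rectangle */
    int stride;     /* elements between consecutive rows of the parent matrix */
} view2d;

/* Variant A: constant-time address calculation only; no element is copied. */
static view2d make_view(double *m, int parent_cols,
                        int r0, int c0, int rows, int cols)
{
    view2d v = { m + (size_t)r0 * parent_cols + c0, rows, cols, parent_cols };
    return v;
}

/* Read element (r, c) of the rectangle through the view. */
static double view_at(view2d v, int r, int c)
{
    return v.base[(size_t)r * v.stride + c];
}

/* Variant B: allocate a new array and copy rows*cols elements into it.
   The caller owns the result and must free() it. */
static double *copy_block(const double *m, int parent_cols,
                          int r0, int c0, int rows, int cols)
{
    double *out = malloc((size_t)rows * cols * sizeof *out);
    if (!out)
        return NULL;
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            out[r * cols + c] = m[(size_t)(r0 + r) * parent_cols + (c0 + c)];
    return out;
}
```

Variant A costs the same regardless of the size of the rectangle and stays aliased to the parent matrix, while variant B costs time and memory proportional to the number of selected elements. I would like to know which of the two the .domain operators correspond to.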