Hi,
I just moved from Brook to CAL, and I am having some difficulty passing a
stream sub-domain as input to a kernel.
Here is a short example. Assume I have 3 CAL stream memory resources:
CALresource A_Res; // 256*256 float4
CALresource B_Res; // 16*16 float4
CALresource C_Res; // 16*16 float4
// Resource allocation
calResAllocLocal2D(&A_Res, device, 256, 256, CAL_FORMAT_FLOAT4, 0);
calResAllocLocal2D(&B_Res, device, 16, 16, CAL_FORMAT_FLOAT4, 0);
calResAllocLocal2D(&C_Res, device, 16, 16, CAL_FORMAT_FLOAT4, 0);
// Memory binding
CALmem A_Mem; calCtxGetMem(&A_Mem, ctx, A_Res);
CALmem B_Mem; calCtxGetMem(&B_Mem, ctx, B_Res);
CALmem C_Mem; calCtxGetMem(&C_Mem, ctx, C_Res);
Assume now that I have a kernel that computes the 16x16 C values by using
B and 16x16 subdomains of A (i.e. sliding 16x16 windows originating at
x,y coordinates), such as:
C = function of ( A.domain(x,y,x+16,y+16) , B ) with x and y variables
kernel void k_compute(out float4 c<>, float4 a<>, float4 b<>) {
....
}
So ahead of kernel execution there is a memory binding:
calCtxSetMem(ctx, i0, A_Mem); // should I specify the input stream domain here?
calCtxSetMem(ctx, i1, B_Mem);
calCtxSetMem(ctx, o0, C_Mem);
CALdomain domain = {0, 0, 16, 16}; // this is the OUTPUT computation domain (x, y, width, height)
calCtxRunProgram(&e, ctx, func, &domain);
QUESTIONS:
(1) How do I specify a sub-domain of stream A as an INPUT to my CAL kernel?
(2) Is this to be done when calCtxGetMem maps the resource to a CALmem? If yes, how?
Thanks for guidance.
Specifying a sub-domain for a resource is not possible in CAL. You can only specify the domain of execution, which is passed to calCtxRunProgram when you run the program.
The other way of doing this is to pass constants (holding the domain offset values) to your IL shader, and use them there to calculate the input address to fetch from the textures.
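As a rough CPU-side illustration of that offset approach (plain C, not CAL or IL; the names float4_t, window_add, A_W, A_H and WIN are made up for this sketch), each output element (i, j) of the 16x16 execution domain fetches A at (x + i, y + j). The float4 add only stands in for whatever the real kernel computes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one float4 stream element. */
typedef struct { float x, y, z, w; } float4_t;

/* Sizes taken from the example in the question. */
enum { A_W = 256, A_H = 256, WIN = 16 };

/* CPU emulation of the kernel: c(i,j) = a(x+i, y+j) + b(i,j).
 * a is A_W*A_H row-major; b and c are WIN*WIN row-major.
 * (x, y) is the window origin, i.e. the values the real kernel
 * would receive as constants. */
static void window_add(const float4_t *a, const float4_t *b, float4_t *c,
                       int x, int y)
{
    /* The whole window must lie inside A. */
    assert(x >= 0 && y >= 0 && x + WIN <= A_W && y + WIN <= A_H);

    for (int j = 0; j < WIN; ++j) {        /* row in the output domain */
        for (int i = 0; i < WIN; ++i) {    /* column in the output domain */
            /* Offset fetch address in A. */
            size_t ia = (size_t)(y + j) * A_W + (size_t)(x + i);
            /* B and C are indexed by the output position itself. */
            size_t ic = (size_t)j * WIN + (size_t)i;

            c[ic].x = a[ia].x + b[ic].x;
            c[ic].y = a[ia].y + b[ic].y;
            c[ic].z = a[ia].z + b[ic].z;
            c[ic].w = a[ia].w + b[ic].w;
        }
    }
}
```

The execution domain stays {0, 0, 16, 16} in this picture; only the fetch address into A moves with x and y.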
Hi Gaurav,
So if I understand correctly what I should do is to specify my kernel as follows
(rough sketch)
kernel void k_compute(out float4 c<>, float4 a[][], float4 b<>, int x, int y) {
int2 indexC = instance().xy;
int2 indexA = { indexC.x + x, indexC.y + y };
c = a[indexA] + b;
}
Compile it (with the Brook compiler) and use the generated IL as the CAL kernel, while still calling
calCtxSetMem(ctx, i0, A_Mem);
calCtxSetMem(ctx, i1, B_Mem);
calCtxSetMem(ctx, o0, C_Mem);
CALdomain domain = {0, 0, 16, 16};
calCtxRunProgram(&e, ctx, func, &domain);
Correct?
Thanks
Jean-Claude
Exactly, this needs to be done.
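One thing worth checking on the host before each run is that the offsets keep the window inside A. A minimal sketch in plain C, assuming the 256x256 input and 16x16 window from the question (max_origin and origin_is_valid are hypothetical helper names, not CAL functions):

```c
#include <stdbool.h>

/* Largest valid origin along one axis: the last position at which
 * a window of size win still fits inside extent. */
static int max_origin(int extent, int win)
{
    return extent - win;
}

/* True if a win x win window with origin (x, y) lies fully inside
 * an a_w x a_h input. */
static bool origin_is_valid(int x, int y, int a_w, int a_h, int win)
{
    return x >= 0 && y >= 0 &&
           x <= max_origin(a_w, win) &&
           y <= max_origin(a_h, win);
}
```

For the sizes here, x and y can each range over 0..240.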
Ok, will do
Thanks a lot for your guidance
Jean-Claude