I want to run DMA transfers asynchronously while the stream processor is computing. So I use "calCtxRunProgram()" to run a kernel on the GPU and "calMemCopy()" to start a DMA transfer. But I find that the two do not overlap (they never finish out of order), so performance does not improve.
The code and results are as follows. (PS: tempMem is in remote memory, temp1Mem is in local memory, and the kernel "func" does not access tempMem or temp1Mem.)
1. code:
calCtxRunProgram(&e, ctx, func, &domain);
while (calCtxIsEventDone(ctx, e) == CAL_RESULT_PENDING)
    count++;     /* polls until the kernel finishes */
calMemCopy(&e1, ctx, tempMem, temp1Mem, 0);
while (calCtxIsEventDone(ctx, e1) == CAL_RESULT_PENDING)
    count1++;    /* polls until the copy finishes */
result:
count=1579; count1=2807;
2. code:
calCtxRunProgram(&e, ctx, func, &domain);
calMemCopy(&e1, ctx, tempMem, temp1Mem, 0);
while (calCtxIsEventDone(ctx, e1) == CAL_RESULT_PENDING)
    count1++;    /* polls on the copy event only */
result:
count1=4450;
3. code:
calCtxRunProgram(&e, ctx, func, &domain);
calMemCopy(&e1, ctx, tempMem, temp1Mem, 0);
while (calCtxIsEventDone(ctx, e) == CAL_RESULT_PENDING)
    count++;     /* polls on the kernel event only; e1 is never checked */
result:
count=4392;
From the three tests, I find that in code 2 and code 3 "calCtxRunProgram()" and "calMemCopy()" always complete together, and the total time is the sum of the two. Can anyone tell me the reason, and how I can use DMA asynchronously while the stream processor is computing? Please!
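The "sum of the two" observation can be checked with a little arithmetic on the reported poll counts. This is only a rough sketch: it assumes every polling iteration costs about the same time, which the tests do not establish, and the helper names serialized_prediction and percent_off are made up for illustration.

```c
#include <stdlib.h>

/* Poll count expected if the copy only starts after the kernel has finished. */
static long serialized_prediction(long kernel_polls, long copy_polls)
{
    return kernel_polls + copy_polls;
}

/* Relative gap, in percent, between a measured count and the prediction. */
static double percent_off(long measured, long predicted)
{
    return 100.0 * (double)labs(measured - predicted) / (double)predicted;
}
```

With the numbers from test 1, serialized_prediction(1579, 2807) is 4386. The counts from tests 2 and 3 (4450 and 4392) are within about 1.5% and 0.2% of that, which is what fully serialized execution would look like under the assumption above.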
It's a good question for better performance.
Is anyone here programming in CAL?
xxhlyf,
Could you post some more information about your OS, Card, SDK, driver version?
Windows XP + FS9170 + SDK 1.3 beta + FS Driver 8.561 for WinXP
Does anybody know how to use DMA asynchronously while the stream processor is computing? Please give me some example code!
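This is not an answer to the overlap question itself, only a diagnostic sketch. Tests 2 and 3 each poll a single event, so they cannot show which operation finishes first; polling both events in the same loop would. The C below models that loop against a mock device so that it compiles without the CAL SDK. MockDevice, mock_tick, and poll_both are hypothetical names, not CAL calls, and the tick counts are simply the poll counts from test 1 reused as arbitrary units. In real code the two "still pending" checks would presumably be calCtxIsEventDone(ctx, e) and calCtxIsEventDone(ctx, e1) compared against CAL_RESULT_PENDING.

```c
/* Mock device, NOT the CAL API: a stand-in so the polling pattern can be run. */
typedef struct {
    int kernel_left;   /* ticks of kernel work remaining */
    int copy_left;     /* ticks of DMA work remaining */
    int overlapped;    /* 1: copy runs alongside the kernel; 0: copy waits for it */
} MockDevice;

typedef struct {
    int kernel_polls;  /* loop iterations during which the kernel was pending */
    int copy_polls;    /* loop iterations during which the copy was pending */
} PollCounts;

/* Advance the mock device by one unit of time. */
static void mock_tick(MockDevice *d)
{
    int kernel_was_running = d->kernel_left > 0;
    if (kernel_was_running)
        d->kernel_left--;
    /* In serialized mode the copy makes no progress while the kernel runs. */
    if (d->copy_left > 0 && (d->overlapped || !kernel_was_running))
        d->copy_left--;
}

/* Poll both operations in ONE loop and count how long each stays pending. */
static PollCounts poll_both(MockDevice d)
{
    PollCounts c = {0, 0};
    while (d.kernel_left > 0 || d.copy_left > 0) {
        if (d.kernel_left > 0)
            c.kernel_polls++;
        if (d.copy_left > 0)
            c.copy_polls++;
        mock_tick(&d);
    }
    return c;
}
```

With the mock set to serialized (overlapped = 0), a 1579-tick kernel and a 2807-tick copy give kernel_polls = 1579 and copy_polls = 4386; with overlapped = 1 they give 1579 and 2807. If counts measured on real hardware look like the first pair, the copy is waiting behind the kernel.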