Hello,
I have made my first project with Brook on my ATI Radeon HD 3870.
It computes the sum of two matrices (the same kind of code as in the samples).
I have one version using float and one version using float4.
My two kernels are:
kernel void sum(float a<>, float b<>, out float c<>)
{
    c = exp(a) + exp(b);
}
kernel void sum(float4 a<>, float4 b<>, out float4 c<>)
{
    c = exp(a) + exp(b);
}
With 10000 iterations of the kernel, I don't see the float4 version being any faster: both versions take approximately the same time. (On the CPU, the same computation takes about 10 times as long as on the GPU.)
Thanks a lot.
Regards,
Jonathan