A float8 structure should work fine. Brook+ will convert this structure into two separate float4 CAL buffers, which is similar to using two float4 streams.
The 8192x8192 limit applies to the stream. I think the execution domain can be larger than 8192x8192, but I have not tested it.
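To picture what that split looks like, here is a minimal host-side sketch in plain C++. It does not call Brook+ or CAL and is not what Brook+ does internally; Float4, Float8 and splitFloat8 are hypothetical names made up for illustration. It only shows an array of 8-float records being laid out as two 4-float arrays.

```cpp
#include <utility>
#include <vector>

// Hypothetical host-side types (assumption: not Brook+ declarations).
struct Float4 { float x, y, z, w; };
struct Float8 { Float4 lo, hi; };  // 8 floats viewed as two 4-float halves

// Split an array of 8-float records into two 4-float arrays,
// one holding the first four floats of each record, one the last four.
std::pair<std::vector<Float4>, std::vector<Float4>>
splitFloat8(const std::vector<Float8>& in) {
    std::vector<Float4> lo;
    std::vector<Float4> hi;
    lo.reserve(in.size());
    hi.reserve(in.size());
    for (const Float8& e : in) {
        lo.push_back(e.lo);
        hi.push_back(e.hi);
    }
    return {lo, hi};
}
```

Element i of the original array ends up as element i of both output arrays, so the two buffers stay index-aligned, much like two float4 streams of the same size.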
Hope Brook+ will also merge the CAL buffers correctly when I call stream write() to put the results back into host memory
Yes, it does. But, as you can guess, this merging adds some performance overhead.
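A rough sketch of why the merge costs something, again in plain C++ with hypothetical names (Float4, Float8, mergeFloat8) and no Brook+ or CAL calls: putting two float4 buffers back into one array of 8-float records takes an interleaving pass over every element, where a single buffer could be copied out in one block.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical host-side types (assumption: not Brook+ declarations).
struct Float4 { float x, y, z, w; };
struct Float8 { Float4 lo, hi; };

// Interleave two index-aligned 4-float arrays back into 8-float records.
// This per-element loop is the extra work compared with one bulk copy.
std::vector<Float8> mergeFloat8(const std::vector<Float4>& lo,
                                const std::vector<Float4>& hi) {
    if (lo.size() != hi.size()) {
        throw std::invalid_argument("buffer sizes differ");
    }
    std::vector<Float8> out(lo.size());
    for (std::size_t i = 0; i < lo.size(); ++i) {
        out[i].lo = lo[i];
        out[i].hi = hi[i];
    }
    return out;
}
```

How much this matters in practice depends on the stream size and on how Brook+ actually implements the copy, which this sketch does not model.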
Going over 8192 produces an error.