I have been trying to compute a vector dot product on the GPU with Brook+, but I am getting strange results: the answer should be 4, but I get 7.012. The dot product is of a vector with itself. Can anyone tell me why this happens?

The array y[] contains {1.0f, 1.0f, 1.0f, 1.0f}.

kernel void multiply(float a<>, float b<>, out float c<>)
{
    c = a * b;
}

reduce void constSum(float x<>, reduce float result)
{
    result += x;
}

unsigned int m = 4;
float dp = 0.0f;

Stream<float> yStrm(1, &m);
Stream<float> tmpStrm(1, &m);
yStrm.read(y);

// Kernel Calls:
multiply(r0Strm, r0Strm, tmpStrm);
constSum(tmpStrm, dp);

std::cout << "GPU Residual = " << dp << std::endl;

Did you check errorLog on your tmpStrm? Does the CPU backend give the correct result?
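
If it helps to have a known-good value to compare against, here is a minimal plain C++ sketch of the same computation on the host. It is not Brook+ code, and the name cpu_dot is only an illustration; it simply mirrors the multiply kernel followed by the constSum reduction.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference dot product (hypothetical helper, not part of Brook+).
float cpu_dot(const std::vector<float>& a, const std::vector<float>& b)
{
    assert(a.size() == b.size());  // both vectors must have the same length

    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Per-element product (what the multiply kernel does),
        // accumulated into one scalar (what the constSum reduction does).
        sum += a[i] * b[i];
    }
    return sum;
}
```

Calling cpu_dot(y, y) on the four-element y[] from the question gives a reference value to check the GPU output against.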

As a side note, you can use the intrinsic function dot(x, y) directly to calculate a vector dot product.

kernel void test(float4 a<>, float4 b<>, out float c<>)
{
    c = dot(a, b);
}
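
To show what the float4 version computes, here is a rough host-side C++ sketch, assuming the input is packed four floats per stream element. Float4, dot4 and packed_dot are hypothetical stand-ins, not Brook+ types or functions. Each element yields one partial dot product; for a four-element vector that single value is already the answer, while longer vectors would still need the partial results summed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a float4 stream element.
using Float4 = std::array<float, 4>;

// Four-component dot product, emulating what dot(a, b) returns per element.
float dot4(const Float4& a, const Float4& b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

// Dot product of two vectors stored as packed groups of four floats.
float packed_dot(const std::vector<Float4>& a, const std::vector<Float4>& b)
{
    assert(a.size() == b.size());  // same number of packed elements

    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += dot4(a[i], b[i]);  // add up the per-element partial results
    }
    return sum;
}
```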