cancel
Showing results for
Did you mean:

# Archives Discussions ## Best GPU algorithm for solving 4x4 linear equation system

Hi,

What is the best GPU algorithm for solving a linear equation system Ax = b, where A is a 4x4 or 3x3 matrix?

The equation is to be solved in one work item.

Vis Cocoa

10 Replies I'm not aware of a backslash operator available yet for OpenCL.  Does anyone know of one?  If so, we'd love to add it to Jacket and ArrayFire.

Also, I assume you're going to do many 4x4 or 3x3 matrices in a big batch.  Doing just one of these is going to be slower on the GPU than it would be on the CPU (i.e. not enough data to get benefit from exploiting data-parallelism).

-John I have just got an idea for 3x3

[a b c] x = d

det [a b c] = dot (a, cross (b, c));

det [d b c] = dot (d, cross (b, c));

det [a d c] = dot (a, cross (d, c));

det [a b d] = dot (a, cross (b, d));

x = (float3)(det[d b c], det [a d c], det [a b d]) / det [a b c];

Seems to be an efficient way? Any better ideas?  Staff

Your 3x3 method assumes the matrix is invertible.  Is that guaranteed for your case? Hi Jeff,

Nice to see your reply. No, the matrix is not guaranteed to be invertible. I wonder what if A is a singular matrix? I hope it results in an infinite float number I have not make the kernel work yet. I am currently struggling with printf, which sometimes prints wrong characters, sometimes does not work at all. I appreciate if you give me some ideas on printing floats.

Thank you!

Vis Cocoa  Staff

Row reduction is guaranteed to work even for singular matrices, but then I don't have a GPU friendly algorithm for that!  Thanks Jeff.

For Ax = b, if A is singular, the equation has either no solutions or infinitely many solutions.

I am working on a ray tracer. I can safely treat both cases as no solutions.

I have tested the triple product based method. It is more efficient than my previous method using scalar calculations  Journeyman III

Since you said you are working on a ray tracer I assume the 4x4 matrix would be a transformation matrix in which case it would be guaranteed to be invertible (that is if the transformation is linear, which I think it is, you would have to check). In this case I would do something similar to your 3x3 method of going for the determinant but for a 4x4 matrix, I suggest you use mathcad or matlab with the simbolic toolbox to make things easier and less error prone. Iterative methods for a matrix this size are not efficient, by the second iteration you probably could have already solved it with gauss, LU, etc.. thank you John!  When working on CPUs, I like Singular-Value-Decomposition based method. A CPU core is much more powerful than a GPU core  