
dmitriysabitov
Journeyman III

OpenCL and power iteration method (eigendecomposition)

I'm new to OpenCL and I'm trying to implement the power iteration method (described over here).

The matrix sizes are over 100000x100000!

Actually I have no idea how to implement this.

The problem is that a work-group is limited by CL_DEVICE_MAX_WORK_GROUP_SIZE (so I can't make one work-group with 1000000 work-items).

But at each step of the iteration I need to synchronize and normalize the vector.

1) So is it possible to do all the calculations inside one kernel? (I think the answer is no if the matrix size is larger than CL_DEVICE_MAX_WORK_GROUP_SIZE.)

2) Can I put the "while" loop in the host code? And is it still profitable to use the GPU in that case?

Something like:

while (condition)
{
    kernel calling
    synchronization
}

nou
Exemplar

1) Does iteration k+1 depend on the result of iteration k, so that you need global synchronization? Global synchronization happens only between separate kernel executions, so each step that needs it has to be its own kernel launch.

2) Yes, 100000 work-items can be profitable. If you are enqueueing small kernels, it is best to enqueue several of them before calling any synchronization. For example:


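int i = 0;
while (i < 100000) {
    /* enqueue one kernel launch; the argument list is omitted here */
    clEnqueueNDRangeKernel(...);
    /* flush only once every 100 enqueues, not on every iteration */
    if (i % 100 == 0) clFlush(queue);
    i++;
}
/* block until everything that was enqueued has finished */
clFinish(queue);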


This way your OpenCL driver can batch the kernels together, which makes execution more efficient.
