I'm new in OpenCL and I'm trying to implement power iteration method (described over here)
matrix sizes over 100000x100000!
Actually I have no idea how to implement this.
It's because workgroup have restriction CL_DEVICE_MAX_WORK_GROUP_SIZE (so I can't make one workgoup with 1000000 work-items)
But on each step of iterating I need to synchronize and normalize vector.
1) So is it possible to make all calculations inside one kernel? (I think that answer is no if matrix sizes is more than CL_DEVICE_MAX_WORK_GROUP_SIZE)
2) Can I make "while" loop in the host code? and is it still profitable to use GPU in this case?