Kernel doesn't seem to compile when the source code is very long
Hi,
I'm trying to build source code programmatically to speed up the multiplication of a sparse matrix by a vector. I have a matrix A that contains many zeros, and I want to compile code that skips those zeros automatically, without testing for them at runtime.
So the output of the generator is a string (the kernel source) that hardcodes a given matrix to accelerate the matrix/vector product. An example of the output, for a 6x6 matrix A that contains some zeros, is the kernel at the end of this post.
As one might expect, the source becomes quite big when the dimension of the matrix increases, and I can't compile it in a reasonable time for a matrix of, say, dimension 600x600.
Is there a problem with this approach? Maybe the nested ifs become too complicated to compile? They are important to maintain (more or less) the same execution path in the warp.
Thanks in advance
__kernel void MatrMult(__global float * b, __global float * resp)
{
    int i = get_global_id(0);
    if (i <= 2) {
        if (i <= 1) {
            if (i == 0) { resp[0] = -b[0]+b[1]+0.03*b[3]; }
            else { resp[1] = b[0]-0.07*b[1]+0.1*b[4]-0.11*b[5]; }
        } else {
            resp[2] = -0.13*b[1]-b[2]-b[3]+b[4]+b[5];
        }
    } else {
        if (i <= 4) {
            if (i == 3) { resp[3] = b[3]+b[4]; }
            else { resp[4] = b[1]-0.26*b[2]-0.27*b[3]+0.28*b[4]; }
        } else {
            resp[5] = -0.32*b[2]-0.33*b[3]+0.34*b[4]-0.35*b[5];
        }
    }
}