Hi
I am using a lookup table that is defined inside the kernel source itself. The kernel seems to use the table correctly in CPU mode but not in GPU mode. Is it not allowed to define a table like this in a kernel for the GPU, or should the table be passed to the kernel as an argument?
I defined the table like this in the .cl file:
const unsigned char lookup_table[256]={.....};
I am getting weird results in GPU mode when accessing the table.
Please confirm
I use a lookup table too and I do not have any problems.
Hi nou,
I am still facing the issue in GPU mode. Please find below the part of the kernel code where the lookup table is used:
output[(ty - 1) * 3 * lineLen + (tx - 1) * 3 + 0] = lookup_table[output[(ty - 1) * 3 * lineLen + (tx - 1) * 3 + 0]];
output[(ty - 1) * 3 * lineLen + (tx - 1) * 3 + 1] = lookup_table[output[(ty - 1) * 3 * lineLen + (tx - 1) * 3 + 1]];
output[(ty - 1) * 3 * lineLen + (tx - 1) * 3 + 2] = lookup_table[output[(ty - 1) * 3 * lineLen + (tx - 1) * 3 + 2]];
I am not sure why this is not working properly. My guess is that there is some issue with how lookup_table is defined and used in the kernel.
Out of curiosity, does the CL compiler do common subexpression elimination? Does the expression ((ty - 1) * 3 * lineLen + (tx - 1) * 3) get handled efficiently?
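Whether or not the compiler eliminates it, the repeated offset can be hoisted by hand so the result does not depend on the optimizer. A minimal sketch in plain C (the same body should also be valid inside an OpenCL kernel); remap_pixel is a hypothetical helper name, and the parameters simply mirror the identifiers in the fragment above:

```c
/* Hypothetical helper: remaps the three channels of one pixel in place.
   The names mirror the kernel fragment above; the function itself is an
   illustration, not the original kernel. */
void remap_pixel(unsigned char *output, const unsigned char *lookup_table,
                 int tx, int ty, int lineLen)
{
    /* Compute the shared offset once instead of six times. */
    const int base = (ty - 1) * 3 * lineLen + (tx - 1) * 3;
    int c;

    for (c = 0; c < 3; ++c)
        output[base + c] = lookup_table[output[base + c]];
}
```

This also makes the three assignments easier to read and check against the buffer bounds.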
Originally posted by: pavandsp
const unsigned char lookup_table[256]={.....};
Shouldn't it be "constant" instead of "const"?
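Something like the sketch below is what I have in mind: a program-scope table placed in the constant address space. This is only an illustration under assumptions. LUT_SPACE and remap are made-up names, the three table entries are placeholders because the real contents are elided in the original post, and the #else branch exists only so the same text also compiles as ordinary host C.

```c
/* An OpenCL C compiler predefines __OPENCL_VERSION__, so there the table
   lands in the __constant address space. The #else branch only exists so
   the same text also compiles as ordinary host C. LUT_SPACE is a made-up
   name. */
#ifdef __OPENCL_VERSION__
#define LUT_SPACE __constant
#else
#define LUT_SPACE const
#endif

/* Placeholder contents: the real 256 entries are elided in the original
   post. Entries not listed are zero-initialized. */
LUT_SPACE unsigned char lookup_table[256] = { 255, 254, 253 };

/* Hypothetical helper showing a read from the table. */
unsigned char remap(unsigned char v)
{
    return lookup_table[v];
}
```

If the GPU results are still wrong after this change, the declaration is probably not the only problem.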