But it would cause sad and testsad to be converted before the condition is checked, which can produce incorrect results.
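To illustrate the kind of error meant here, below is a minimal sketch. It assumes the conversion in question is from int to float, which the thread does not state explicitly. The function names ge_int and ge_float are hypothetical. A 32-bit float cannot represent every integer above 2^24, so a comparison made after conversion can disagree with the integer comparison.

```c
/* Compare two SAD values as integers: returns 1 if sad >= testsad. */
int ge_int(int sad, int testsad)
{
    return sad >= testsad;
}

/* Same comparison, but after converting both values to float
   (assumed conversion, for illustration only). */
int ge_float(int sad, int testsad)
{
    float fsad = (float)sad;
    float ftestsad = (float)testsad;
    return fsad >= ftestsad;
}
```

With sad = 16777216 (2^24) and testsad = 16777217, ge_int reports that the condition is false, while ge_float reports true, because 16777217 rounds to 16777216 when stored as a float.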
Hmm, I have spent many days trying to understand what is going on with my kernel code.
Maybe you can help me. Are these two code chunks equivalent? I mean in terms of logic.
This chunk I execute on the CPU:
int xleft = 0, xright = 16;
int ytop = 0, ybottom = 16;
int temp = 0;
int mvlength = 100000000;
for (int j = 0; j < height; j += 16)
{
    for (int i = 0; i < width; i += 16)
    {
        // set top and bottom range
        ytop = - min(j, 16);
        ybottom = min(height - 16 - j + 1, 16);
        // set left and right range
        xleft = - min(i, 16);
        xright = min(width - 16 - i + 1, 16);
        refsad
        for (int y = ytop; y < ybottom; y++)
        {
            for (int x = xleft; x < xright; x++)
            {
                int srcidx = i + (j * width);
                int index = i + x + ((j + y) * width);
                // calculate SAD
                //--------------------------------
                for (m = 0; m < 16; m++)
                {
                    for (n = 0; n < 16; n++)
                    {
                        temp += abs((src[srcidx + n] - ref[index + n]));
                    }
                    srcidx += width;
                    index += width;
                }
                //-------------------------------
                if ((refsad
                {
                    refsad
                    mvlength = abs(x) + abs(y);
                    refmvx
                    refmvy
                    refmvl
                }
                // reset the integer SAD accumulator for the next candidate
                temp = 0;
            }
        }
        l++;
        mvlength = 100000000;
    }
}
And this one runs as a kernel:
    int ytop = - min(jy, 16);
    int ybottom = min(height - 16 - jy + 1, 16);
    // set left and right range
    int xleft = - min(ix, 16);
    int xright = min(width - 16 - ix + 1, 16);
    int x, y;
    int m, n;
    int mvlength = 100000000;
    sad = 100000000;
    for (y = ytop; y < ybottom; y++)
    {
        for (x = xleft; x < xright; x++)
        {
            int testsad = 0;
            int srcidx = ix + (jy * width);
            int idx = ix + x + ((jy + y) * width);
            for (m = 0; m < 16; m++)
            {
                for (n = 0; n < 16; n++)
                {
                    testsad += (abs((((int)src[srcidx + n]) - ((int)ref[idx + n]))));
                }
                srcidx += width;
                idx += width;
            }
            if ((sad >= testsad) && (mvlength > (abs(y) + abs(x))))
            {
                sad = testsad;
                mvlength = (abs(y) + abs(x));
                mvy = y;
                mvx = x;
                mvl = mvlength;
            }
        }
    }
}
What do you think, is the logic the same? I get different results in mvx and mvy. Maybe you can see a mistake in the kernel code, because I expect exactly the same behaviour.
I think the problem is in the last if:
if ((sad >= testsad) && (mvlength > (abs(y) + abs(x))))
If you need additional code let me know.
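One way to narrow this down is a plain C reference for a single block that uses the same search ranges and the same acceptance test as the kernel, so its output can be compared block by block. The sketch below is only that: it assumes src and ref are 8-bit frames stored row by row, and the names block_sad, search_block and BlockResult are hypothetical, not taken from the original code.

```c
#include <stdlib.h>

#define BLOCK 16

/* Hypothetical container for the result of one block search. */
typedef struct { int mvx, mvy, mvl, sad; } BlockResult;

/* SAD between the 16x16 block at (bx, by) in src and the one at (rx, ry) in ref. */
int block_sad(const unsigned char *src, const unsigned char *ref,
              int width, int bx, int by, int rx, int ry)
{
    int sad = 0;
    for (int m = 0; m < BLOCK; m++)
        for (int n = 0; n < BLOCK; n++)
            sad += abs((int)src[(by + m) * width + bx + n] -
                       (int)ref[(ry + m) * width + rx + n]);
    return sad;
}

static int min_int(int a, int b) { return a < b ? a : b; }

/* Search around the block at (ix, jy) with the kernel's ranges and condition. */
BlockResult search_block(const unsigned char *src, const unsigned char *ref,
                         int width, int height, int ix, int jy)
{
    int ytop = -min_int(jy, 16);
    int ybottom = min_int(height - 16 - jy + 1, 16);
    int xleft = -min_int(ix, 16);
    int xright = min_int(width - 16 - ix + 1, 16);
    int mvlength = 100000000;
    BlockResult r = { 0, 0, 100000000, 100000000 };

    for (int y = ytop; y < ybottom; y++) {
        for (int x = xleft; x < xright; x++) {
            int testsad = block_sad(src, ref, width, ix, jy, ix + x, jy + y);
            int len = abs(y) + abs(x);
            /* Same test as in the kernel: SAD not larger AND vector strictly shorter. */
            if ((r.sad >= testsad) && (mvlength > len)) {
                r.sad = testsad;
                mvlength = len;
                r.mvy = y;
                r.mvx = x;
                r.mvl = mvlength;
            }
        }
    }
    return r;
}
```

Note that with this condition a candidate is accepted only when its SAD is not larger and its vector is strictly shorter than the current best, so a candidate with a lower SAD but a longer vector is rejected and the result depends on the scan order. Whether that is the intended behaviour depends on what the CPU version is supposed to do.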
I'm sure it is a driver problem again.
I have debugged it in CPU mode (I keep forgetting about debug mode) and there are no problems.
The problems occur only in CAL mode.
Can I send the code by email? I am not comfortable publishing it on the forum.
Yes, you can email me at the address mentioned in my profile. I will take a look as soon as I get some free cycles.
Done. Please let me know asap.
Thank you very much!
Gaurav,
Please confirm that you have received my email.
Yes, I have received your mail, but I couldn't find any issue with your code. It seems to be an issue on the driver side. Which Catalyst version are you using?
Could you try it with 9.2?
What is the output of my test? Do you get similar results in CPU and CAL modes?
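To answer that question precisely, the motion vectors from the two runs can be compared block by block. The helper below is a small sketch for that. It assumes the results of each run have been copied into plain int arrays with one entry per block; count_mv_mismatches and its parameters are hypothetical names.

```c
#include <stdio.h>

/* Count blocks whose motion vectors differ between the CPU and CAL runs.
   Prints at most max_report differing blocks and returns the total count. */
int count_mv_mismatches(const int *cpu_mvx, const int *cpu_mvy,
                        const int *cal_mvx, const int *cal_mvy,
                        int num_blocks, int max_report)
{
    int mismatches = 0;
    for (int b = 0; b < num_blocks; b++) {
        if (cpu_mvx[b] != cal_mvx[b] || cpu_mvy[b] != cal_mvy[b]) {
            if (mismatches < max_report)
                printf("block %d: cpu=(%d,%d) cal=(%d,%d)\n",
                       b, cpu_mvx[b], cpu_mvy[b], cal_mvx[b], cal_mvy[b]);
            mismatches++;
        }
    }
    return mismatches;
}
```

A return value of 0 means both modes produced identical vectors for every block. Otherwise, the printed block indices show where to look first.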