I found a line in my kernel code that works fine for int2 or int3, but not for int. In the following code the "Do something" branch is never executed when dfdx has type int as shown, even when it should be.
int dfdx;
...
if ( any(dfdx != 0) )
    ; // Do something
else
    ; // Do something else
I'm wondering if this is a bug or not. I checked the specification and it seems to me that it is. I'm using APP SDK 2.7 and Catalyst 12.8 under Windows 7 64-bit. I ran the program on both my FX 8120 and my Radeon HD 7950; both devices exhibited the same problem. When I changed "any(dfdx != 0)" to just "dfdx != 0" it worked as expected for int, but then I can't use my normal templates.
Has anyone else encountered this behavior?
Thanks!
I think what you are seeing is a quirk of booleans in OpenCL C that has been on my list of annoyances since the first version was released. You see, truth for vectors is -1, truth for scalars is 1.
The result then would be that:
if dfdx is nonzero
(dfdx != 0) == 1
any(1) is false, because any() only tests the most significant bit
However, if you had (int2)(dfdx)
((int2)(dfdx) != (int2)(0, 0)) == (-1, -1)
any((-1, -1)) == true
See 6.12.6:
"The functions isequal, isnotequal, isgreater, isgreaterequal, isless, islessequal, islessgreater,
isfinite, isinf, isnan, isnormal, isordered, isunordered and signbit described in table 6.14 shall
return a 0 if the specified relation is false and a 1 if the specified relation is true for scalar
argument types. These functions shall return a 0 if the specified relation is false and a –1 (i.e. all
bits set) if the specified relation is true for vector argument types"
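To make the two truth conventions concrete, here is a minimal sketch in plain C (not OpenCL C) that models the behaviour described above. The helper names ne_scalar, ne_component and any_model are hypothetical and only stand in for the scalar comparison, the per-component vector comparison, and any(), which looks only at the most significant bit of each component.

```c
#include <stdint.h>

/* Plain C model of the OpenCL C behaviour discussed in this thread.
   All helper names are hypothetical; none of them is an OpenCL API. */

/* Scalar comparison: true is 1, exactly as in C. */
static int32_t ne_scalar(int32_t a, int32_t b)
{
    return a != b;
}

/* Vector comparison, one component: true is -1 (all bits set). */
static int32_t ne_component(int32_t a, int32_t b)
{
    return (a != b) ? -1 : 0;
}

/* Model of any(): 1 if the most significant bit of any component is set. */
static int any_model(const int32_t *x, int n)
{
    for (int i = 0; i < n; ++i)
        if ((uint32_t)x[i] >> 31)
            return 1;
    return 0;
}
```

Feeding the scalar result (1) to any_model gives 0, while a two-component result of (-1, -1) gives 1, which matches what the original poster observed for int versus int2.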
Hi Lee,
Thank you for pointing out that subtlety in the specification. I now recall reading it once a long time ago, but this is the first time it really hit me. Do you happen to know why it is defined this way? I guess for compatibility that we're stuck with this annoyance.
I've been thinking of an alternative we could use to get the desired effect for intn, n = 1, ..., 16.
intn dfdx;
...
if ( any( select((intn)(0), (intn)(-1), dfdx != 0) ) )
    ; // Do something
else
    ; // Do something else
What do you think?
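A rough plain C model of why the select() idea should behave the same for both cases, assuming the usual select(a, b, c) rules: a scalar selects on c being nonzero, while a vector selects per component on the most significant bit of c. The function names below are hypothetical stand-ins, not OpenCL built-ins.

```c
#include <stdint.h>

/* Hypothetical plain C stand-ins for select(a, b, c); not OpenCL code. */

/* Scalar rule: result = c ? b : a. */
static int32_t select_scalar(int32_t a, int32_t b, int32_t c)
{
    return c ? b : a;
}

/* Vector rule, one component: result = MSB(c) set ? b : a. */
static int32_t select_component(int32_t a, int32_t b, int32_t c)
{
    return ((uint32_t)c >> 31) ? b : a;
}

/* The test that any() applies to each component. */
static int msb_set(int32_t x)
{
    return (int)((uint32_t)x >> 31);
}
```

A scalar truth of 1 and a vector truth of -1 both come out as -1 after the select, so the most significant bit is set either way.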
By the way, due to the above subtlety I just realized that fast_normalize(0.f)--I assume also normalize(0.f)--is implementation dependent, whereas fast_normalize( (float2)(0.f) ), etc., are strictly defined to be within 8192 ulps of (float2)(0.f), see Table 6.13 Scalar and Vector Argument Built-in Geometric Function Table.
floatn fast_normalize (floatn p)
Returns a vector in the same direction as p but with a length of 1. fast_normalize is computed as:
p * half_rsqrt (p.x² + p.y² + … )
The result shall be within 8192 ulps error from the infinitely precise result of
if (all(p == 0.0f))
result = p;
else
result = p / sqrt (p.x² + p.y² + ... );
with the following exceptions:
...
2) If the sum of squares is less than FLT_MIN then the implementation may return back p.
Cheers,
Sean
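As a sketch of the reading above, this plain C model only checks whether the all(p == 0.0f) guard in the reference formula would be taken for a zero input. It assumes all() tests the most significant bit just as any() does. The function names are hypothetical, and nothing here calls a real OpenCL function.

```c
#include <stdint.h>

/* Hypothetical plain C model of the all(p == 0.0f) guard; not OpenCL code. */

/* Scalar case: p == 0.0f yields 0 or 1, and all() tests only the MSB. */
static int guard_taken_scalar(float p)
{
    int32_t truth = (p == 0.0f);          /* 1 when p is zero */
    return (int)((uint32_t)truth >> 31);  /* modelled all(): MSB test */
}

/* float2 case: each component compares to 0 or -1; all() needs every MSB set. */
static int guard_taken_float2(float x, float y)
{
    int32_t tx = (x == 0.0f) ? -1 : 0;
    int32_t ty = (y == 0.0f) ? -1 : 0;
    return (int)(((uint32_t)tx >> 31) & ((uint32_t)ty >> 31));
}
```

Under these assumptions the guard is skipped for a scalar zero but taken for (float2)(0.f), which is the asymmetry pointed out above.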
"I've been thinking of an alternative we could use to get the desired effect for intn, n = 1, ..., 16."
If you left shift the boolean expression 31 bits it should work for both "truth systems", where 1 or -1 equal "true".
This will work with variables or with comparisons like below. I've had the same problems!
if ( any( (dfdx != 0) << 31 ) )
    ; // Do something
else
    ; // Do something else
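For 32-bit components the effect of the shift can be checked with a small plain C model. The function name is hypothetical, and the shift is done on an unsigned value because left-shifting a negative signed int is undefined in standard C.

```c
#include <stdint.h>

/* Hypothetical plain C model: shift a 32-bit truth value left by 31 bits,
   then apply the most-significant-bit test that any() performs. */
static int any_after_shift31(int32_t truth)
{
    uint32_t shifted = (uint32_t)truth << 31;  /* unsigned shift avoids C undefined behaviour */
    return (int)(shifted >> 31);               /* 1 if the MSB is set */
}
```

Both 1 (scalar truth) and -1 (vector truth) end up with the top bit set, and 0 stays 0.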
I like the shifting idea as well, but we would have to be a little more careful for it to work in general for a templated dfdx. If dfdx is longn, ulongn, or doublen, then the 31-bit shift shown above would be 32 bits too few; similarly, for charn or ucharn it would be 24 bits too many, and for shortn or ushortn 16 bits too many (although I don't recall if shifting by too many bits is a compile error).
if ( any( (dfdx != 0) << (8 * sizeof(dfdx) / vec_step(dfdx) - 1) ) )
    ; // Do something
else
    ; // Do something else
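The shift amount in that expression is just "bits per component minus one". Here is a plain C sketch of the arithmetic, with a 64-bit check showing why a fixed shift of 31 falls short when a truth value of 1 sits in a 64-bit component. The function names are hypothetical; the total_bytes and components parameters stand in for sizeof(dfdx) and vec_step(dfdx).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bits per component minus one, computed from the
   total size in bytes and the component count. */
static unsigned msb_shift(size_t total_bytes, size_t components)
{
    return (unsigned)(8 * total_bytes / components - 1);
}

/* Shift a 64-bit truth value and test its most significant bit. */
static int msb_after_shift64(int64_t truth, unsigned shift)
{
    uint64_t shifted = (uint64_t)truth << shift;  /* unsigned shift avoids C undefined behaviour */
    return (int)(shifted >> 63);
}
```

For a 16-byte, 4-component int vector this gives 31; for a 32-byte, 4-component long vector it gives 63. A value of 1 shifted by 31 in a 64-bit component still has its top bit clear, while a shift of 63 sets it.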
However, the above won't work for user-created data types, and maybe not for bool or half either. I think we can all agree that this is overly complicated for something that should be intuitive.
"I like the shifting idea as well, but we would have to be a little more careful for it to work in general for a templated dfdx. If dfdx is longn, ulongn, or doublen, then as shown above the shift would be 32 bits too few, and similarly for charn, ucharn, shortn, or ushortn, except 24 or 16 bits too many"
No (or yes!), it will work with any data type, because the compiler converts the conditional expression inside the parentheses (0 != any_var) to a logical or boolean type, which is 32 bits on the OpenCL platform. You can check by evaluating sizeof(bool).
"I think we can all agree that this is overly complicated for something that should be intuitive."
Agreed. When I first saw the any() function, I thought, now that's simple...
I presume it's done that way because someone felt it would be a good idea for OpenCL vectors to map trivially to SSE intrinsics but for the rest of OpenCL C to map directly to C standard operations. Beyond that I don't understand it. I originally reported it as a bug when I found the issue, just like you did in this thread.