I found this in cl_platform.h from the ATI Stream SDK v2.3 (Windows XP SP3, 32-bit x86):
/* cl_float3 is identical in size, alignment and behavior to cl_float4. See section 6.1.5. */
typedef cl_float4 cl_float3;
IMHO this is the wrong way to do it...
Accessing w needn't give an error, really. What matters is that functions behave correctly when you pass them a float3. For example, a dot product on a float3 is not the same as a dot product on a float4. If w is initialised to 0 this isn't really a problem, barring marginally inefficient code generation. As long as the data is treated as a true float3 once it is on the device, having the host code deal correctly with the alignment characteristics is the most important thing, surely?
How else would you deal with it? Create a host structure containing x, y, z, _padding? Or just have x, y, z and hope the compiler's alignment guarantees hold (which is something I don't have much faith in)?
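
For what it's worth, here is a minimal sketch of the explicit-padding option in plain C11, just to make the trade-off concrete. The names host_float3, dot3 and dot4 are made up for illustration and are not part of the SDK or of cl_platform.h. It also assumes a 4-byte float, which the static checks would catch if it were wrong.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignas, alignof (C11) */
#include <stddef.h>   /* offsetof */

/* Hypothetical host-side mirror of a device float3: three used lanes plus
   explicit padding, so size and alignment match a 4-float vector. */
typedef struct {
    alignas(16) float x;
    float y;
    float z;
    float _padding; /* plays the role of w; keep it at 0 */
} host_float3;

/* Check the layout at compile time instead of hoping the compiler agrees. */
static_assert(sizeof(host_float3) == 16, "host_float3 must occupy 16 bytes");
static_assert(alignof(host_float3) == 16, "host_float3 must be 16-byte aligned");
static_assert(offsetof(host_float3, z) == 8, "z must sit at byte offset 8");

/* Three-component dot product: ignores the padding lane entirely. */
static float dot3(host_float3 a, host_float3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Four-component dot product: matches dot3 only when the product of the
   padding lanes is 0, e.g. when at least one of them is 0. */
static float dot4(host_float3 a, host_float3 b) {
    return dot3(a, b) + a._padding * b._padding;
}
```

With a = {1, 2, 3, 0} and b = {4, 5, 6, 0}, both functions agree. As soon as both padding lanes hold non-zero values, dot4 picks up the extra term, which is exactly the float3 versus float4 difference mentioned above. The static checks turn the "hope the compiler gets alignment right" case into a build failure rather than a silent mismatch.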