Thanks for the information, Ceq. Indeed, double2 works properly but double4 does not.
I was afraid it would be something like that.
I have no idea whether it would be unreasonably difficult for AMD to implement, but
if they could abstract the 128-bit limit away, it would be great for scientific programmers
who have programmed and tested something in single precision, using float4, and who then
want to go to double precision as painlessly as possible!