I was going to try out some OpenCL 2.0 features and ran into two strange issues after adding the -cl-std=CL2.0 flag:
- The preprocessing stage fails and the compiler complains about missing includes; the same code compiled just fine with CL1.2.
- After copying the missing files to /tmp to satisfy the compiler, I got a performance drop of about 25-35% from just adding the flag, without changing anything else. Is there something in the OpenCL 2.0 spec or in AMD's implementation that (in)directly affects the performance of 1.2 code? Or is this just a bug?