Although OpenCL 1.2 and 2.0 syntax is very similar, the two differ significantly in both language semantics and compiler internals. There are at least two big factors that might affect kernel performance if 2.0 syntax is forced on a 1.2 source:
- An unqualified pointer passed to a function is treated as generic in 2.0 but as private in 1.2. It is a good idea to complete function declarations so that private pointer arguments are explicitly marked with the __private qualifier. That makes the source compatible with both 1.2 and 2.0 syntax.
- OpenCL 2.0 supports non-uniform work-groups, so the expansion of get_local_size() becomes substantially bigger than with 1.2. You can mitigate the impact by setting the OpenCL 2.0 specific option -cl-uniform-work-group-size, although it does not remove all issues in the current release; this will be improved in the future. Meanwhile, you can achieve better results by using the kernel attribute reqd_work_group_size if the work-group size is known.
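
As a rough illustration of both points, here is a minimal sketch of a kernel source intended to behave the same way under -cl-std=CL1.2 and -cl-std=CL2.0. The names add_scaled and scale_add and the work-group size of 64 are made up for the example. The small `#ifndef __OPENCL_VERSION__` shim is only an assumption to let the helper be compiled and checked as plain C on the host; an OpenCL compiler skips it.

```c
/* Host-only shim (assumption, for testing): an OpenCL compiler defines
   __OPENCL_VERSION__, so it never sees this block. */
#ifndef __OPENCL_VERSION__
#define __private
#endif

/* Hypothetical helper. The explicit __private qualifier keeps `acc` in the
   private address space under both 1.2 and 2.0; without it, a 2.0 compile
   would treat the pointer as generic. */
void add_scaled(__private float *acc, float x, float k)
{
    *acc += k * x;
}

#ifdef __OPENCL_VERSION__
/* The work-group size is fixed at compile time (64 is an illustrative value),
   so the compiler does not have to account for non-uniform work-groups.
   The host must then enqueue the kernel with a local size of exactly 64. */
__kernel __attribute__((reqd_work_group_size(64, 1, 1)))
void scale_add(__global const float *in, __global float *out, float k)
{
    size_t gid = get_global_id(0);
    float acc = out[gid];          /* private variable */
    add_scaled(&acc, in[gid], k);  /* address of a private variable */
    out[gid] = acc;
}
#endif
```

If the work-group size is not known at compile time, the fallback mentioned above is to pass -cl-uniform-work-group-size together with -cl-std=CL2.0 in the build options.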
Other than that, the compiler is now really different internally for 1.2 and 2.0, so for a specific source there can be differences in performance and behavior in both directions.