I have two OpenCL kernels: one is faster on NVIDIA, the other on AMD VLIW5.
After getting a new AMD GCN card, I noticed that the NVIDIA kernel is faster on GCN than the AMD one.
The question is: how can I detect the GCN architecture other than by parsing the GPU name? Is there something like
"compute capability" for AMD?
We have some upcoming extensions that will allow applications to query information about the device, such as VLIW width. For now, you can either detect by device name (only 3 GCN parts have been released so far), or run a quick perf test at init time to see which kernel is faster on the current device.
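
To make the two workarounds concrete, here is a rough sketch of both in Python. It is only an illustration, not an official method. The codenames in `GCN_NAME_HINTS` are my assumption about what `CL_DEVICE_NAME` returns for the current GCN parts, so check them against what your driver actually reports. The function names and the `variants` dictionary are hypothetical, and each callable is assumed to enqueue one kernel variant and block until it finishes (for example by calling `clFinish`).

```python
import time

# Assumption: substrings of CL_DEVICE_NAME for the GCN parts. Verify against
# your driver's output and extend the tuple as new parts ship.
GCN_NAME_HINTS = ("tahiti", "pitcairn", "capeverde")


def looks_like_gcn(device_name):
    """Return True if the device name contains a known GCN codename."""
    name = device_name.lower().replace(" ", "")
    return any(hint in name for hint in GCN_NAME_HINTS)


def _time_once(run, clock):
    """Return how long a single call to run() takes according to clock."""
    start = clock()
    run()
    return clock() - start


def pick_fastest(variants, repeats=3, clock=time.perf_counter):
    """Time each variant and return the label of the fastest one.

    variants maps a label to a zero-argument callable that launches the
    kernel and blocks until it has completed.
    """
    best_name, best_time = None, float("inf")
    for name, run in variants.items():
        run()  # warm-up call, so one-time setup costs are not measured
        elapsed = min(_time_once(run, clock) for _ in range(repeats))
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name
```

The name check is cheap but has to be updated for every new chip, while the timing test keeps working on hardware you have never seen, at the cost of a slightly longer startup. Taking the minimum over a few repeats is just one way to reduce noise from other work on the GPU.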