I have been using OpenCL for the last two months and understand the basics fairly well. I am working on an NVIDIA Quadro 410 card. At this point I can write simple kernels and good host programs.
Now I would like to know whether it is possible to fetch details about the cores of my card while my kernel is executing. I want to map each thread to the core on which it performed its operation. The card I am using has 192 CUDA cores (CUDA cores because it is NVIDIA). When I define the total number of threads and the number of threads per group, and these threads perform some operation, say matrix multiplication, is it possible to see which thread gets executed on which CUDA core? If yes, how?
It would be very helpful if you could answer this, as I have searched the internet and come up with nothing.