I am running my OpenCL code in both debug and release mode under Visual Studio, and the OpenCL kernel takes the same time to run in each. I am wondering whether something is wrong with my settings. I checked the compiler options and see that -O2 is on, but I am not sure whether this is relevant to the OpenCL code or only to the application's C/C++ code. How can I tell whether the kernel is optimized?
"-O2" performs optimizations that the compiler developers considered the best combination for compilation speed and runtime performance. So the binary may run faster, which could be reflected in the output of the profiler.
The problem is that the debug and release (with /O2) versions run at the same speed. Is the profiler supposed to always be run in debug mode? All the runs take the same time: debug and release mode without the profiler, and with the profiler.
How about using an option (-O3?) that produces the fastest binary and seeing whether there is any difference? BTW, there are posts on the forum saying that the profiler is not always reliable, so maybe you need to insert a timer manually.
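A manual timer could look roughly like the sketch below. It is only an illustration: `average_ms` is a made-up helper name, not part of OpenCL or any profiler. The one OpenCL-specific point is that kernel enqueues are asynchronous, so the timed work has to wait for the queue to drain (for example with clFinish) or it will only measure the enqueue call.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not part of OpenCL): returns the average wall-clock
// time in milliseconds of `reps` calls to `work`.
// When timing a kernel, `work` should enqueue it with clEnqueueNDRangeKernel
// and then call clFinish on the same queue, because the enqueue returns
// before the kernel has finished running.
inline double average_ms(const std::function<void()>& work, int reps = 10) {
    using clock = std::chrono::steady_clock;
    if (reps < 1) reps = 1;           // avoid dividing by zero
    work();                           // warm-up run, excluded from the timing
    const auto start = clock::now();
    for (int i = 0; i < reps; ++i) {
        work();
    }
    const auto stop = clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / reps;
}
```

You would call it with a lambda that enqueues your kernel and then blocks on the queue, once per build you want to compare.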
The -O2 you set in VS is not applied to the OpenCL kernel, because the host code and the kernel are built by different compilers. Whether you use a debug or release build, the clBuildProgram() API runs the same way, which means it generates the same kernel binary. By default, the OpenCL kernel is compiled with optimizations enabled. You can pass "-cl-opt-disable" as the options argument of clBuildProgram(); then you may see a difference on the GPU.
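If you want the debug build to actually use an unoptimized kernel, one way is to pick the build options on the host side. This is a minimal sketch, not the only way to do it: both function names are hypothetical, and it assumes the usual Visual Studio behavior of defining _DEBUG in Debug configurations.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: chooses the options string for clBuildProgram().
// "-cl-opt-disable" is a standard OpenCL build option that disables kernel
// optimizations; an empty string keeps the default, optimized build.
inline std::string kernel_build_options(bool disable_optimizations) {
    return disable_optimizations ? "-cl-opt-disable" : "";
}

// Assumption: the host project defines _DEBUG in Debug configurations,
// as Visual Studio does by default.
inline std::string kernel_build_options_for_this_build() {
#ifdef _DEBUG
    return kernel_build_options(true);   // Debug: unoptimized kernel
#else
    return kernel_build_options(false);  // Release: default optimizations
#endif
}
```

The resulting string goes in the fourth argument, e.g. clBuildProgram(program, 1, &device, opts.c_str(), nullptr, nullptr), where program and device are your existing handles. Timing the kernel with and without the option shows how much the kernel compiler's optimizations matter for your code.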