2 Replies Latest reply on Mar 18, 2010 12:07 AM by windy96

    Measuring Kernel performance only in multi-core CPU environment

    windy96

      I want to run OpenCL in a multi-core CPU environment.  Because OpenCL is architecture-independent, it should run there, and I did get it running.

      However, I ran into trouble when evaluating performance and examining individual performance metrics.  I used CPU performance counters, but I cannot separate the time consumed by the OpenCL runtime from the time consumed by my own kernel.  I also cannot tell which of the two the cache misses belong to.

      Is there any way to get data for the kernel only?

        • Measuring Kernel performance only in multi-core CPU environment
          omkaranathan

           


          Which counter are you using?

          It's better to use the profiling commands provided by OpenCL to measure exact timings. Refer to Section 5.9 of the OpenCL specification for more details.
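          As a rough illustration of that suggestion, here is a minimal sketch in C. It assumes the command queue was created with CL_QUEUE_PROFILING_ENABLE and that the kernel's event has already completed (for example after clWaitForEvents). The names cmd_timestamps, cmd_breakdown, split_command_time, read_timestamps and the HAVE_OPENCL build flag are hypothetical and chosen only for this example; clGetEventProfilingInfo and the four CL_PROFILING_COMMAND_* queries are the actual OpenCL API. Treating QUEUED to START as runtime overhead is a simplification rather than a precise accounting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the four per-command timestamps that
   OpenCL event profiling reports, all in nanoseconds. */
typedef struct {
    uint64_t queued;  /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;  /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;   /* CL_PROFILING_COMMAND_START  */
    uint64_t end;     /* CL_PROFILING_COMMAND_END    */
} cmd_timestamps;

/* Hypothetical result: time before the kernel started versus
   time spent executing it. */
typedef struct {
    uint64_t runtime_overhead_ns;  /* queued -> start */
    uint64_t kernel_exec_ns;       /* start  -> end   */
} cmd_breakdown;

/* Split one command's lifetime into the two intervals above.
   Returns 0 on success, -1 if the timestamps are out of order. */
int split_command_time(const cmd_timestamps *t, cmd_breakdown *out)
{
    assert(t != NULL && out != NULL);
    if (t->queued > t->submit || t->submit > t->start || t->start > t->end)
        return -1;
    out->runtime_overhead_ns = t->start - t->queued;
    out->kernel_exec_ns = t->end - t->start;
    return 0;
}

#ifdef HAVE_OPENCL  /* hypothetical flag: define it when building against OpenCL */
#include <CL/cl.h>

/* Read the four timestamps of a completed event.
   Returns CL_SUCCESS, or the first error code reported by the runtime. */
cl_int read_timestamps(cl_event ev, cmd_timestamps *out)
{
    const cl_profiling_info names[4] = {
        CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT,
        CL_PROFILING_COMMAND_START,  CL_PROFILING_COMMAND_END
    };
    cl_ulong values[4];
    for (int i = 0; i < 4; ++i) {
        cl_int err = clGetEventProfilingInfo(ev, names[i], sizeof(cl_ulong),
                                             &values[i], NULL);
        if (err != CL_SUCCESS)
            return err;
    }
    out->queued = values[0];
    out->submit = values[1];
    out->start  = values[2];
    out->end    = values[3];
    return CL_SUCCESS;
}
#endif
```

          With an event returned by clEnqueueNDRangeKernel, one would call read_timestamps and then split_command_time; kernel_exec_ns is then the figure for the kernel alone. This gives timing only and says nothing about cache misses.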

            • Measuring Kernel performance only in multi-core CPU environment
              windy96

               


              Thank you for your answer.  I was using Intel VTune to read the CPU performance counters.

              I am still curious about the cache miss ratio.  Maybe I should use both OpenCL profiling and CPU performance counters.  Hmm...  I'm afraid the two profiling methods might interfere with each other.