
windy96
Journeyman III

Measuring Kernel performance only in multi-core CPU environment

I want to run OpenCL in a multi-core CPU environment. Because OpenCL is architecture-independent, it should run there, and I did get it running.

However, I ran into trouble when evaluating performance and looking at individual performance factors. I used the CPU performance counters, but I cannot tell how much time is spent in the OpenCL runtime and how much in my own kernel. I also cannot tell which of the two the cache misses belong to.

Is there any way to get data for the kernel alone?

2 Replies
omkaranathan
Adept I

Which counter are you using?

It's better to use the profiling commands provided by OpenCL to measure exact timings. Refer to Section 5.9 of the OpenCL spec for more details.
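
As a rough illustration of the profiling commands mentioned above, here is a minimal sketch in C. It assumes the command queue was created with `CL_QUEUE_PROFILING_ENABLE` and that the event was returned by `clEnqueueNDRangeKernel`. The helper names `elapsed_ms` and `kernel_time_ms` and the `HAVE_OPENCL` build flag are made up for this example; only the `cl*` calls and `CL_*` constants come from the OpenCL API. The OpenCL part is guarded so that the arithmetic can be compiled and checked without an OpenCL SDK.

```c
#include <stdint.h>

/* OpenCL profiling timestamps are device time in nanoseconds.
 * Hypothetical helper: convert a start/end pair to milliseconds. */
double elapsed_ms(uint64_t start_ns, uint64_t end_ns)
{
    return (double)(end_ns - start_ns) / 1.0e6;
}

#ifdef HAVE_OPENCL /* hypothetical flag: cc -DHAVE_OPENCL file.c -lOpenCL */
#include <CL/cl.h>

/* Hypothetical helper: execution time of one enqueued kernel in ms,
 * or -1.0 on error. `ev` must belong to a queue that was created with
 * CL_QUEUE_PROFILING_ENABLE. */
double kernel_time_ms(cl_event ev)
{
    cl_ulong start = 0, end = 0;

    /* Profiling data is only available once the command has completed. */
    if (clWaitForEvents(1, &ev) != CL_SUCCESS)
        return -1.0;

    /* Time at which the kernel started executing on the device. */
    if (clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_START,
                                sizeof(start), &start, NULL) != CL_SUCCESS)
        return -1.0;

    /* Time at which the kernel finished executing on the device. */
    if (clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_END,
                                sizeof(end), &end, NULL) != CL_SUCCESS)
        return -1.0;

    return elapsed_ms((uint64_t)start, (uint64_t)end);
}
#endif
```

The START/END interval covers only the kernel's execution, so comparing it with a wall-clock measurement taken around the enqueue and wait calls should give a rough idea of how much time goes to the runtime. This yields timings only; it does not report cache misses.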


Originally posted by: omkaranathan

Which counter are you using?

It's better to use the profiling commands provided by OpenCL to measure exact timings. Refer to Section 5.9 of the OpenCL spec for more details.

Thank you for your answer. I was using Intel VTune to read the CPU performance counters.

I am still curious about the cache miss ratio. Maybe I should use both OpenCL profiling and the CPU performance counters. Hmm... I'm worried that the two profiling methods might interfere with each other.
