
Why is a single CPU core's performance on the same workload different in pthread and OpenCL?

Question asked by acekiller on Aug 21, 2014
Latest reply on Aug 21, 2014 by nou

Hi there! I have a very simple task: scan over a char array multiple times (16*1024). I implemented it with pthread, using one thread on one CPU core, and it takes 23 s. Then I used device fission to create a sub-device containing only one CPU compute unit (i.e., one CPU core), and there it takes only 17 s. I expected the OpenCL implementation to be slower than the pthread one, because C is closer to the hardware. Why do I get these results?
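For reference, the pthread side of such a test might look roughly like the sketch below. This is not the original code: the names `scan_job`, `scan_worker`, and `run_scan` are made up for illustration, and summing the bytes is only an assumed stand-in for whatever the real scan does with each element.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime under strict -std modes */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical job description; none of these names come from the post. */
typedef struct {
    const unsigned char *data;  /* array to scan */
    size_t len;                 /* number of bytes in the array */
    unsigned passes;            /* how many times to scan it */
    unsigned long long sum;     /* result, kept so the loop has an observable effect */
} scan_job;

/* Thread body: read every byte of the array `passes` times. */
static void *scan_worker(void *arg)
{
    scan_job *job = arg;
    unsigned long long sum = 0;

    for (unsigned p = 0; p < job->passes; ++p)
        for (size_t i = 0; i < job->len; ++i)
            sum += job->data[i];

    job->sum = sum;
    return NULL;
}

/* Run the scan on a single pthread.
 * Returns elapsed wall-clock seconds, or -1.0 if the thread could not be created. */
static double run_scan(scan_job *job)
{
    struct timespec t0, t1;
    pthread_t tid;

    assert(job != NULL && job->data != NULL);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (pthread_create(&tid, NULL, scan_worker, job) != 0)
        return -1.0;
    pthread_join(tid, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

To use it, fill in a `scan_job` with the buffer, its length, and `passes = 16 * 1024`, call `run_scan(&job)`, and print the returned time. Build with something like `cc -O2 -pthread`. The timing includes thread creation and join, which is negligible next to a run of several seconds.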