So I just got a new development laptop with a FirePro M8900 to use as a reference for the 6970M processors we will be using in the field.
We have been using a 6850 desktop card, as this was the closest desktop card to the 6970M. I just launched my code and it took almost 3 minutes to compile. This is on a top-of-the-line quad-core mobile i7 with a very fast SSD (which our desktop systems didn't have). I thought "okay, well, that is just the initial compile", but no. When I tried to launch again there was no improvement. On the desktop this same code has always taken longer than seems necessary, but never to the point where I needed to look into it. This is unusably slow (for some of our purposes).
I will try binary kernels, but we are going to add code that dynamically creates and compiles OpenCL code to do load balancing based on current workloads, so binary kernels can't be the only solution.
The kernel compilation time should not depend on the card. The kernel compilation will happen on the CPU.
Originally posted by: himanshu.gautam The kernel compilation time should not depend on the card.
That doesn't make any sense. Of course the compiler takes different code paths for different GPUs: they have different architectures and different memory subsystems; they are entirely different. I have confirmed a similar issue with a laptop taking longer to compile despite being a much stronger machine than the two desktop systems we are working on. Does anyone from AMD have any suggestions or recommendations?
Hello, you should know that NVIDIA caches the binary for you. When you ask a second time for the compilation of your kernel source on an NVIDIA GPU, the driver loads it from the cached binary by itself. On Win7, the NVIDIA cache usually hides in C:/USERS/%YOURACCOUNT%/AppData/Roaming/NVIDIA/ComputeCache/. There you will find a lot of folders named 0 to f. Look in one of them and, after a second level of folders, you will find files with MD5-like (but shorter) names. Open one of them in a text editor and you will see PTX (CUDA's intermediate code) inside!
Unfortunately, I think AMD does not cache the binary the way NVIDIA does, so you have to do it yourself. I hope AMD will soon do the caching by itself.
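For what it's worth, here is a minimal sketch of what "doing it yourself" could look like. It only shows the caching logic: the compiled binary is stored on disk under a hash of the kernel source, device name, driver version, and build options, so a change to any of them forces a rebuild. The names `cache_key`, `load_or_build`, and `build_fn` are made up for illustration. In a real OpenCL host, `build_fn` would presumably wrap `clBuildProgram` followed by `clGetProgramInfo` with `CL_PROGRAM_BINARIES`, and a cache hit would be fed to `clCreateProgramWithBinary`.

```python
import hashlib
import os
import tempfile


def cache_key(source, device_name, driver_version, build_options=""):
    """Hash everything that can change the compiled binary (hypothetical helper)."""
    h = hashlib.sha256()
    for part in (source, device_name, driver_version, build_options):
        data = part.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()


def load_or_build(cache_dir, source, device_name, driver_version,
                  build_options, build_fn):
    """Return (binary, was_cached).

    build_fn(source, build_options) -> bytes is an assumed callback that
    performs the real (slow) compile and returns the device binary.
    """
    os.makedirs(cache_dir, exist_ok=True)
    key = cache_key(source, device_name, driver_version, build_options)
    path = os.path.join(cache_dir, key + ".bin")

    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read(), True

    binary = build_fn(source, build_options)

    # Write to a temp file first, then rename, so a crash mid-write
    # never leaves a truncated binary in the cache.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(binary)
    os.replace(tmp_path, path)
    return binary, False
```

This does not help with kernels whose source is generated fresh on every run, since each new source string produces a new key and a full compile.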
Originally posted by: Rom1 Hello, you should know that NVIDIA caches the binary for you. When you ask a second time for the compilation of your kernel source on an NVIDIA GPU, the driver loads it from the cached binary by itself. On Win7, the NVIDIA cache usually hides in C:/USERS/%YOURACCOUNT%/AppData/Roaming/NVIDIA/ComputeCache/. There you will find a lot of folders named 0 to f. Look in one of them and, after a second level of folders, you will find files with MD5-like (but shorter) names. Open one of them in a text editor and you will see PTX (CUDA's intermediate code) inside!
Unfortunately, I think AMD does not cache the binary the way NVIDIA does, so you have to do it yourself. I hope AMD will soon do the caching by itself.
While this is true, it has nothing to do with my issue. My issue is that the difference in compile time between the desktop and mobile cards is very large.