So I just got a new development laptop with a FirePro M8900 to use as a reference for the 6970M GPUs we will be using in the field.
We have been using a 6850 desktop card, as this was the closest desktop card to the 6970M. I just launched my code and it took almost 3 minutes to compile, and that is on a top-of-the-line quad-core mobile i7 with a very fast SSD (which our desktop systems didn't have). I thought "okay, well, that is just the initial compile", but no. When I tried to launch again there was no improvement. On the desktop this same code has always taken longer to compile than seems necessary, but never to the point where I needed to look into it. This is unusably slow (for some of our purposes).
I will try binary kernels, but we are going to add code that dynamically creates and compiles OpenCL code to do load balancing based on current workloads, so binaries alone can't be the whole solution.
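
For the dynamically generated kernels, one option I am considering is to cache binaries keyed on a hash of the generated source, so that a given variant is only compiled once. Below is a minimal sketch of just the caching layer in plain C, with no OpenCL calls in it. The names `cache_path`, `cache_load`, `cache_store` and the `.clbin` extension are hypothetical, and the choice of FNV-1a as the hash is my own assumption. In real use the blob would presumably come from `clGetProgramInfo` with `CL_PROGRAM_BINARIES` after a successful `clBuildProgram`, and be fed back through `clCreateProgramWithBinary` on a cache hit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* FNV-1a 64-bit hash, continuing from a previous hash value h. */
uint64_t fnv1a(uint64_t h, const char *s)
{
    for (; *s; ++s) {
        h ^= (unsigned char)*s;
        h *= 0x100000001b3ULL;          /* FNV 64-bit prime */
    }
    return h;
}

/* Hypothetical helper: build a cache file name from everything that can
 * change the compiled binary (device name, driver version, build options,
 * kernel source). A newline separates the fields so they cannot run together. */
void cache_path(char *out, size_t n, const char *dir, const char *device,
                const char *driver, const char *options, const char *source)
{
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV 64-bit offset basis */
    assert(out != NULL && n > 0);
    h = fnv1a(fnv1a(h, device), "\n");
    h = fnv1a(fnv1a(h, driver), "\n");
    h = fnv1a(fnv1a(h, options), "\n");
    h = fnv1a(h, source);
    snprintf(out, n, "%s/%016llx.clbin", dir, (unsigned long long)h);
}

/* Hypothetical helper: return a malloc'd copy of the cached blob and set
 * *size, or return NULL if the file is missing, empty, or unreadable. */
unsigned char *cache_load(const char *path, size_t *size)
{
    unsigned char *buf = NULL;
    long len;
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;
    if (fseek(f, 0, SEEK_END) == 0 && (len = ftell(f)) > 0 &&
        fseek(f, 0, SEEK_SET) == 0) {
        buf = (unsigned char *)malloc((size_t)len);
        if (buf && fread(buf, 1, (size_t)len, f) != (size_t)len) {
            free(buf);
            buf = NULL;
        }
        if (buf)
            *size = (size_t)len;
    }
    fclose(f);
    return buf;
}

/* Hypothetical helper: write a blob to the cache. Returns 0 on success. */
int cache_store(const char *path, const unsigned char *data, size_t size)
{
    size_t written;
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    written = fwrite(data, 1, size, f);
    if (fclose(f) != 0 || written != size)
        return -1;
    return 0;
}
```

The idea would be to call `cache_path` with the strings from `clGetDeviceInfo` (`CL_DEVICE_NAME`, `CL_DRIVER_VERSION`), try `cache_load`, and only fall back to a full source build plus `cache_store` on a miss. This does nothing for the first compile of each generated variant, so it only helps if the load balancer ends up producing a limited set of variants.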