More information from running it through GPU PerfStudio:
CPU time 4.56ms
GPU time 71.32ms
PerfStudio reports the application as GPU bound, using 100% of the GPU.
However, the same application runs at 30fps+ on the same machine with a 4850 GPU.
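To put those two timings next to the 30fps figure, here is a rough bit of arithmetic in Python. It assumes CPU and GPU work overlap, so the slower stage sets the frame rate; the function name is only illustrative.

```python
# Frame times reported by GPU PerfStudio above, in milliseconds.
cpu_ms = 4.56
gpu_ms = 71.32


def fps_from_frame_ms(frame_ms: float) -> float:
    """Convert a per-frame time in milliseconds to frames per second."""
    return 1000.0 / frame_ms


# Assumption: CPU and GPU run in parallel, so the slower one bounds the frame rate.
bottleneck_ms = max(cpu_ms, gpu_ms)
fps = fps_from_frame_ms(bottleneck_ms)
budget_ms_30fps = 1000.0 / 30.0

print(f"bottleneck: {bottleneck_ms:.2f} ms per frame, about {fps:.1f} fps")
print(f"budget for 30 fps: {budget_ms_30fps:.2f} ms per frame")
print(f"GPU time is {gpu_ms / budget_ms_30fps:.1f}x over that budget")
```

So the GPU side is taking roughly twice the frame budget that the 4850 manages to stay inside.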
In the frame debugger, a 60,000-vertex piece of geometry takes 7.06ms of GPU time to draw. That is insane, as it's a solid colour with a reflection map: about 20 pixel shader instructions at most.
I'm guessing the 7.06ms is because it's falling back to software (vertex?) shading for some reason. Is there any way to determine this? PerfStudio seems to think it's GPU time.
I'm going to pull this card and get some metrics with the 4850 for comparison.
Turns out the card had popped slightly out of the slot because the SATA connectors are in the way. About 1mm of the gold edge connector was showing along the x16 portion of the PCIe slot, but not along the x1 portion.
Interestingly, there were no warnings at all. I only noticed because GPU-Z reported the link as PCIe x1 rather than x16; otherwise I was about to swap the PSU out in case there was insufficient power.
The slowdown makes perfect sense: apps that preloaded everything to VRAM or generated their content procedurally were fine because there was no bus traffic.
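A back-of-the-envelope sketch of why the link width matters here. The numbers are assumptions, not measurements from this machine: a PCIe 2.0 link (about 500 MB/s per lane per direction), a hypothetical 32 bytes per vertex, and the whole 60,000-vertex mesh crossing the bus every frame.

```python
# ASSUMPTIONS (not from the post): PCIe 2.0 link, 32 bytes per vertex,
# and the whole mesh being sent over the bus each frame.
PCIE2_MB_PER_S_PER_LANE = 500.0  # 5 GT/s with 8b/10b encoding, per direction
BYTES_PER_VERTEX = 32            # hypothetical: position + normal + UV as floats
VERTEX_COUNT = 60_000            # from the frame debugger figure above


def transfer_ms(num_bytes: int, lanes: int) -> float:
    """Ideal time in ms to move num_bytes over a PCIe 2.0 link with `lanes` lanes."""
    bytes_per_s = PCIE2_MB_PER_S_PER_LANE * 1_000_000 * lanes
    return num_bytes / bytes_per_s * 1000.0


mesh_bytes = VERTEX_COUNT * BYTES_PER_VERTEX
for lanes in (1, 16):
    print(f"x{lanes}: {transfer_ms(mesh_bytes, lanes):.2f} ms for {mesh_bytes / 1e6:.2f} MB")
```

Under those assumptions the transfer alone costs a few milliseconds at x1 and a fraction of a millisecond at x16, which is at least the right order of magnitude for a draw call that looked far too slow. Real throughput is lower than the ideal figure because of protocol overhead.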
Perhaps a feature for future Catalyst installations would be to detect when a card that should be capable of x16 can only achieve x1, and warn about it?
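For illustration only, this is roughly what such a check could look like on Linux, where the kernel exposes `current_link_width` and `max_link_width` for PCIe devices under sysfs. It is not how Catalyst or GPU-Z work, and the function name `find_degraded_links` is made up for this sketch.

```python
from pathlib import Path


def find_degraded_links(sysfs_root: str = "/sys/bus/pci/devices"):
    """Return (address, current_width, max_width) for devices below their max link width."""
    degraded = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return degraded  # not Linux, or sysfs is not mounted
    for dev in sorted(root.iterdir()):
        try:
            current = int((dev / "current_link_width").read_text().strip())
            maximum = int((dev / "max_link_width").read_text().strip())
        except (OSError, ValueError):
            continue  # not a PCIe device, or the attribute is unreadable
        # A reported width of 0 means "unknown", so it is skipped rather than flagged.
        if 0 < current < maximum:
            degraded.append((dev.name, current, maximum))
    return degraded


if __name__ == "__main__":
    for address, current, maximum in find_degraded_links():
        print(f"warning: {address} is running at x{current}, capable of x{maximum}")
```

A hit is not always a fault: an x16 card in a slot that is only wired for x4 will be flagged too, and some devices renegotiate the link when idle, so it is worth checking while the card is under load.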