I've implemented a texture synthesis algorithm in Brook+ that works fine on 64x64 textures. However, with anything larger, the screen goes black after a while and the display driver (atikmdag) crashes.
The kernel already takes about 7 seconds to complete at 64x64, mostly because of a rather long while loop that I unfortunately cannot avoid.
So what is causing this? I've used 4096x4096 streams in Brook before without problems, but those kernels were a lot faster. My guess is that this kernel simply runs too long.
Should I be splitting the data into smaller pieces? Or do I need to work on kernel performance some more? I could post the code if that would make things clearer, but it's quite lengthy.
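To make the "pieces" idea concrete, here is a rough host-side sketch of what I have in mind, written in Python only to show the control flow. It assumes each tile can be processed on its own, which may not hold for texture synthesis if a pixel needs neighbours outside its tile. `process_tile` is a hypothetical stand-in for one kernel launch over a sub-rectangle and is not a Brook+ call. Whether shorter launches would actually avoid the crash is exactly what I don't know.

```python
def split_into_tiles(width, height, tile):
    """Yield (x, y, w, h) rectangles that together cover a width x height domain."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Clamp the last tile in each row/column to the domain edge.
            yield (x, y, min(tile, width - x), min(tile, height - y))


def run_in_tiles(width, height, tile, process_tile):
    """Call process_tile once per rectangle and return the number of launches."""
    launches = 0
    for rect in split_into_tiles(width, height, tile):
        # Hypothetical: one short kernel launch per tile instead of one long one.
        process_tile(*rect)
        launches += 1
    return launches


if __name__ == "__main__":
    covered = []
    n = run_in_tiles(256, 256, 64, lambda x, y, w, h: covered.append(w * h))
    print("launches:", n, "pixels covered:", sum(covered))
```

The trade-off would be many short launches instead of one long one, so the tile size would have to be tuned against the per-launch overhead.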
My system is Vista 64-bit with the 9.10 drivers and an ATI 3870 X2.
All advice would be welcome. Thanks in advance!