My institute has invested in a 'small' GPU cluster based on Radeon GPUs, in the hope of doing some rough OpenCL computations. I have tried both Ubuntu 13.04 and Ubuntu 12.10, and both show the same faulty behavior: the default adapter gets lost almost instantly.
When I first hit the issue, I was running watch -n 0.5 'aticonfig --adapter=ALL --odgc' and it worked fine, but as soon as I started Firefox, it broke with "Maximum number of clients reached". When I closed Firefox, everything worked again. Then, after a lot of fiddling with the configuration (both HW and SW), I got to a point where all 4 cards were visible after boot in both aticonfig and lspci. I then ran clinfo, which immediately complained that the maximum number of clients had been reached, and after that I practically lost my default adapter. The terminal output is attached as a file.
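To make the "all 4 cards visible after boot" check repeatable, here is a minimal sketch of how I would count the cards from lspci output. The function names and the expected count of 4 are my own choices for illustration, and the vendor strings it matches are an assumption about how lspci labels these cards on a given system, so they may need adjusting.

```python
import subprocess

# Assumption: lspci labels the cards with one of these vendor strings.
VENDOR_MARKERS = ("advanced micro devices", "ati technologies")
CLASS_MARKERS = ("vga compatible controller", "display controller")


def count_radeon_adapters(lspci_text):
    """Count lspci lines that look like AMD/ATI display adapters."""
    count = 0
    for line in lspci_text.splitlines():
        lower = line.lower()
        is_display = any(marker in lower for marker in CLASS_MARKERS)
        is_amd = any(marker in lower for marker in VENDOR_MARKERS)
        if is_display and is_amd:
            count += 1
    return count


def check_after_boot(expected=4):
    """Run lspci and report whether the expected number of cards is listed."""
    output = subprocess.run(
        ["lspci"], capture_output=True, text=True, check=True
    ).stdout
    found = count_radeon_adapters(output)
    print(f"lspci lists {found} of {expected} expected adapters")
    return found == expected
```

Calling check_after_boot() once right after boot and once after running clinfo would show whether the cards disappear from the PCI bus or only from the driver's point of view.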
It would be nice if someone could shed some light on what might be going on, because we invested quite a lot in these machines, and right now I have no idea what the cause could be. 2 cards worked fine, 4 do not. I highly doubt it is power related: there are 2 x 1600 W PSUs inside that are aggregated together and effectively work as automatically redundant power supplies, so even one PSU alone should be able to power the machine, let alone two.
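As a rough sanity check of the power argument, here is the arithmetic as a small sketch. The 250 W per card is the commonly quoted board power for the HD 7970, not something I measured, and the 500 W for the rest of the system is a purely hypothetical allowance, so treat the result as an estimate only. It also says nothing about per-connector limits or short load spikes.

```python
GPU_COUNT = 4
GPU_BOARD_POWER_W = 250   # assumption: commonly quoted HD 7970 board power
REST_OF_SYSTEM_W = 500    # hypothetical allowance for CPUs, RAM, disks, fans
PSU_RATING_W = 1600       # rating of a single PSU, from the post above


def estimated_load_w(gpu_count=GPU_COUNT,
                     gpu_power_w=GPU_BOARD_POWER_W,
                     rest_w=REST_OF_SYSTEM_W):
    """Estimated full-load draw of the whole machine in watts."""
    return gpu_count * gpu_power_w + rest_w


def headroom_w(psu_count=1, psu_rating_w=PSU_RATING_W):
    """Remaining capacity in watts with the given number of PSUs active."""
    return psu_count * psu_rating_w - estimated_load_w()


if __name__ == "__main__":
    print("estimated load:", estimated_load_w(), "W")
    print("headroom on one PSU:", headroom_w(1), "W")
    print("headroom on two PSUs:", headroom_w(2), "W")
```

Under these assumptions a single PSU still covers the estimated load, which is why I do not think power is the problem.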
The computer is an ASUS ESC4000 G2/FDR with 4x HD 7970, the OS is Ubuntu 12.10 64-bit, and the drivers are Catalyst 13.4. Any suggestions?