pwvdendr
Adept II

generic drivers for windows? or anything that actually works?

I'm having trouble with the Windows drivers provided by AMD. I have now built two machines: let's call them Alpha (1xHD5450, 3xHD6990) with 7 GPUs, and Beta (8xHD7970) with 8 GPUs. I got Alpha working under Windows after some trouble; I still haven't got Beta working under Windows, so any help is appreciated.

When building Alpha, every time with a 64-bit windows 7 professional,

  • the first time I installed the entire catalyst center, with default options, on an existing windows. Result: BSOD whenever I login, consistently.
  • the second time (on a new vanilla windows) I installed only graphics drivers + APP SDK. Result: OpenCL sees only 5 Cayman GPUs (out of 6), spread over all 3 cards.
  • the third time (again on a new vanilla windows) I did exactly the same, but with all windows updates installed first (or the other way round; I don't remember the exact order of the second and third attempts). Now OpenCL sees only 3 out of 6 Cayman GPUs (so the behaviour is even random?)
  • the fourth time (again on a new vanilla windows) I did not install the AMD graphics drivers at all. Instead, I let windows update install their graphics driver and installed only the APP SDK. Now, all GPUs worked as expected! Hurray for windows updates, boo for AMD.

Now I'm building Beta, still every time with a 64-bit windows 7 professional,

  • the first time (on a new vanilla windows) I install the newest Catalyst package from the website. Result: BSOD whenever I login, consistently. Yup, exactly the same scenario as with Alpha. Seems like the drivers are the only thing Alpha and Beta have in common, so I'm going to point a finger that way.
  • the second time (again on a new vanilla windows) I let windows update everything and I installed only the graphics driver + APP SDK. Result: OpenCL sees only 5 Tahiti GPUs (out of 8). Looks familiar from the Alpha problems, eh?
  • the third time (again on a new vanilla windows) I installed just the graphics drivers + APP SDK. Result: OpenCL sees only 5 Tahiti GPUs (out of 8). Looks familiar from the Alpha problems, eh?

Alas, Windows Update does not offer me any HD7970 drivers. So the fourth approach, which solved it for Alpha, doesn't work here. Under Linux (Ubuntu 12.04), OpenCL sees all 8 cards, so there's no hardware problem. It's purely a problem with the AMD graphics drivers for windows (or at least win7 64-bit).

It's also not a problem with the APP SDK, since the problems arise already when just installing the graphics drivers. Looking in the device manager, it lists 5 HD7970 and 3 standard VGA display adapters.
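To make the Device Manager symptom easy to re-check after each reinstall, here is a minimal Python sketch that tallies adapter names and counts how many fell back to the generic VGA driver. The adapter name strings and the helper names `tally_adapters` and `count_fallback` are assumptions for illustration only; on a real machine the names would have to come from the system itself (for example from the output of `wmic path win32_VideoController get name`), and the exact strings may differ.

```python
from collections import Counter


def tally_adapters(names):
    """Count how often each (non-empty) adapter name appears."""
    return Counter(name.strip() for name in names if name.strip())


def count_fallback(tally, markers=("standard vga",)):
    """Count adapters whose name suggests the generic fallback driver."""
    return sum(
        count
        for name, count in tally.items()
        if any(marker in name.lower() for marker in markers)
    )


# Hypothetical listing mirroring the 5 + 3 split described above;
# the exact name strings are assumed, not taken from the machine.
sample = (
    ["AMD Radeon HD 7900 Series"] * 5
    + ["Standard VGA Graphics Adapter"] * 3
)

tally = tally_adapters(sample)
for name, count in tally.items():
    print(f"{count} x {name}")
print("adapters on the fallback driver:", count_fallback(tally))
```

Running the same tally after each driver install would show at a glance whether the number of adapters stuck on the fallback driver changed.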

I'm out of ideas here. Are there any generic drivers for windows, like the linux people have made for linux? Or do you have any other suggestions for me to make all 8 cards work under windows?

0 Likes
10 Replies
pwvdendr
Adept II

Oh, this might be worthwhile: when trying to disable and re-enable the non-recognized device through the Device Manager, another BSOD came, but this time it did not automatically reboot, so I could take a picture of it: http://dl.dropbox.com/u/3060536/bsod.jpg

Seems atikmdag.sys is the culprit. Hope this helps in debugging the drivers.

0 Likes

Please uninstall all the driver files with Driver Sweeper from Guru3D

http://downloads.guru3d.com/Guru3D---Driver-Sweeper-%28Setup%29_d1655.html#download

This tool is very helpful and really cleans everything. It is always possible that some files remain even if you uninstall the driver with the AMD uninstall function.

Please report whether it helped or not.

EDIT:

You can also try this:

Start the system in secured mode

press: windows+R key

enter: services.msc

disable "ATI External Event Utility"

restart the system and see if the problem is solved

And the last possibility is that your DDR3 RAM is broken. Check the memory with the memtest from Linux; if I remember right, memtest86+ is the name.

You can also remove all DIMMs, and use only one.

EDIT2:

Ok another possible solution is to reduce the Core/RAM clocks for the GPUs.

How big is your power supply?

EDIT3:

Ok and one more

http://board.zuxxez.com/showthread.php?t=35435

The atikmdag.sys is a file from MS as far as I can see. As I understand it, it handles the response from the GPU. I remember that I had a problem when I ran an OpenCL program on the GPU that took longer than 2 seconds: the driver got reset.

Just try to change the values in the registry. Perhaps it helps, because you have so many GPUs and the system is not fast enough to get a response from the cards in time.

0 Likes

Skysnake wrote:

Please uninstall all the driver files with Driver Sweeper from Guru3D

http://downloads.guru3d.com/Guru3D---Driver-Sweeper-%28Setup%29_d1655.html#download

This tool is very helpful and really cleans everything. It is always possible that some files remain even if you uninstall the driver with the AMD uninstall function.

Please report whether it helped or not.

Uninstall, and then do what? I have tried it on a clean vanilla install of windows, several times. So I don't get what 'just' uninstalling them would help.

Start the system in secured mode

press: windows+R key

enter: services.msc

disable "ATI External Event Utility"

restart the system and see if the problem is solved

You mean safe mode? What exactly would I be disabling here?

I will reinstall windows and try this, once I have time.

And the last possibility is that your DDR3 RAM is broken. Check the memory with the memtest from Linux; if I remember right, memtest86+ is the name.

You can also remove all DIMMs, and use only one.

There are no memory problems. It's a problem with the windows drivers provided by AMD.

EDIT2:

Ok another possible solution is to reduce the Core/RAM clocks for the GPUs.

How big is your power supply?

I have roughly 300W available per GPU, that's really not the problem. Under linux, I can run my computations overclocked to 1150 MHz without any problem.

EDIT3:

Ok and one more

http://board.zuxxez.com/showthread.php?t=35435

The atikmdag.sys is a file from MS as far as I can see. As I understand it, it handles the response from the GPU. I remember that I had a problem when I ran an OpenCL program on the GPU that took longer than 2 seconds: the driver got reset.

Just try to change the values in the registry. Perhaps it helps, because you have so many GPUs and the system is not fast enough to get a response from the cards in time.

Indeed, the watchdog timer kills a process after 2 seconds, but it's not clear to me how this could ever help the problem that only 5 out of 8 cards are being recognized. But I will try, thanks.
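For reference, a minimal sketch of what such a registry change could look like, assuming the values meant are the Windows Timeout Detection and Recovery (TDR) settings, where `TdrDelay` under the `GraphicsDrivers` key is the timeout in seconds (default 2). The linked thread may describe different values; the helper name `tdr_reg_file` and the 10-second delay are purely illustrative choices. The script only generates the text of a `.reg` file and does not touch the registry itself.

```python
# Registry key that holds the TDR settings on Windows Vista and later.
TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_reg_file(delay_seconds):
    """Return .reg file text that sets TdrDelay to delay_seconds."""
    if not 0 < delay_seconds <= 0xFFFFFFFF:
        raise ValueError("delay must fit in a positive 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{TDR_KEY}]",
        # REG_DWORD values are written as 8 hexadecimal digits.
        f'"TdrDelay"=dword:{delay_seconds:08x}',
        "",
    ]
    return "\r\n".join(lines)


# Illustrative value only: raise the timeout from 2 to 10 seconds.
print(tdr_reg_file(10))
```

The output could be saved as a `.reg` file and imported with administrator rights; a reboot is normally needed before the new timeout applies. As noted above, this only affects how long a GPU may stay unresponsive before the driver is reset, so it may well have no effect on how many cards are detected.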

0 Likes

That is just everything I found.

And of course your memory should be OK, but I read that this kind of failure can happen because of memory problems. I don't know how big the differences are between Linux and Windows in handling such problems.

I hope something helps. If not, it will become really, really, really hard to solve.

Let me know if you need more help. I also have a contact at Microsoft, so perhaps this could help.

0 Likes

Let me know if you need more help. I also have a contact at Microsoft, so perhaps this could help.

Actually, that might be very handy. What solved it for the other machine (with 1x HD5450 and 3x HD6990) was that Windows Update provided the graphics drivers, instead of AMD Catalyst. The drivers installed by Windows Update did recognize all GPUs flawlessly out of the box.

So it would be very helpful if you could gain more information on the difference between both machines I described:

  • Why does Windows Update automatically recognize and install drivers for HD6990, and not for HD7970?
  • Is it only a matter of time (too new card) before Windows Update will provide drivers for HD7970? Or do they only do this for dual GPUs? Or...?
  • Could it ever be relevant that I used a larger-resolution screen for the first machine? I'm not sure if this could ever make a difference in triggering Windows Updates to install these drivers.
0 Likes

Phew, you have really hard questions

As I told you, as far as I know, the file with the problem is a Microsoft file. So it could be that Windows updates this file when it loads the default driver, and when you install Catalyst, this is not done.

If you want, I can call or send a mail to my Microsoft contact.

But please give me as much information as possible.

0 Likes

Skysnake wrote:

Phew, you have really hard questions

Of course, they are unsolvably hard for us. But perhaps someone with internal information from MS may be able to answer them. And yes, it is very well possible that the problematic file gets updated, but then it stays a mystery to me why this update was triggered on the first machine and not on the second.

If you want, I can call or send a mail to my Microsoft contact.

But please give me as much information as possible.

That would be very useful. What more information do you want? I can give you my email address if you want a faster reply than in this topic. But other than what is listed here, I wouldn't know what to add.

(Except maybe that the motherboard is an MSI Marshal Big Bang (B3) with chipset/cpu/ethernet drivers installed.)

0 Likes

Yeah, that would be helpful. You can find my e-mail address in my profile information. Just send me a short mail, so that I have your address and we can exchange information faster.

I will call my contact later and ask if they can help.

EDIT:

OK, the mail to MS is done. I will post a note when there is an answer.

But I also have one more idea!

Can you remove one of the HD7970 cards, and insert one of the 6990 or the small 5450? Perhaps this helps.

0 Likes

Yeah, that would be helpful. You can find my e-mail address in my profile information. Just send me a short mail, so that I have your address and we can exchange information faster.

Done.

But I also have one more idea!

Can you remove one of the HD7970 cards, and insert one of the 6990 or the small 5450? Perhaps this helps.

Could you tell me why you expect it to help? I don't own the 6990s, I only borrowed them for testing and getting them back now is rather difficult. And the machine is now 2m high on a server rack in our research unit's datacenter for some more serious stress testing. So while it's not impossible, it'd be quite a hassle to try this.

0 Likes

The reason why I hope this could help is that the FASTRA guys also added a different card to make their machine bootable, and on the Windows machine with the 6990s it also works with the one different card.

So perhaps it also works with 7x 7970 and 1x something else.

It is just something you could try. Perhaps it works. It is definitely something you should try. No answer from MS so far.

0 Likes