firespot
Adept I

suggested user driver update policy

Hi,

What is AMD's 'general' recommendation regarding OpenCL driver updates? Stay with the current driver if it is working, or update to the newest?

To give more specific reasoning for that question: I have a scientific application which is run on 4 platforms for testing purposes (but only on Tahitis in production mode):

C / C++ (VS 2012)

OpenCL - AMD GPU (Tahiti) + AMD CPU platform (CPU is Intel's though); Catalyst 13.9 / SDK 2.9

Intel - CPU (SDK 2013)

Nvidia - GPU

Code is always the same, where a few macro tweaks make the code C/C++ and OpenCL compatible.

The program runs fine under C/C++, on the Tahitis and on the Nvidia GPU. An earlier version always crashed on the AMD CPU implementation, and the latest version now also crashes under the Intel CPU implementation (at a point very similar to the AMD implementation). I have invested a _lot_ of time into finding the reason for the crash (it is 100% reproducible), and at least for the Intel implementation I concluded that it's due to bad code generation (incidentally, are both AMD and Intel using LLVM, so that any bugs therein may propagate to both?). I have also observed that in general (various platforms are affected, though in different ways) the option -cl-opt-disable seems to cause more trouble.

I hesitate upgrading current drivers (e.g. the Intel from 2013 to 2014 to test if that works) because in the past I have experienced that when multiple OpenCL platforms are available on a system this is quite a sensitive constellation, and upgrading one driver can break overall binary compatibility for other platforms as well (been there, done that). Has anyone experienced anything similar? What's the recommendation with respect to that?

Most importantly now: a new production system is being set up. I can stick with 13.9, which has not caused trouble on the Tahitis yet, or upgrade to 14.4 (or even the newest driver with 2.0 support once it is officially out). Googling easily turns up reports of people having trouble with one driver version or another (whether that is due to faulty programs or faulty drivers I cannot judge). What's AMD's 'official' suggestion here? Stick with a driver if it works for you, or upgrade to the newest in the hope of getting other problems fixed, at the risk that a functioning program then stops working?

thanks!

jtrudeau
Staff

In general, I'd say there is no "official" AMD recommendation here. There is, however, good common sense. And, I do have some guidance. I pinged engineering to get some insight.

First

We would love to have a detailed report on the 100% reproducible crash. You can send me a direct message, and I'll pass the information to the engineering team. To keep the back-and-forth through me to a minimum, we need the things you would typically provide in a defect report:

  • your configuration (OS, hardware, versions, etc)
  • what you expected to happen and what actually happened
  • any logs related to the crash or the defect
  • a reproducible case - which might mean sample code (I know that can be a problem)

My personal advice on how to handle your situation is:

  • don't fix what isn't broken - so in general stay with what's working BUT...

If there is significant new capability that you want to use, then you decide if it's worth moving forward, which, as you say, for you means modifying a sensitive constellation of potentially incompatible drivers from multiple vendors. Any company can only control its own products. When you mix and match drivers, conflicts are certainly possible. We always try to improve our drivers, both by fixing functional bugs and by improving performance, and I'm sure the same is true for other companies. Unfortunately, it’s not uncommon for new software to introduce regressions. It is really your call to evaluate and manage the risk of a driver update, depending on your tolerance for hunting down problems.

Do the update when you have time to explore your entire system and determine what might be causing problems, if any arise. Keep in mind, of course, that an incompatibility between drivers doesn't mean the newest one is at fault, or the older one. You may want to avoid betas and stick with final releases, unless you're being very experimental.

Finally, advice from our engineering team. OpenCL supports the co-existence of multiple OpenCL platforms. The key is to ensure that the ICD (installable client driver) library being loaded by the system (libOpenCL.so or OpenCL.dll) supports the latest OpenCL standard among all the platforms. For example, if a user wants to have an AMD GPU along with an Nvidia GPU on the same system, it’s recommended to use the ICD from AMD, since the AMD platform supports a more recent version of OpenCL. This isn't going to guarantee success, but it should minimize problems, based on the simple principle that a driver cannot be forward compatible, but the latest driver should be backward compatible.
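
To make the "latest standard among all the platforms" check concrete, here is a minimal sketch in C. It assumes you have already collected each platform's version string, for instance by calling clGetPlatformInfo with CL_PLATFORM_VERSION, which the OpenCL specification defines as starting with "OpenCL <major>.<minor> ". The helper names parse_cl_version and newest_platform are made up for this illustration and are not part of any SDK. The sketch deliberately does not call into OpenCL itself, so it builds without any vendor headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: extract <major>.<minor> from a CL_PLATFORM_VERSION
 * style string such as "OpenCL 1.2 <vendor-specific text>".
 * Returns 1 on success, 0 if the string does not have that prefix. */
int parse_cl_version(const char *s, int *major, int *minor)
{
    return s != NULL && sscanf(s, "OpenCL %d.%d", major, minor) == 2;
}

/* Hypothetical helper: return the index of the platform reporting the
 * highest OpenCL version, or -1 if no string could be parsed.
 * On a tie, the first platform with that version wins. */
int newest_platform(const char *const versions[], size_t count)
{
    int best = -1, best_major = -1, best_minor = -1;

    assert(versions != NULL || count == 0);

    for (size_t i = 0; i < count; ++i) {
        int major, minor;
        if (!parse_cl_version(versions[i], &major, &minor))
            continue; /* skip strings that do not follow the spec format */
        if (major > best_major ||
            (major == best_major && minor > best_minor)) {
            best = (int)i;
            best_major = major;
            best_minor = minor;
        }
    }
    return best;
}
```

If you pass it the version strings of an AMD and an Nvidia platform, the returned index points at the vendor whose platform reports the newer standard. Per the advice above, that vendor's ICD library is the one you would want the system to load. Treat this as a starting point for a sanity check after each driver change, not as a guarantee of compatibility.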

I hope this helps provide some insight.
