Archives Discussions

albeld · ‎04-27-2011

Running GPU computation in a real-time environment

I'm currently working on a Master's thesis in Computer Engineering, with the main subject of GPU Computing with Real-Time requirements.

As part of my work, I'm evaluating the feasibility of moving computation from a DSP to a GPU in an industrial Monitoring and Control application.

Does AMD/ATI plan to support any Real-Time operating systems through OpenCL or other interfaces? Is there any documentation on the current time consumption of operations through the AMD/ATI OpenCL provider? Is there any specification on deterministic time consumption for AMD/ATI GPUs/Accelerators?

Do the AMD Linux drivers for GPU computation work with an RTLinux/RTAI/... modified kernel?

MicahVillmow · ‎04-27-2011

Except for the last question, to which the answer is "I don't know", the answer to the rest of your questions is no. It isn't that it won't work (though it might not); it is that we don't support or test on these types of systems and don't have any documentation for them. Also, many of the operations in our OpenCL stack are asynchronous when it comes to the GPU, so time consumption can vary drastically with the current system state.

albeld · ‎04-28-2011

Thank you for your reply!

What about lower-level interfaces to the GPU? Does AMD's open source driver commitment extend to the programmable bits of the architecture as well, such that RTOS vendors might provide their own drivers at some point?

ScacPyuf6Ob1 · ‎04-28-2011

Why not just measure these operations under every system condition? Please keep in mind that real-time data processing is a scheduling problem.

I would like to suggest a paper to you:

Uri Verner, Assaf Schuster, Mark Silberstein: Processing Data Streams with Hard Real-Time Constraints on Heterogeneous Systems.

Regards,

S

albeld · ‎04-28-2011

Originally posted by: ScacPyuf6Ob1 Why not just measure these operations under every system condition? Please keep in mind that real-time data processing is a scheduling problem.

Yes, and the most pressing requirement is just that, being able to guarantee schedulability of a periodic task. Measuring the execution times as you suggest depends on the tasks being deterministic, i.e. a guarantee that they behave the same each time they are called or at least in a predictable manner. Is there any such guarantee?

Where can I find documentation on which extra tasks (e.g. memory initialization) may be performed implicitly by the system as opposed to explicitly by the programmer? Could, for example, a call to clmalloc() take longer the first time it's called? Every 64 times? Whenever some backend memory area needs to be enlarged?
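The measurement idea discussed above can be sketched as a small timing harness. This is only an illustration: `allocate_buffer` is a hypothetical stand-in for whatever driver call is being timed (it does not touch OpenCL), and the function names are made up for this example. Note that the largest observed time is just that, an observation; it is not a guaranteed upper bound.

```python
import time

def measure_call_times(operation, repetitions=1000):
    """Call `operation` repeatedly and return each call's duration in seconds."""
    durations = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - start)
    return durations

def summarize(durations):
    """Report the first call separately, since one-time setup costs show up there."""
    ordered = sorted(durations)
    return {
        "first": durations[0],
        "min": ordered[0],
        "median": ordered[len(ordered) // 2],  # upper median for even counts
        "max": ordered[-1],
        "worst_index": durations.index(ordered[-1]),  # which call was slowest
    }

def allocate_buffer():
    # Hypothetical stand-in for the allocation call under test (assumption):
    # allocates 1 MiB of host memory, nothing GPU-related.
    bytearray(1 << 20)

stats = summarize(measure_call_times(allocate_buffer, repetitions=256))
for name, value in stats.items():
    print(name, value)
```

If `worst_index` keeps landing on call 0, or on every 64th call, that hints at implicit work being done by the runtime, but it still says nothing about calls that were never observed.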

Originally posted by: ScacPyuf6Ob1

I would like to suggest a paper to you:

Uri Verner, Assaf Schuster, Mark Silberstein: Processing Data Streams with Hard Real-Time Constraints on Heterogeneous Systems.

Regards,

S

A very good suggestion, Thank you!

ScacPyuf6Ob1 · ‎04-28-2011

Well, if you want guaranteed responses you have to choose different data structures and algorithms to get them. Every execution path must be bounded by a specified WCET (worst-case execution time). I am 99% sure that the current implementation does not target this. It would require a whole new driver architecture and graphics stack implementation.
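To show why a WCET per execution path matters, here is a minimal sketch of classic response-time analysis for fixed-priority preemptive scheduling. It assumes every task has a known WCET and period in integer time units, that deadlines equal periods, and that the task list is ordered from highest to lowest priority; the task values are invented for the example. Without a trustworthy WCET for the GPU path, there is nothing to plug in.

```python
def response_time(tasks, index):
    """Worst-case response time of tasks[index], or None if it misses its deadline.

    tasks: list of (wcet, period) tuples in integer time units, ordered from
    highest to lowest priority. Deadline is assumed equal to the period.
    """
    wcet, period = tasks[index]
    higher_priority = tasks[:index]
    response = wcet
    while True:
        # Each higher-priority task can preempt ceil(response / period) times.
        interference = sum(-(-response // t) * c for c, t in higher_priority)
        next_response = wcet + interference
        if next_response > period:
            return None  # deadline exceeded, task set not schedulable
        if next_response == response:
            return response  # fixed point reached
        response = next_response

# Invented example task set: (wcet, period), highest priority first.
tasks = [(1, 4), (2, 6), (3, 12)]
for i in range(len(tasks)):
    print(i, response_time(tasks, i))
```

The iteration only grows the response time and stops once it passes the period, so it always terminates.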

There are special OpenGL HW/SW implementations (OpenGL Safety Critical) that guarantee these timed responses (they will cost you an arm and a leg), but there is no OpenCL Safety Critical profile yet, at least not publicly. You should ask Khronos about OpenCL SC plans.

Regards,

S

albeld · ‎04-29-2011

Originally posted by: ScacPyuf6Ob1 Well, if you want guaranteed responses you have to choose different data structures and algorithms to get them. Every execution path must be bounded by a specified WCET (worst-case execution time). I am 99% sure that the current implementation does not target this. It would require a whole new driver architecture and graphics stack implementation.

There are special OpenGL HW/SW implementations (OpenGL Safety Critical) that guarantee these timed responses (they will cost you an arm and a leg), but there is no OpenCL Safety Critical profile yet, at least not publicly. You should ask Khronos about OpenCL SC plans.

Regards,

S

Thank you very much, this is exactly the kind of answer I was looking for. I'll get in touch with Khronos.

Thank you both for your helpful information!

MicahVillmow · ‎04-28-2011

albeld,
AMD publicly discloses its register documents here:
http://developer.amd.com/docum...default.aspx#open_gpu

So the documentation is out there to write an RTOS driver, but most likely it would be better to piggyback off the Gallium driver work.

diepchess · ‎04-29-2011

hi,

Originally posted by: albeld I'm currently working on a Master's thesis in Computer Engineering, with the main subject of GPU Computing with Real-Time requirements.

As part of my work, I'm evaluating the feasibility of moving computation from a DSP to a GPU in an industrial Monitoring and Control application.

Most interesting, what sort of DSP?

Does AMD/ATI plan to support any Real-Time operating systems through OpenCL or other interfaces? Is there any documentation on the current time consumption of operations through the AMD/ATI OpenCL provider? Is there any specification on deterministic time consumption for AMD/ATI GPUs/Accelerators?

I'm running the real-time kernel on Linux with OpenCL on the video card. Please realize that you can already regard the current kernels on all GPUs, be it NVIDIA or AMD, as real-time kernels, as they can switch very quickly.

The fastest switch latencies of the Linux real-time kernel are pretty ugly: roughly just under 70 microseconds or so.

Real-time kernels also do really badly in networking, for example; all TCP/IP and even all UDP/RAW traffic goes through a lock in the Linux kernel on every read, write, or packet. So the real-time kernel, too, locks on every action.

In that sense the GPUs are doing better.

There is a problem, however: stability. GPUs are not so stable. They are very delicate and cheap number-crunching hardware. For sustained control and monitoring, using GPUs is very risky.

These cards draw far too much power to even be usable.

Compare it with power lines. The big overhead ones here, which carry between 120 kV and 400 kV, are completely fail-safe in the sense that one can just drop out and then either the other set or some other line takes over.

If we look at what's in the ground, however, it's overclocked a lot. The cables put in the ground 50 years ago to supply small towns or a single small factory were 10 kV cables, but in reality they already put 25 kV on them.

That's outside the original specs, so sometimes they blow up. It seldom happens, but you can't guarantee anything then. They just take the risk. It's high-school-boy stuff with respect to safety, of course.

Same with GPUs. They are totally overpowered. They draw more power than the PCI-E spec allows. We now see reports that NVIDIA's GTX 590 draws 50 watts more than the Radeon 6990 in some games. There are no limits. This draws 450 watts or more.

diepchess · ‎04-29-2011

Bah, again text I wrote disappeared. Very buggy forum.

I'm typing this from OS X using Safari.

One note on Linux. Linux is a totally outdated OS for today's processors, of course, just like all other OSes. They are all monolithic kernels that lock too much.

Sometimes they hack something away. But fundamentally it's totally unsafe: any driver can do anything, control anything, and hack you. Real-time kernel extensions or not, none of those issues is addressed. They don't touch the locking of the UDP/RAW sockets, nor of TCP/IP.

So it's a joke to call that Linux extension a real-time kernel. Compared to that, what the GPUs do is pretty much real-time, as they can't centrally block cores the way the CPUs do all the time.

That's why there are so many drivers from real-time hardware manufacturers to 'fix' that problem in the kernel; they make so much cash with that in the financial world.

So patching the kernel a tad into a real-time kernel is of course not a problem at all for the ATI-AMD drivers.

Yet realize how unreliable GPUs are. It's like many supercomputers. Those supercomputers are very, very unstable platforms. SGI crashed a lot with all its supers, especially when using Intel, but just as well when using the MIPS processors.

GPUs get made for kids and are inherently unstable when used in a sustained manner. It's funny, then, that you are interested in the real-time behaviour of the GPU.

What you cannot fix is the hardware latency from CPU to GPU. That will of course always suck.

Vincent

diepchess · ‎04-29-2011

The PCI-E spec allows something like 300 watts; the 6990 draws over 450 watts or so when used in a sustained GPGPU manner doing 'realtime work'.

So that's far beyond the limits of the spec.

There is no way that any engineer can accept using GPUs in a real-time environment.

They can deliver huge calculation power at a low price, thanks to all the kids and teenagers using them massively to game. The kids already overclock those GPUs too much in software, and they are already clocked too high by the manufacturers in the first place.

How are you *ever* going to convince a manager that a GPU can be used in a sustained manner in a real-time environment?

One thing you can never fix, of course, is the latency of a millisecond or so that you get whenever the CPU has to communicate with the GPU.

Knowing that they brought the real-time latencies of the Linux kernel down from 0.5+ milliseconds to under 70 microseconds, only to then communicate with the GPU at a latency of 1 millisecond, is pretty funny.

Vincent

albeld · ‎05-02-2011

Thank you for your insight, Vincent.

Originally posted by: diepchess

I'm running the real-time kernel on Linux with OpenCL on the video card. Please realize that you can already regard the current kernels on all GPUs, be it NVIDIA or AMD, as real-time kernels, as they can switch very quickly.

The designation "Real-Time" depends not on switching fast (that's quite a moving target!) but on being predictable. Essentially, you don't want to rely on test runs to guarantee that the system will never be too slow. You want to be able to guarantee it with math.
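As one example of what "guarantee it with math" can look like, here is a minimal sketch of the Liu and Layland utilization test for rate-monotonic scheduling of independent periodic tasks. It assumes each task has a known worst-case execution time and a period, with the deadline equal to the period; the task sets below are invented. The test is sufficient but not necessary: a task set that fails it may still be schedulable.

```python
def utilization(tasks):
    """Total processor utilization of a list of (wcet, period) tuples."""
    return sum(wcet / period for wcet, period in tasks)

def liu_layland_bound(task_count):
    """Utilization bound n * (2^(1/n) - 1) for n rate-monotonic tasks."""
    return task_count * (2 ** (1 / task_count) - 1)

def passes_utilization_test(tasks):
    """True if the task set is guaranteed schedulable by the bound."""
    return utilization(tasks) <= liu_layland_bound(len(tasks))

# Invented example task sets: (wcet, period) in the same time unit.
light = [(1, 4), (1, 5)]
heavy = [(2, 4), (3, 6)]
print(utilization(light), liu_layland_bound(len(light)), passes_utilization_test(light))
print(utilization(heavy), liu_layland_bound(len(heavy)), passes_utilization_test(heavy))
```

The point is that the answer comes from the task parameters alone, not from watching the system run; the hard part on a GPU is obtaining execution-time figures that can be trusted as worst cases.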

Originally posted by: diepchess

Yet realize how unreliable GPUs are. It's like many supercomputers. Those supercomputers are very, very unstable platforms. SGI crashed a lot with all its supers, especially when using Intel, but just as well when using the MIPS processors.

GPUs get made for kids and are inherently unstable when used in a sustained manner. It's funny, then, that you are interested in the real-time behaviour of the GPU.

The scientific community seems to disagree. There are plenty of large clusters using GPUs and accelerators based on the same architectures, all running sustained computation. Timing might be unstable, but the results can certainly be said to be otherwise. If you boot up a graphics-intensive game and set the view to one and the same scene (i.e. every frame rendered is roughly the same) and watch it for a while, do you get drops in framerate (unstable performance)? I don't.

Originally posted by: diepchess

What you cannot fix is the hardware latency from CPU to GPU. That will of course always suck.

Sure, as long as the GPU is on the PCI-e bus. What about AMD Fusion and Intel's Sandy Bridge, which include an APU (AMD) or a GPU (Intel) directly on the CPU die? That ought to cut down on the latencies of data transfer, since both can presumably use the same RAM. Admittedly, the Intel solution does not seem to have any GPGPU capabilities, but Fusion is explicitly geared towards heavy number crunching.

Originally posted by: diepchess

How are you *ever* going to convince a manager that a GPU can be used in a sustained manner in a real-time environment?

To be fair, my manager asked me to perform this prestudy. Part of that is looking at future prospects for doing what can't quite be done today (e.g. Hard Real-Time with the support of a GPU).

Edit:

Originally posted by: diepchess

Bah, again text I wrote disappeared. Very buggy forum.

Yes, I received an E-mail update with one of your posts which does not appear here on the forum. I see now that you were referring to the reliability of the actual hardware, and not the processing. That is certainly something to take note of and evaluate before rolling out a system with GPUs in a critical function.

As for the (lacking) accuracy of the results, do you have any sources for your claims? Is it only for overclocked cards? Cards run outside spec? Cards run at excess temperature for longer durations? I already know about the single- vs. double-precision float problem, and any deviations from IEEE 754 are bound to be documented.
