I want to run 20 Monte Carlo simulations in parallel on an HD 5870 card.
Because the simulation is complex, with many branches, loops, and random behavior, I think one execution per stream processor is OK.
As the HD 5870 card has 20 stream processors, I would get 20 parallel runs.
I hear that the new 5870 chip can run different kernels at the same time.
But how can I do this?
Is there a demo that runs several kernels at the same time?
Sorry for my bad English.
Thanks in advance.
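The plan above (20 independent runs, one per hardware unit) can be sketched on the CPU to show the structure. This is only an illustration under stated assumptions: the pi estimate stands in for the real simulation, the names `run_simulation` and `run_all` are hypothetical, and the threads merely stand in for the GPU's units (CPython threads do not actually run this loop in parallel). The point is that each run owns its own seeded random generator, so the runs share no state and could be mapped one per unit.

```python
import random
from concurrent.futures import ThreadPoolExecutor

NUM_RUNS = 20             # one run per hardware unit, as in the post
SAMPLES_PER_RUN = 10_000  # arbitrary size chosen for this sketch


def run_simulation(seed):
    """One independent Monte Carlo run (placeholder: estimate pi)."""
    rng = random.Random(seed)  # private generator, no shared state
    hits = 0
    for _ in range(SAMPLES_PER_RUN):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:  # branchy, data-dependent work
            hits += 1
    return 4.0 * hits / SAMPLES_PER_RUN


def run_all(base_seed=1234):
    """Launch all runs; each gets a distinct seed derived from base_seed."""
    seeds = [base_seed + i for i in range(NUM_RUNS)]
    with ThreadPoolExecutor(max_workers=NUM_RUNS) as pool:
        return list(pool.map(run_simulation, seeds))


results = run_all()
print(len(results), sum(results) / len(results))
```

Because every run depends only on its seed, repeating `run_all()` with the same base seed reproduces the same 20 results, which helps when comparing a CPU reference against a GPU port.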
Originally posted by: egonotto I hear, that the new 5870 Chip has the ability to run different kernels at the same time.
Hi, egonotto. Where did you hear that?
Concurrent kernel execution is a key feature of Nvidia's 'Fermi', while AMD does not seem to mention it.
I am one of those hunting for this new feature.
I have two sources.
In the German magazine c't (http://www.heise.de/ct/artikel/Fermis-goldene-Regel-811487.html), an article about the new Fermi contains this sentence:
"According to AMD's Director of Stream Computing, Patricia Harrell, parallel execution is said to be possible on ATI's RV8xx as well; the documents published so far, however, contain not a word on the topic of 'concurrent kernels'. Perhaps the feature is simply taken for granted at ATI." (translated from German)
The information in c't comes from AMD's Director of Stream Computing, Patricia Harrell.
It should be a good source.
The other is an online article, also about Fermi (http://techreport.com/articles.x/17670/2).
It contains this sentence:
"(Incidentally, AMD tells us its Cypress chip can also run multiple kernels concurrently on its different SIMDs. In fact, different kernels can be interleaved on one SIMD.) "
It's actually official that AMD's 5800 cards support concurrent kernel execution:
I find it interesting though that AMD didn't step up to respond to this question. Why would they be hush hush about a superior feature their product has???
BTW, AFAIK OpenCL makes no assumption about the number of kernels executed concurrently on a device. If the command queue is in out-of-order execution mode, the runtime is free to issue multiple kernel commands at the same time (provided they are not waiting on an event).
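In OpenCL host code, that mode is requested by passing `CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE` when creating the queue with `clCreateCommandQueue`, and ordering is then expressed through the event wait list of calls such as `clEnqueueNDRangeKernel`. The rule itself can be shown with a toy model. This is a sketch only, not the OpenCL API: the `Command` class and `issue_waves` function are hypothetical names made up for illustration, and a real runtime may still serialize commands that this model marks as ready.

```python
from dataclasses import dataclass, field


@dataclass
class Command:
    """A queued kernel launch plus the events it must wait on."""
    name: str
    wait_for: list = field(default_factory=list)  # names of earlier commands


def issue_waves(commands):
    """Group commands into waves.

    Every command in a wave has all of its dependencies finished, so an
    out-of-order queue would be allowed to issue them at the same time.
    Command names are assumed to be unique.
    """
    done = set()
    pending = list(commands)
    waves = []
    while pending:
        ready = [c for c in pending if all(w in done for w in c.wait_for)]
        if not ready:
            raise ValueError("circular or unknown event dependency")
        waves.append([c.name for c in ready])
        done.update(c.name for c in ready)
        pending = [c for c in pending if c.name not in done]
    return waves


commands = [
    Command("sim_a"),                               # no wait list
    Command("sim_b"),                               # no wait list
    Command("reduce", wait_for=["sim_a", "sim_b"]), # waits on both events
]
waves = issue_waves(commands)
print(waves)  # [['sim_a', 'sim_b'], ['reduce']]
```

The two simulation kernels have no wait list, so they land in the same wave, while the reduction is held back until both events complete. Whether a given device actually overlaps the first wave is up to the driver and hardware, which is exactly the open question in this thread.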
AMD was quiet about this (concurrent kernel) feature probably because their CAL driver doesn't support it yet. However, I believe it is necessary for things like Eyefinity to work.
I remember now: it exists in DirectX 11 as better multi-threading for graphics rendering.
I still have to check it in DirectCompute, though; the whole DirectX interface is a different universe to me...
So, any updates? I am very interested in how the R8xx processes kernels concurrently, as a "YES" in a slide isn't enough, you know.
I mean, for example, how many kernels per SIMD (IIRC, Fermi does 2), and whether the programmer can control such behavior, maybe like egonotto's way, or whether there is a more optimized way, if any.
Sorry for being a "??????", but with all those GFLOPS blazing, one gets very curious. Speaking of curiosity, when can we expect an R8xx ISA reference?
You mentioned you were waiting for confirmation from the hardware engineers. Any word? I think quite a few of us are wondering whether not only the 58xx cards but also, say, the 57xx cards, or any other R8xx GPUs, support concurrent kernel execution. This would make ATI GPUs very attractive compared with Fermi, especially if even the lower-end 57xx cards support it. This is a major selling point for ATI, so information on support for concurrent kernel execution should be easily accessible to developers and system builders 🙂
(Edited because I misread Micah's name.)
I have been going through the R7xx ISA, and I suppose it's 4 for R8xx: R7xx has odd and even wavefronts (so 2 in flight), and RV870 looks like two RV770s stuck together, so following the "two of everything in Cypress" theme, IMHO (again, a guess) the magic number is 4.
The only case in which it would still be 2 is if the thread scheduler was excluded from that doubling.
Kinda off-topic: it's becoming quite amusing to dig for facts in ATI's ISA documents!