dragonxi4amd · ‎11-29-2011

Partitioning recommendations and available options for the latest AMD GPU?

Hi,

Referring to clCreateSubDevices in http://www.khronos.org/registry/cl/specs/opencl-1.2.pdf

1) CL_DEVICE_PARTITION_EQUALLY

What value of n (compute units per sub-device) does AMD recommend for its newest GPU?

2) CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN

What are AMD's recommendations, based on the cache hierarchy of its newest GPU?

~ Ronnie

nou · ‎11-29-2011

Currently no AMD GPU supports device fission, so this question is irrelevant.

dragonxi4amd · ‎12-01-2011

Sorry, but we and, more importantly, our clients don't think this question is irrelevant.

What is the maximum number of compute units available on the latest AMD GPU at the moment?

NVIDIA could provide this information, what is your problem ?

Does anyone else in this forum know ?

Hopefully this message will not be "merged" (deleted) by AMD, as has happened with our questions (for example, our image object questions were deleted by AMD, probably because images were not yet ready)!

Just give the figure, please !

Ronnie

Raistmer · ‎12-01-2011

In my understanding, HD6xxx should support device fission in hardware; at least it looks so from the hardware description. But this feature is probably still not exposed in AMD's OpenCL runtime, so you can't really use it. That's why the question is "irrelevant". There are GPUs with 22 or even more compute units, but again, with device fission not supported in the runtime you can't partition them.
As usual, AMD's great hardware abilities become just hype because we can't really use them w/o support in software...
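
To make the compute-unit arithmetic concrete, here is a minimal sketch in plain Python (it does not call OpenCL) of what CL_DEVICE_PARTITION_EQUALLY would yield once a runtime exposes it. It follows the OpenCL 1.2 spec wording that leftover compute units are not used when n does not divide the total evenly. The function name `partition_equally` is hypothetical, and the 22-unit figure is just the example mentioned above, not a recommendation.

```python
def partition_equally(max_compute_units, n):
    """Model CL_DEVICE_PARTITION_EQUALLY.

    Returns (sub_device_sizes, unused_units): as many sub-devices of n
    compute units as fit, plus the count of compute units left unused.
    """
    if n <= 0 or n > max_compute_units:
        raise ValueError("n must be between 1 and max_compute_units")
    count = max_compute_units // n            # whole sub-devices that fit
    unused = max_compute_units - count * n    # remainder is not used
    return [n] * count, unused


# Hypothetical 22-compute-unit GPU split into sub-devices of 4 units each.
sizes, unused = partition_equally(22, 4)
print(sizes, unused)  # [4, 4, 4, 4, 4] 2
```

Choosing n as a divisor of the compute-unit count (for example 11 or 2 here) avoids leaving units idle.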

nou · ‎12-01-2011

Yes, the relevant question is when AMD will support device fission on GPUs. Maybe in the upcoming SDK 2.6, but don't hold your breath.

Meteorhead · ‎12-01-2011

Really, these are the type of features the industry needs to become standard before people start using them widely. If I were a game developer, I would be restricted to using constructs that all (both) vendors support, because my app would have to run on all 'modern' machines. HD5000 and below won't disappear for another 3 years, so naturally no game will feature physics with collision detection on the GPU, with a dedicated portion of the GPU reserved for this task, if the SW doesn't support it.

Using the CPU is also neat for calculating such things, but the really cool stuff would be to use the GPU for them.

Meteorhead · ‎12-02-2011

Forgive me for posting a new reply instead of editing, but as Raistmer has said, there are cool features inside the HW that remain hype because we can't use them. Since HD5000 (or even 4000) we've had HW-implemented Global Data Share (GDS) and, through it, the possibility of very efficient Global Wave Sync (GWS). These would be kickass features that we could use to make the world turn the other way around.

Micah started implementing these before SDK 2.5, and I was very much hoping he would finish it. Unfortunately it did not make it into the features list, and I fear that since he's got many more important things to do, he put the project aside.

But really, are there more important things? Now I'm not talking about GWS or GDS specifically, but the reason why AMD is always just "catching" up to CUDA is that OpenCL is the "common denominator" so to say, and there are many features on both sides that the HW is capable of, but cannot be leveraged due to spec restrictions. Really, this is what vendor specific extensions are for on platform and kernel level!

If just for once, AMD could implement something that NV could only emulate, even if they decided to implement it, that could be one more reason, to stick with OpenCL and AMD. Extremely efficient Global Atomics, the possibility to access GDS, and the use of GWS instead of finishing kernel and issuing new kernel launch... These would be kickass features that have been implemented inside the chip for years now.

This is my biggest fear with AMD, that there are too many things to implement, and visibly there are not enough people to do it, so naturally HW becomes obsolete by the time features get implemented (if they are EVER implemented).

I cannot even imagine in my wildest dreams that AMD would implement such a complex feature as UVA, where one buffer could span devices and devices could access each other's memory (although HD7000 would be capable of it, as far as I understood the slides), because I know that there will never be enough capacity to implement kick@ss features like this. But what I meant was that many little features could compensate for the big features that are missing. GDS and GWS could be the first, as they have already been given some thought and had work put into them.

There are still 2 weeks till SDK 2.6. Micah, if you could find the time to finish implementing it, or find someone who has the time, that would be real neat. But yet again we're back at the point where we started: there are so many bugs at present that they have a higher priority. And because there are not enough developers, all are stuck with bug fixing instead of making new features... (and the circle continues)

himanshu_gautam · ‎12-02-2011

Hi All,

Thanks Meteorhead for the generous feedback.

There are a lot of processes the SDK has to go through before it ships.

dragonxi4amd,

You have access to all the compute units of the GPU hardware. It is just that partitioning them is not yet supported. Device partitioning is now part of the core specification (1.2), which increases the hope of getting it.

dragonxi4amd · ‎12-05-2011

Thanks for your feedback, guys!

As I said before, our company is NOT in a position to tell our customers that "we don't know / our partners don't know / schedule is totally unknown / we don't have any idea about GPU hardware features".

Also, we can NOT afford to WAIT for something to become available from any GPU vendor before starting to design and develop.

For example, we are designing apps NOW, trusting that the clCreateSubDevices feature will be available.
Whether AMD can provide the clCreateSubDevices feature when it is needed is AMD's headache; there are other GPU vendors and also those developing other OpenCL-compatible devices. NOBODY is WAITING, whether hardware or software developers!

Referring to OpenCL specs 4.3 Partitioning a Device p. 49

"clCreateSubDevices, creates an array of sub-devices that each reference a non-intersecting set of compute units within
in_device, according to a partition scheme given by properties"
"properties specifies how in_device is to be partition described by a partition name and its
corresponding value"

My question is as follows: can any of the AMD guys listed below give us tips on how to set up properties for AMD GPUs?
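
For reference, the spec text quoted above describes properties as a zero-terminated list: a partition name followed by its value. A minimal sketch of that layout in plain Python follows. The helper names are hypothetical, the strings merely stand in for the cl.h constants of the same name, and nothing here calls OpenCL or reflects an AMD recommendation.

```python
LIST_END = 0  # every properties list is terminated with 0


def equally(n):
    """Properties for sub-devices of n compute units each."""
    return ["CL_DEVICE_PARTITION_EQUALLY", n, LIST_END]


def by_counts(counts):
    """Properties for one sub-device per entry in counts."""
    return (["CL_DEVICE_PARTITION_BY_COUNTS"] + list(counts)
            + ["CL_DEVICE_PARTITION_BY_COUNTS_LIST_END", LIST_END])


def by_affinity_domain(domain):
    """Properties for splitting along a cache or NUMA boundary."""
    return ["CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN", domain, LIST_END]


print(equally(4))
print(by_counts([3, 1]))
print(by_affinity_domain("CL_DEVICE_AFFINITY_DOMAIN_L2_CACHE"))
```

In C the same lists would be arrays of cl_device_partition_property passed to clCreateSubDevices. Which schemes a device actually accepts has to be queried at run time with CL_DEVICE_PARTITION_PROPERTIES.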

AMD Developer Forum File:

Benedict Gaster, AMD
Bill Licea Kane, AMD
Ed Buckingham, AMD
Jan Civlin, AMD
Laurent Morichetti, AMD
Mark Fowler, AMD
Michael Houston, AMD
Michael Mantor, AMD
Norm Rubin, AMD
Ofer Rosenberg, AMD
Victor Odintsov, AMD

How and where can I reach the AMD specifiers listed above?

If not, why not in this forum ?

Note that for apps to scale, the ability to partition devices is mandatory.

And for AMD developers, our message is that in many cases the first OK version does NOT have to be complete and fully tested; sometimes even a skeleton is OK for checking links/builds!

Thanks in advance
Ronnie

himanshu_gautam · ‎12-05-2011

Well, actually device partitioning was an extension as of OpenCL 1.1 and has now moved into the core specification. The OpenCL 1.2 spec has just been released, and no current hardware can be OpenCL 1.2 compliant so early. Implementing OpenCL 1.2 is surely on AMD's roadmap, but the details can't be revealed.
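
Since partitioning can appear either as the 1.1 extension (cl_ext_device_fission, with clCreateSubDevicesEXT) or as the 1.2 core call (clCreateSubDevices), an application designed ahead of runtime support may want to check for both. The sketch below is plain Python over the strings a runtime reports for CL_DEVICE_VERSION and CL_DEVICE_EXTENSIONS. The function name and the sample strings are hypothetical, and the result only says which API is present, not whether a given device can actually be partitioned.

```python
import re


def partition_api(device_version, device_extensions):
    """Return 'core', 'extension', or None for a device's reported strings."""
    # CL_DEVICE_VERSION has the form "OpenCL <major>.<minor> <vendor info>".
    match = re.match(r"OpenCL (\d+)\.(\d+)", device_version)
    if match and (int(match.group(1)), int(match.group(2))) >= (1, 2):
        return "core"
    # CL_DEVICE_EXTENSIONS is a space-separated list of extension names.
    if "cl_ext_device_fission" in device_extensions.split():
        return "extension"
    return None


# Hypothetical strings, for illustration only.
print(partition_api("OpenCL 1.1 ExampleVendor", "cl_khr_fp64"))
print(partition_api("OpenCL 1.1 ExampleVendor", "cl_khr_fp64 cl_ext_device_fission"))
print(partition_api("OpenCL 1.2 ExampleVendor", "cl_khr_fp64"))
```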