Linuxhippy
Adept I

Different memory types / busses in Kaveri?

Hi,

I am currently evaluating Kaveri+OpenCL for image processing tasks, where we have the following requirements:

- For image data: the highest possible memory bandwidth plus zero-copy (no CPU/GPU coherency or SVM support required).

- For data structures: SVM. We would like to be able to pass data structures containing pointers to the GPU; the highest throughput is not required here.

I tried to find documentation, and at least for earlier APUs there were different memory busses (Onion and Garlic) with very different performance characteristics.

Is there documentation available for Kaveri covering the same topics?

For example, I would be interested in how different memory-mapping flags influence read/write throughput on both sides (GPU and CPU), and which options yield zero-copy.

Thank you in advance, Clemens

4 Replies
dipak
Big Boss

Hi,

You may check the following chapters in the "AMD OpenCL Programming Optimization Guide", which cover various data transfer scenarios and the use of different memory flags.

1.3 OpenCL Memory Objects

1.4 OpenCL Data Transfer Optimization

There are also two SDK samples, GlobalMemoryBandwidth and BufferBandWidth, which are useful for memory bandwidth testing.
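In case it helps while comparing flags: the bandwidth such tests report is just bytes moved divided by elapsed time. Below is a minimal host-only sketch of that arithmetic in plain C. It is not taken from the SDK samples and does not call OpenCL; the helper names (`bandwidth_gbps`, `now_seconds`, `measure_copy_gbps`) are hypothetical, and it assumes a POSIX system with `clock_gettime`.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Convert a byte count and an elapsed time into GB/s (1 GB = 1e9 bytes). */
static double bandwidth_gbps(size_t bytes, double seconds)
{
    if (seconds <= 0.0)
        return 0.0;  /* avoid dividing by zero on very short runs */
    return (double)bytes / seconds / 1e9;
}

/* Wall-clock time in seconds, read from a monotonic clock. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Copy `bytes` bytes from src to dst `iters` times and return the
 * observed CPU-side copy throughput in GB/s. */
static double measure_copy_gbps(void *dst, const void *src,
                                size_t bytes, int iters)
{
    assert(dst != NULL && src != NULL && iters > 0);
    double start = now_seconds();
    for (int i = 0; i < iters; ++i)
        memcpy(dst, src, bytes);
    double elapsed = now_seconds() - start;
    return bandwidth_gbps(bytes * (size_t)iters, elapsed);
}
```

To compare mapping flags on the CPU side, one could pass the pointer returned by clEnqueueMapBuffer as dst or src instead of a malloc'd buffer, for buffers created with different flags (for example CL_MEM_ALLOC_HOST_PTR). This only measures CPU access; GPU-side throughput needs a kernel-based test such as the samples above.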

Regarding the Kaveri APU, here is a nice article http://www.hotchips.org/wp-content/uploads/hc_archives/hc26/HC26-11-day1-epub/HC26.11-2-Mobile-Proce... which may be useful to you.

Regards,

Hi dipak,

Thanks for your response. I already know both documents; however, what I miss in the "AMD OpenCL Programming Optimization Guide" is Kaveri-specific information. Everything is kept very general and non-specific. Also, the link you provided is only a PowerPoint presentation, kept very high-level and intertwined with a lot of marketing material.

Isn't there some processor manual (not limited to the CPU core architecture) which explains how things work in detail?

Something in the style of: http://www.ti.com/lit/ds/symlink/tms320c6678.pdf

Thanks & br, Clemens

what I miss in the "AMD OpenCL Programming Optimization Guide" is Kaveri-specific information. Everything is kept very general and non-specific.

I agree with you that the current optimization guide lacks any specific details related to the new generation of APUs. We've also realized the same. Our concerned team is working on that.

Isn't there some processor manual (not limited to the CPU core architecture) which explains how things work in detail?

I'll check if there is any such manual/reference guide available. Meanwhile, you may check http://developer.amd.com/resources/documentation-articles/developer-guides-manuals/ in case anything there is useful to you.

Regards,

I'll check if there is any such manual/reference guide available

It would be great if there were some documentation describing the busses connecting the various components of the APU, their bandwidths and latencies, and under which circumstances each is used.

We've also realized the same. Our concerned team is working on that

Glad to hear you are working on improving the documentation in this regard, maybe there.

Thanks, Clemens