Archives Discussions

mflamer · ‎04-10-2012

I'm starting my second project using OpenCL and am curious about others' opinions and experience with buffer size strategy. Here is my question: for kernels that use small amounts of input data, and as a result might require many kernel calls from the host, is it worth setting up a "managed buffer" system that packs the data into larger buffers and handles fragmentation, etc.? Or is it better to just create a lot of smaller buffers as needed? Thanks.

Wenju · ‎06-08-2012

Hi mflamer,

It depends; there is no absolute answer. You can learn from the samples.

Thank you.

mflamer · ‎06-09-2012

Thanks, Wenju. Can you point me to any specific examples that illustrate some of the options here? I'm pretty familiar with OpenCL in general; I just don't have the range of experience to judge what's best for performance in some cases.

Jawed · ‎06-09-2012

As regards fragmentation, one approach is to allocate a population of buffers on the GPU and then simply re-use them over the lifetime of the app. I've done this, using a circular buffer scheme (for as many as 30 or 40 buffers), for runs over several hours without any trouble.
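The re-use scheme described above could look roughly like the following on the host side. This is only a sketch under assumptions, not the code used in those runs: the `BufferRing` class and the `allocate` callable are hypothetical names, and a `bytearray` stands in for a device buffer so the example runs without a GPU. With a real OpenCL binding, `allocate` would wrap the buffer-creation call.

```python
class BufferRing:
    """Hypothetical ring of pre-allocated buffers, handed out in rotation."""

    def __init__(self, allocate, count, nbytes):
        # All allocation happens once, up front. `allocate` is an assumed
        # callable that returns one buffer of `nbytes` bytes.
        self._buffers = [allocate(nbytes) for _ in range(count)]
        self._next = 0

    def acquire(self):
        # Hand out the next buffer in rotation; nothing is allocated or
        # freed here, so the allocator never sees churn.
        buf = self._buffers[self._next]
        self._next = (self._next + 1) % len(self._buffers)
        return buf


# Stand-in allocator: a bytearray plays the role of a device buffer.
ring = BufferRing(allocate=bytearray, count=4, nbytes=1024)
handed_out = [ring.acquire() for _ in range(6)]

# The fifth and sixth requests wrap around to the first two buffers.
print(handed_out[4] is handed_out[0], handed_out[5] is handed_out[1])
```

One thing the sketch leaves out: before a buffer comes around again, the caller has to be sure that any work still using it has finished, so the ring needs to be at least as deep as the number of operations in flight.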

Separately, if you have lots of small operations that all run independently, then it's definitely worth packing into super-buffers - e.g. I've done matrix-matrix and matrix-vector multiplications on n=512 sized objects packed into super-buffers for a substantial speed-up.
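To make the packing idea concrete, here is a minimal host-side sketch of laying out several small inputs in one super-buffer with zero-padding between them. The function names and the `alignment` value of 256 bytes are assumptions for illustration; the alignment a real device requires is device-specific and should be queried rather than hard-coded. A `bytearray` again stands in for the memory that would be uploaded in a single transfer.

```python
def pack_offsets(sizes, alignment=256):
    """Return (offsets, total_bytes) for items of the given byte sizes.

    Every item starts at a multiple of `alignment` (an assumed value).
    """
    offsets = []
    cursor = 0
    for size in sizes:
        offsets.append(cursor)
        # Round the item's size up to a whole number of aligned slots.
        cursor += -(-size // alignment) * alignment
    return offsets, cursor


def pack(items, alignment=256):
    """Copy each item into one zero-filled super-buffer at its offset."""
    offsets, total = pack_offsets([len(item) for item in items], alignment)
    superbuf = bytearray(total)  # zero-initialised, so gaps are zero-padding
    for offset, item in zip(offsets, items):
        superbuf[offset:offset + len(item)] = item
    return superbuf, offsets


items = [b"a" * 100, b"b" * 300, b"c" * 256]
superbuf, offsets = pack(items)
print(offsets, len(superbuf))
```

A kernel working on the packed data would then receive the offsets (or a fixed stride, if all items are padded to the same size) so that each work-group can find its own item.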

The fact is, all these techniques are part of the learning curve: whether to use buffers or textures, what layout shape to use for super-buffers, zero-padding, and so on.

mflamer · ‎06-10-2012

Jawed,

Thanks a lot. That was exactly the type of input I was hoping for.

Buffer Strategy