I'm starting my second project using OpenCL and am curious about others' opinions and experience with buffer size strategy. Here is my question: for kernels that use small amounts of input data, and as a result might require many kernel calls from the host, is it worth setting up a "managed buffer" system that packs the data into larger buffers and handles fragmentation, etc.? Or is it better to just create a lot of smaller buffers as needed? Thanks.
As regards fragmentation, one approach is to allocate a population of buffers on the GPU and then simply re-use them over the lifetime of the app. I've done this, using a circular buffer scheme (for as many as 30 or 40 buffers), for runs over several hours without any trouble.
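A minimal sketch of that re-use scheme, in plain C so it runs without an OpenCL device. The names `buffer_pool`, `pool_slot`, `pool_init` and `pool_acquire` are hypothetical, and `POOL_SIZE` is an assumed value in the 30 to 40 range mentioned above. In real host code each slot would hold a `cl_mem` created once with `clCreateBuffer` at start-up.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 32  /* assumed; the post mentions 30 to 40 buffers */

/* Stand-in for one device buffer. The id field is a hypothetical
   placeholder for the cl_mem handle a real pool would store. */
typedef struct {
    int    id;
    size_t capacity;  /* bytes reserved for this slot */
} pool_slot;

typedef struct {
    pool_slot slots[POOL_SIZE];
    size_t    next;   /* index of the slot to hand out next */
} buffer_pool;

/* Set up every slot once, so nothing is created or released
   while the application runs. */
void pool_init(buffer_pool *p, size_t slot_bytes) {
    assert(p != NULL);
    for (size_t i = 0; i < POOL_SIZE; ++i) {
        p->slots[i].id = (int)i;
        p->slots[i].capacity = slot_bytes;
    }
    p->next = 0;
}

/* Hand out slots in round-robin order. Returns NULL when the
   request is larger than a slot. The caller is responsible for
   making sure earlier work on the returned slot has finished. */
pool_slot *pool_acquire(buffer_pool *p, size_t bytes) {
    assert(p != NULL);
    pool_slot *s = &p->slots[p->next];
    if (bytes > s->capacity) {
        return NULL;
    }
    p->next = (p->next + 1) % POOL_SIZE;
    return s;
}
```

Because every slot has the same fixed size and lives for the whole run, there is nothing to fragment. The cost is that each slot must be sized for the largest request expected.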
Separately, if you have lots of small operations that all run independently, then it's definitely worth packing into super-buffers - e.g. I've done matrix-matrix and matrix-vector multiplications on n=512 sized objects packed into super-buffers for a substantial speed-up.
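One way to lay out such a super-buffer is to give each small object an aligned byte offset inside a single large allocation. The sketch below is plain C and only computes the layout; `round_up` and `pack_offsets` are hypothetical helper names, and the alignment value is an assumption the caller supplies.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align. */
size_t round_up(size_t n, size_t align) {
    assert(align > 0);
    return (n + align - 1) / align * align;
}

/* Compute where each item starts inside one super-buffer.
   sizes[i] is the size of item i in bytes and offsets[i]
   receives its starting byte offset. The gap between the end
   of an item and the next offset is padding. Returns the total
   number of bytes the super-buffer needs. */
size_t pack_offsets(const size_t *sizes, size_t count,
                    size_t align, size_t *offsets) {
    size_t cursor = 0;
    for (size_t i = 0; i < count; ++i) {
        offsets[i] = cursor;
        cursor += round_up(sizes[i], align);
    }
    return cursor;
}
```

The offsets can then be passed to a kernel as arguments, or used as origins for `clCreateSubBuffer`. In the sub-buffer case the origin has to respect the device's `CL_DEVICE_MEM_BASE_ADDR_ALIGN`, which OpenCL reports in bits, so it would need converting to bytes before being used as `align` here.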
The fact is that all these techniques are part of the learning curve: whether to use buffers or textures, what layout shape to use for super-buffers, whether to zero-pad, and so on.
Hi mflamer,
It depends; there is no absolute answer. You can learn a lot from the samples.
Thank you.
Thanks, Wenju. Can you point me to any specific examples that show some of the options here? I'm pretty familiar with OpenCL in general; I just don't have the range of experience to judge what's best for performance in some cases.
Jawed,
Thanks a lot. That was exactly the type of input I was hoping for.