
GL4.3 Storage Buffer bug?

Question asked by piranha on Jul 24, 2013
Latest reply on Jul 24, 2013 by piranha



We are using compute shaders to calculate fractal noise. The problem is that we can't process inputs larger than 4096 vectors. Once we exceed that limit, the shader either returns the same value for all remaining elements or skips them.

We made a video that shows this as visual output:

(external link: amd compute shader bug - YouTube)


As you can see, it breaks as soon as the resolution exceeds 64x64 (4096 elements).

When we have no input buffer (i.e. we just generate random test values in the shader and write them to the output buffer), we don't get these problems at all. So I'm pretty sure it's an input buffer bug.


Here's how we run it from input to the finished result:

        public float[] GetValues(Vector4[] input) { // Takes Vec4 1D array as input. Finished noise is 1D float array as output
            // Generate Input Buffers
            int inBuffer = GL.GenBuffer(); // First buffer contains the vec4 data and is our "problem child"
            GL.BindBuffer(BufferTarget.ArrayBuffer, inBuffer); // No difference using ArrayBuffer or ShaderStorageBuffer
            GL.BufferData(BufferTarget.ArrayBuffer, new IntPtr(Vector4.SizeInBytes * input.Length), input, BufferUsageHint.StaticDraw);
            GL.BindBufferBase(BufferTarget.ShaderStorageBuffer, 0, inBuffer); // Bind buffer to shader location 0

            int inPermBuffer = GL.GenBuffer(); // Second input is permutation data; it is only about a KB in size and causes no problems
            GL.BindBuffer(BufferTarget.ArrayBuffer, inPermBuffer);
            GL.BufferData(BufferTarget.ArrayBuffer, new IntPtr(sizeof(int) * permutation.Length), permutation, BufferUsageHint.StaticDraw);
            GL.BindBufferBase(BufferTarget.ShaderStorageBuffer, 1, inPermBuffer); // Bind buffer to shader location 1

            // Generate Output Buffer
            float[] result = new float[input.Length];
            int outBuffer = GL.GenBuffer(); // The buffer which contains the result
            GL.BindBuffer(BufferTarget.ArrayBuffer, outBuffer);
            GL.BufferData(BufferTarget.ArrayBuffer, new IntPtr(sizeof(float) * input.Length), result, BufferUsageHint.StaticCopy);
            GL.BindBufferBase(BufferTarget.ShaderStorageBuffer, 2, outBuffer); // Bind buffer to shader location 2
            // Start compute
            GL.DispatchCompute((int)Math.Ceiling(input.Length / 256.0), 1, 1);
            // Getting pointer to result data
            IntPtr outBufferPointer = GL.MapBuffer(BufferTarget.ShaderStorageBuffer, BufferAccess.ReadOnly);
            // Copy the result to our managed result variable;
            Marshal.Copy(outBufferPointer, result, 0, input.Length);
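            // Exiting buffer access
            GL.UnmapBuffer(BufferTarget.ShaderStorageBuffer);
            // Clean up
            GL.DeleteBuffer(inBuffer);
            GL.DeleteBuffer(inPermBuffer);
            GL.DeleteBuffer(outBuffer);

            return result;
        }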


It's C# code, but it should be easy to read for C++ guys anyway.
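One thing the listing above does not do is issue a memory barrier between `GL.DispatchCompute` and `GL.MapBuffer`. OpenGL 4.3 treats shader writes to storage buffers as incoherent, so a readback through a mapped pointer is only guaranteed to see them after `glMemoryBarrier` with `GL_BUFFER_UPDATE_BARRIER_BIT`. Below is a minimal sketch of that readback path, assuming OpenTK's `OpenTK.Graphics.OpenGL4` namespace, a current GL 4.3 context, and a compute program that is already in use with its buffers bound. `ComputeReadback` and `ReadBack` are hypothetical names, and whether the missing barrier has anything to do with the AMD behaviour described here is untested.

```csharp
using System;
using System.Runtime.InteropServices;
using OpenTK.Graphics.OpenGL4; // assumption: the OpenGL4 bindings are the ones in use

static class ComputeReadback
{
    // Hypothetical helper: runs the currently bound compute program and
    // copies `count` floats out of `outBuffer`.
    public static float[] ReadBack(int outBuffer, int count, int groupsX)
    {
        GL.DispatchCompute(groupsX, 1, 1);

        // Make the shader's buffer writes visible to the mapping below.
        GL.MemoryBarrier(MemoryBarrierFlags.BufferUpdateBarrierBit);

        // Bind explicitly so the map does not depend on earlier binding order.
        GL.BindBuffer(BufferTarget.ShaderStorageBuffer, outBuffer);
        IntPtr ptr = GL.MapBuffer(BufferTarget.ShaderStorageBuffer, BufferAccess.ReadOnly);

        float[] result = new float[count];
        if (ptr != IntPtr.Zero)
        {
            Marshal.Copy(ptr, result, 0, count);
            GL.UnmapBuffer(BufferTarget.ShaderStorageBuffer);
        }
        return result;
    }
}
```

Called as `ComputeReadback.ReadBack(outBuffer, input.Length, groups)`, this would stand in for the dispatch and map steps of `GetValues`. The barrier is required by the specification regardless of vendor, so a driver that happens to work without it proves little either way.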


Here is how we declared our buffers in GLSL and how we access them in the main function:



#version 430 core

struct vertex {
  vec4 pos;
};

layout(std430, binding = 0) readonly buffer iBuffer {
  vertex Vectors[];
};

layout(std430, binding = 1) readonly buffer pBuffer {
  int Permutation[];
};

layout(std430, binding = 2) writeonly buffer oBuffer {
  float Output[];
};

layout (local_size_x = 256) in;

void main() {
    vec4 in_pos = Vectors[gl_GlobalInvocationID.x].pos;
    Output[gl_GlobalInvocationID.x] = /*Tons of instructions using in_pos here*/  ;
}
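With a local size of 256, the number of work groups and any surplus invocations follow directly from the element count. The sketch below works through that arithmetic in plain C#; `DispatchMath`, `GroupCount` and `ExcessInvocations` are hypothetical helper names, not part of our code or of OpenTK.

```csharp
using System;

static class DispatchMath
{
    // Work groups needed to cover `count` elements (integer ceiling division).
    public static int GroupCount(int count, int localSize = 256)
    {
        return (count + localSize - 1) / localSize;
    }

    // Invocations launched past the last valid element. Each of these
    // would index beyond the end of the input and output arrays.
    public static int ExcessInvocations(int count, int localSize = 256)
    {
        return GroupCount(count, localSize) * localSize - count;
    }
}
```

For 64x64 = 4096 elements this gives 16 groups and no surplus. For 65x65 = 4225 elements it gives 17 groups and 127 surplus invocations, which read and write out of bounds unless the shader returns early, for example when `gl_GlobalInvocationID.x >= Vectors.length()`.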


We have been experimenting for 2 days now to get this working on AMD cards (NV -> no problems at all).

Are we using shader storage buffers wrong, or is this really a memory bug on AMD cards?


Hardware: 7970

Driver: Tested with latest stable and latest beta driver.


I appreciate all comments