
imageLoad/Store not working when using more than one SSBO in compute shader

Question asked by schulmar on Jul 29, 2015
Latest reply on Jul 30, 2015 by jtrudeau

Recently I noticed that one of my compute shaders does not work correctly.

After some trial and error I reduced it to the following code, which still exhibits the fundamental problem:

When more than one SSBO is used, the imageLoad function returns 0 and the imageStore function does not write.

 

This is the shader:

#version 430

layout(std430, binding=0) buffer A { float a[]; };
layout(std430, binding=1) buffer B { float b[]; };

layout(r32f) uniform image2D image;

layout(local_size_x = 1) in;

void main() {
  // b.length();
  a[0] = -2;
  imageStore(image, ivec2(gl_GlobalInvocationID.xy),
             vec4(a[0], 0, 0, 0));
  a[1] = imageLoad(image, ivec2(gl_GlobalInvocationID.xy)).r;
}

 

This shader is used in a program that prints the contents of SSBO A and of the image before and after the dispatch call.

Using the above shader as is, I get the following output:

 

textureSize: QSize(2, 2)

INITIAL_BUFFER_VALUE: 2

INITIAL_TEXTURE_VALUE: 1

PRE_READBACK_CPU_TEXTURE_BUFFER_VALUE: 3

SHADER_FILLED_VALUE: -2

 

GL_MAX_COMPUTE_SHADER_STORAGE_BLOCKS: 8

Buffer values: 2 2

 

Texture values:

1 1

1 1

 

glDispatchCompute(2, 2, 1)

glMemoryBarrier(GL_ALL_BARRIER_BITS);

 

Buffer values: -2 -2

 

Texture values:

-2 -2

-2 -2

This is what I expect.
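To make the expected result explicit, here is a minimal CPU-side sketch of what the shader should do. It is only a sequential model, not the actual test program: the names State, initialState, invoke and dispatch are made up for illustration, the initial values are taken from the printed output above, and GPU parallelism is ignored (every invocation writes the same values to a[0] and a[1], so the order should not matter).

```cpp
#include <array>
#include <cassert>

// Hypothetical CPU-side stand-in for the GPU resources used by the shader.
struct State {
    std::array<float, 2> a;                     // SSBO A (two floats)
    std::array<std::array<float, 2>, 2> image;  // 2x2 r32f image, indexed [y][x]
};

// Initial contents as printed by the test program:
// INITIAL_BUFFER_VALUE = 2, INITIAL_TEXTURE_VALUE = 1.
State initialState() {
    State s;
    s.a.fill(2.0f);
    for (auto& row : s.image) row.fill(1.0f);
    return s;
}

// One invocation of the shader's main() for gl_GlobalInvocationID.xy = (x, y).
void invoke(State& s, int x, int y) {
    assert(x >= 0 && x < 2 && y >= 0 && y < 2);  // stay inside the 2x2 image
    s.a[0] = -2.0f;
    s.image[y][x] = s.a[0];  // imageStore(image, ivec2(x, y), vec4(a[0], 0, 0, 0))
    s.a[1] = s.image[y][x];  // a[1] = imageLoad(image, ivec2(x, y)).r
}

// Sequential model of glDispatchCompute(groupsX, groupsY, 1) with local_size_x = 1,
// so each work group is exactly one invocation.
State dispatch(State s, int groupsX, int groupsY) {
    for (int y = 0; y < groupsY; ++y)
        for (int x = 0; x < groupsX; ++x)
            invoke(s, x, y);
    return s;
}
```

Calling dispatch(initialState(), 2, 2) leaves both buffer elements and all four texels at -2, which matches the output above.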

 

However, when the call b.length(); on line 11 of the shader is uncommented, the output changes to the following (the first difference appears after glMemoryBarrier):

textureSize: QSize(2, 2)

INITIAL_BUFFER_VALUE: 2

INITIAL_TEXTURE_VALUE: 1

PRE_READBACK_CPU_TEXTURE_BUFFER_VALUE: 3

SHADER_FILLED_VALUE: -2

 

GL_MAX_COMPUTE_SHADER_STORAGE_BLOCKS: 8

Buffer values: 2 2

 

Texture values:

1 1

1 1

 

glDispatchCompute(2, 2, 1)

glMemoryBarrier(GL_ALL_BARRIER_BITS);

 

Buffer values: -2 0

 

Texture values:

1 1

1 1

So the shader still runs (no errors are reported!), but the texture is not written to (the texture values after the dispatch are still 1), and the value read from it is 0 (as stored in the second element of the buffer).

 

I was able to reproduce this output on Arch Linux with both Catalyst 15.5 and 15.7 on my (old) HD 5770.

 

My questions are:

 

1. Am I doing something wrong here?

2. Can anyone reproduce this problem? (I have attached the code. It requires Qt5, C++11 and probably GCC)

3. Is this a driver bug?
