Recently I noticed that one of my compute shaders does not work correctly.
After some trial and error I reduced it to the following code, which still exhibits the fundamental problem:
When more than one SSBO is used, the imageLoad function returns 0 and the imageStore function does not write.
This is the shader:
#version 430
layout(std430, binding=0) buffer A { float a[]; };
layout(std430, binding=1) buffer B { float b[]; };
layout(r32f) uniform image2D image;
layout(local_size_x = 1) in;
void main() {
    // b.length();
    a[0] = -2;
    imageStore(image, ivec2(gl_GlobalInvocationID.xy),
               vec4(a[0], 0, 0, 0));
    a[1] = imageLoad(image, ivec2(gl_GlobalInvocationID.xy)).r;
}
This shader is used in a program that prints the contents of SSBO A and image before and after the dispatch call.
Using the above shader as is, I get the output
textureSize: QSize(2, 2)
INITIAL_BUFFER_VALUE: 2
INITIAL_TEXTURE_VALUE: 1
PRE_READBACK_CPU_TEXTURE_BUFFER_VALUE: 3
SHADER_FILLED_VALUE: -2
GL_MAX_COMPUTE_SHADER_STORAGE_BLOCKS: 8
Buffer values: 2 2
Texture values: 1 1 1 1
glDispatchCompute(2, 2, 1)
glMemoryBarrier(GL_ALL_BARRIER_BITS);
Buffer values: -2 -2
Texture values: -2 -2 -2 -2
which is what I expect.
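To make the expectation explicit, here is a minimal CPU-side sketch of what the shader should compute for the 2x2 dispatch. It is not the attached program: the names State, runReference, kWidth and kHeight are made up for illustration, and the invocations are simply run one after another, which is fine here because every invocation writes the same values.

```cpp
#include <array>

// Hypothetical CPU model of the shader above (not the attached program).
// Assumes a 2x2 r32f image and a two-element SSBO A, as in the printed output.
constexpr int kWidth = 2;
constexpr int kHeight = 2;

struct State {
    std::array<float, 2> a;                     // contents of SSBO A
    std::array<float, kWidth * kHeight> image;  // image texels, row-major
};

// Runs the body of main() once per invocation of glDispatchCompute(2, 2, 1)
// with local_size_x = 1, i.e. one invocation per texel.
State runReference(State s) {
    for (int y = 0; y < kHeight; ++y) {
        for (int x = 0; x < kWidth; ++x) {
            const int texel = y * kWidth + x;
            s.a[0] = -2.0f;            // a[0] = -2;
            s.image[texel] = s.a[0];   // imageStore(image, xy, vec4(a[0], 0, 0, 0));
            s.a[1] = s.image[texel];   // a[1] = imageLoad(image, xy).r;
        }
    }
    return s;
}
```

Starting from buffer values 2 2 and texture values 1 1 1 1, this model ends with every buffer and texture value equal to -2, matching the output above.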
However, when
b.length();
on line 11 is uncommented, the output changes to (the first difference appears after glMemoryBarrier):
textureSize: QSize(2, 2)
INITIAL_BUFFER_VALUE: 2
INITIAL_TEXTURE_VALUE: 1
PRE_READBACK_CPU_TEXTURE_BUFFER_VALUE: 3
SHADER_FILLED_VALUE: -2
GL_MAX_COMPUTE_SHADER_STORAGE_BLOCKS: 8
Buffer values: 2 2
Texture values: 1 1 1 1
glDispatchCompute(2, 2, 1)
glMemoryBarrier(GL_ALL_BARRIER_BITS);
Buffer values: -2 0
Texture values: 1 1 1 1
So the shader is still running (no errors are reported!) but the texture is not being written to (Texture values after dispatch are still 1) and the value that is read from it is 0 (as stored in the second element of the buffer).
I can reproduce this output on Arch Linux using Catalyst 15.5 or 15.7 on my (old) HD 5770.
My questions are:
1. Am I doing something wrong here?
2. Can anyone reproduce this problem? (I have attached the code. It requires Qt5, C++11 and probably GCC)
3. Is this a driver bug?