Consider a triangle made of the following vertices:
#include <cstdint>
#include <vector>

class Vertex
{
public:
    Vertex(const float x_, const float y_, const float z_,
           const uint16_t idx0, const uint16_t idx1)
        : x(x_), y(y_), z(z_), idx{idx0, idx1} {}

public:
    float x;
    float y;
    float z;
    uint16_t idx[2];
};
std::vector<Vertex> vboData;
vboData.emplace_back(0.0f, 0.0f, 0.0f, 0, 0);
vboData.emplace_back(0.5f, 0.0f, 0.0f, 1, 0);
vboData.emplace_back(0.5f, 0.5f, 0.0f, 2, 0);
with this vertex format:
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, stride, (void*)(0));
glVertexAttribIPointer(1, 2, GL_UNSIGNED_SHORT, stride, (void*)(3 * sizeof(float)));
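For completeness, here is a minimal sketch of how this layout could be configured on the C++ side; the VAO/VBO handles are placeholders and stride = sizeof(Vertex) is my assumption, not code taken from the sandbox:

// Hypothetical setup matching the attribute layout above; names are illustrative.
GLuint vao = 0, vbo = 0;
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);

glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER,
             vboData.size() * sizeof(Vertex),
             vboData.data(),
             GL_STATIC_DRAW);

const GLsizei stride = sizeof(Vertex); // 3 floats + 2 uint16_t = 16 bytes, no padding

glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, stride, (void*)(0));

glEnableVertexAttribArray(1);
glVertexAttribIPointer(1, 2, GL_UNSIGNED_SHORT, stride, (void*)(3 * sizeof(float)));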
There is also a buffer of instancing data:
struct Instance
{
    int indices[4];
};
filled this way:
for (int i = 0; i < instanceCount; ++i) {
    instances[i].indices[0] = i;
}
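The vertex shader below reads this data from a shader storage buffer at binding point 6. A minimal sketch of how the instance buffer could be uploaded and bound there (the buffer handle and usage hint are my assumptions):

// Hypothetical SSBO upload; binding point 6 matches the shader below.
// 'instances' is assumed to be a contiguous array of Instance (use .data() for a std::vector).
GLuint ssbo = 0;
glGenBuffers(1, &ssbo);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, ssbo);
glBufferData(GL_SHADER_STORAGE_BUFFER,
             instanceCount * sizeof(Instance),
             instances,
             GL_STATIC_DRAW);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 6, ssbo);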
Then let's draw four instances of the triangle (displacement and scale are hard-coded in the vertex shader for brevity) with the following vertex shader; the color output is passed through to the fragment shader unchanged:
struct InstancingBuffer { ivec4 indices; };

layout(std140, row_major, binding = 6) restrict readonly buffer Instances
{
    InstancingBuffer instances[];
};

layout(location = 0) in vec3 v1;
layout(location = 1) in uvec2 idx;

smooth out vec3 color;

void main() {
    int cInst = instances[gl_InstanceID].indices[0] + int(idx.x);
    color = vec3(vec2(cInst < 2), 0.0);
    vec2 v = v1.xy * 0.2;
    v.x += gl_InstanceID * 0.1;
    gl_Position = vec4(v, 0.0, 1.0);
}
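The draw call itself is not shown above; for four instances it could look roughly like this (program and vao are placeholder handles from the sketches above):

// Hypothetical draw of 4 instances of the 3-vertex triangle.
glUseProgram(program);
glBindVertexArray(vao);
glDrawArraysInstanced(GL_TRIANGLES, 0, 3, 4);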
On Intel, Nvidia and some AMD GPUs we get the following image:
For example, this is what I see on the integrated GPU of my Ryzen 7 4800H and on an Nvidia RTX 2060.
But on a Radeon RX550 with GL_VERSION 4.5.14761 Core Profile Context 21.30.23.01 30.0.13023.1012 I see this image:
So the sum of instances[gl_InstanceID].indices[0] and int(idx.x) differs from what is expected. If we then replace the line
int cInst = instances[gl_InstanceID].indices[0] + int(idx.x);
with the equivalent
int cInst = instances[gl_InstanceID].indices[0];
cInst += int(idx.x);
the image on the Radeon RX550 becomes the same as on the other vendors.
This reproduction is not what the bug originally looked like. Initially it showed up in the bone indexing used for GPU skinning of skeletal models:
int invIdx = instances[instanceIdx + gl_InstanceID].startIndices[0];
int matIdx = instances[instanceIdx + gl_InstanceID].startIndices[1];
mat3x4 m1 = mat34mul(boneMatrices[matIdx + int(boneIndices.x)], bindInverseMatrices[invIdx + int(boneIndices.x)]);
mat3x4 m2 = mat34mul(boneMatrices[matIdx + int(boneIndices.y)], bindInverseMatrices[invIdx + int(boneIndices.y)]);
and the sum
instances[instanceIdx + gl_InstanceID].startIndices[0] + int(boneIndices.x)
yielded zero regardless of the values of instances[instanceIdx + gl_InstanceID].startIndices[0] and int(boneIndices.x). Indexing anything is not even necessary to see the problem: it was enough to output the indices as a color, with some reasonable range mapped to 0..1. I couldn't reproduce exactly that behavior in the sandbox shader, but the sandbox is still informative.
Here is the reproduction executable: https://drive.google.com/file/d/1rl9dRxXH6xLj0TMTUy1LLbd0ha57S4qA/view?usp=sharing
And here is the source code: https://drive.google.com/file/d/1tNStQtQq6VEckKrHcDcdElb9zWhWEWce/view?usp=sharing
I managed to reproduce this problem on all drivers and hardware where my previously reported problem occurred, so both problems may be present in the same drivers, even if they are unrelated.
The sandbox checks for OpenGL errors. In my engine demo app I also used a debug context, and there were no messages from the driver.
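For clarity, the debug-output setup I mean is roughly the following; this is an illustrative sketch, not the sandbox's actual code:

#include <cstdio>

// Illustrative debug-output setup on a debug context (GL 4.3+).
static void APIENTRY debugCallback(GLenum source, GLenum type, GLuint id,
                                   GLenum severity, GLsizei length,
                                   const GLchar* message, const void* userParam)
{
    std::fprintf(stderr, "GL debug message: %s\n", message);
}

// After creating the debug context:
glEnable(GL_DEBUG_OUTPUT);
glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS);
glDebugMessageCallback(debugCallback, nullptr);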