bananafish
Adept I

Are dynamically-indexed subroutine arrays in GL4 on HD5xxx hardware half-possible?

To my astonishment, the following successfully compiles in GLSL 410 on a Radeon HD 5570 using FGLRX-updates 2:9.000 ubuntu:

#version 410

subroutine uniform testSR testSub[2];

/* Code here setting up the subroutines for testSR */

void main()
{
    testSub[gl_VertexID % 2]();
}

However, only the subroutine at array element 0 is ever executed. My understanding is that the threads within a wavefront execute in lock-step and cannot truly branch, so I did not expect this to compile at all. Failing that, I expected it to be compiled to the equivalent of a switch() block, with the subroutines inlined the old-fashioned way. When a constant is passed as the array index, implicit or otherwise, the call for that index works. For example:

void main()
{
    testSub[1]();
}

...correctly executes the subroutine bound at index 1.

Why is this? Is this a bug?

The test code above was inspired by this posting. The wording was confusing but the idea of using dynamically-indexed subroutine arrays seemed impossible in GPU hardware as it would imply that each thread could follow its own instruction path without traditional clause-lockouts.
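For reference, the switch()-style fallback mentioned above could be sketched as follows. This is a minimal illustration, not the code from the test: the function names `pathEven` and `pathOdd`, their bodies, and the `vColor` output are all assumptions.

```glsl
#version 410

// Hypothetical output so the chosen path is observable downstream.
out vec4 vColor;

// Ordinary functions standing in for the subroutines (hypothetical bodies).
vec4 pathEven() { return vec4(1.0, 0.0, 0.0, 1.0); }
vec4 pathOdd()  { return vec4(0.0, 1.0, 0.0, 1.0); }

void main()
{
    // The selector differs between neighbouring vertices, so it is not
    // uniform across the invocations that share a wavefront.
    switch (gl_VertexID % 2) {
    case 0:
        vColor = pathEven();
        break;
    default:
        vColor = pathOdd();
        break;
    }

    // Placeholder position; a real shader would transform a vertex attribute.
    gl_Position = vec4(0.0, 0.0, 0.0, 1.0);
}
```

Because each case calls an ordinary function rather than going through a subroutine uniform, the selection is regular control flow on a per-vertex value, which GLSL permits to diverge between invocations.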

Driver info below, in case this is a bug:

sudo modinfo fglrx_updates
filename:       /lib/modules/3.5.0-18-generic/updates/dkms/fglrx_updates.ko
license:        Proprietary. (C) 2002 - ATI Technologies, Starnberg, GERMANY
description:    ATI Fire GL
author:         Fire GL - ATI Research GmbH, Germany
srcversion:     9C5BC5DE95ACE501DA51B24
...
vermagic:   3.5.0-18-generic SMP mod_unload modversions
gsellers
Staff

Hi,

No, dynamically non-uniform expressions are not supported. The index used to look up into the array of subroutine uniforms is always taken from lane zero of the wavefront. There is no requirement that the shader compiler fail to compile the shader because determining that an expression is not dynamically uniform is virtually impossible. For example, you could read from a texture containing all black and end up with a uniform expression. The shader compiler has no way to know that you're going to do this although it is a (highly contrived) use case.

There is no bug here. The shader is correctly compiled to execute a jump into an indexed array of subroutines, but that index is always taken from lane zero.

In the example you posted, gl_VertexID % 2 is always zero for lane zero of any wavefront, regardless of the size of the draw. If you were, instead, to write gl_VertexID % 3 (and expand the array size appropriately), you would see subroutine zero executed for the first 64 vertices, subroutine 1 executed for the second 64 vertices, subroutine 2 executed for the third 64 vertices and so on.

Cheers,

Graham
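
A sketch of the supported case, where the array index is dynamically uniform, might look like the following. The signature of the subroutine type, the two implementations, the `uSelector` uniform, and the `vColor` output are assumptions added for illustration; the thread does not show the original setup code.

```glsl
#version 410

// Hypothetical subroutine type; the thread does not show its declaration.
subroutine vec4 testSR();

// Two hypothetical implementations, distinguishable by their output color.
subroutine(testSR) vec4 subRed()   { return vec4(1.0, 0.0, 0.0, 1.0); }
subroutine(testSR) vec4 subGreen() { return vec4(0.0, 1.0, 0.0, 1.0); }

// Array of subroutine uniforms, declared as in the original post.
subroutine uniform testSR testSub[2];

// Hypothetical selector, set by the application once per draw call.
uniform int uSelector;

// Hypothetical output so the chosen subroutine is observable downstream.
out vec4 vColor;

void main()
{
    // A plain uniform has the same value in every invocation of the draw,
    // so the index is dynamically uniform. Clamping keeps it in bounds.
    int index = clamp(uSelector, 0, 1);
    vColor = testSub[index]();

    // Placeholder position; a real shader would transform a vertex attribute.
    gl_Position = vec4(0.0, 0.0, 0.0, 1.0);
}
```

On the application side, each element of `testSub` would still need a subroutine bound with `glUniformSubroutinesuiv` before drawing.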

Thanks for clarifying; is there a publicly available PDF which documents these behaviors?

I don't know if it will answer all of your questions, but this whitepaper could be an interesting read.

http://www.amd.com/us/Documents/GCN_Architecture_whitepaper.pdf

Cheers,

Graham