3 Replies Latest reply on May 10, 2011 12:28 AM by frali

    [GLSL] Wrong behavior when using simultaneously sampler2D and samplerCube (ATI hardware only)

    pds_civitec
      Driver bug or misuse?

      Hello,

      I have been developing shaders that use texturing functions on Nvidia hardware, and for overall software performance reasons I now need to run them on AMD hardware.

      However, I have run into a problem specific to AMD hardware that I currently believe to be a driver bug, unless I have misunderstood the GLSL specification, so I would be grateful for any advice on the matter.

      To simplify the problem, let us say that I want to render materials that may be textured using either a sampler2D object or a samplerCube object. The texture type to use is passed through a uniform variable, and an if/else statement calls either texture2D or textureCube accordingly.

       

      Here is a brief example of what I intend to do:


       

      #version 120

      uniform sampler2D Texture0;
      uniform samplerCube Envmap0;
      uniform vec4 is_cube;

      void main()
      {
          vec4 texel;
          if (is_cube[0] == 0.0)
              texel = texture2D(Texture0, gl_TexCoord[0].st);
          else
              texel = textureCube(Envmap0, gl_TexCoord[0].xyz);

          gl_FragData[0] = texel;
      }


       

       

      When I do so, all my texture coordinates get completely messed up. Using GPU ShaderAnalyzer, I found something strange in the compiled code (example for a Radeon HD 5870; other cards behave similarly):

       


       

      ; -------- Disassembly --------------------
      00 ALU_PUSH_BEFORE: ADDR(32) CNT(13) KCACHE0(CB0:0-15)
            0  x: INTERP_XY R1.x, R0.y, Param0.x VEC_210
               y: INTERP_XY R1.y, R0.x, Param0.x VEC_210
               z: INTERP_XY ____, R0.y, Param0.x VEC_210
               w: INTERP_XY ____, R0.x, Param0.x VEC_210
            1  x: INTERP_ZW ____, R0.y, Param0.x VEC_210
               y: INTERP_ZW ____, R0.x, Param0.x VEC_210
               z: INTERP_ZW R0.z, R0.y, Param0.x VEC_210
               w: INTERP_ZW ____, R0.x, Param0.x VEC_210
            2  x: CUBE R0.x, PV1.z, R1.y
               y: CUBE R0.y, PV1.z, R1.x
               z: CUBE R0.z, R1.x, PV1.z
               w: CUBE R0.w, R1.y, PV1.z
            3  x: PREDE ____, KC0[0].x, 0.0f UPDATE_EXEC_MASK UPDATE_PRED
      01 JUMP ADDR(3) VALID_PIX
      02 TEX: ADDR(64) CNT(1) VALID_PIX
            4  SAMPLE R2, R1.xy0x, t1, s1
      03 ELSE POP_CNT(1) ADDR(7) VALID_PIX
      04 ALU: ADDR(45) CNT(4) WHOLE_QUAD
            5  t: RCP_e R0.z, |R0.z|
            6  x: MULADD R0.x, R0.x, PS5, (0x3FC00000, 1.5f).x
               y: MULADD R0.y, R0.y, PS5, (0x3FC00000, 1.5f).x
      05 TEX: ADDR(66) CNT(1) VALID_PIX
            7  SAMPLE R2, R0.yxwy, t0, s0
      06 POP (1) ADDR(7)
      07 EXP_DONE: PIX0, R2
      END_OF_PROGRAM


       

      I do not understand the SAMPLE line in clause 02 (SAMPLE R2, R1.xy0x, t1, s1). Why would texture 1 coordinates be used? I asked for only texture 0 coordinates to be used.

       

      Is this a bug, or am I misusing the texture access functions?

       

      If I take the same shader code and simply remove the texture2D call or the textureCube call, texture coordinates are used as I expect.

      I hope I have given enough material to understand the problem; please let me know if additional information is required.

      Thanks for any help.

       

      Philippe

       

      PS: I also tried the more recent texture calls, using the GLSL >= 1.30 syntax, with identical results.

        • [GLSL] Wrong behavior when using simultaneously sampler2D and samplerCube (ATI hardware only)
          frali

          Do you mean that R1.xy0x is not correct? R1 is actually computed by
          0  x: INTERP_XY R1.x, R0.y, Param0.x VEC_210
             y: INTERP_XY R1.y, R0.x, Param0.x VEC_210

          Param0 stands for gl_TexCoord[0], so the disassembly is correct.

            • [GLSL] Wrong behavior when using simultaneously sampler2D and samplerCube (ATI hardware only)
              pds_civitec

              Hello,

              Thanks a lot, frali, for your explanation of the assembly code. I misunderstood the end of the line, assuming that t1, s1 were the texture coordinates associated with texture unit 1.

              I am not very good at assembly code, so I have significant difficulty understanding the computation.

              To put my question another way: I assume that it is allowed to use the same texture coordinate set to access either a samplerCube or a sampler2D object.

              As a first step, I would like to confirm that the disassembly works this way. Could you or somebody else kindly confirm or refute this, or point me to a resource for understanding ATI assembly code?

               

              Here is another strange behaviour that may help pin down the problem:

              Even when only one of the textureCube and texture2D results is actually used in the final output, the texture coordinates are still messed up.

              Here is the code I tested, which still gives erroneous results.


              #version 120

              uniform sampler2D Texture0;
              uniform samplerCube Envmap0;
              uniform vec4 is_cube;

              void main()
              {
                  vec4 texel;
                  if (is_cube[0] == 0.0)
                      texel = texture2D(Texture0, gl_TexCoord[0].st);
                  else
                      vec4 toto = textureCube(Envmap0, gl_TexCoord[0].xyz);

                  gl_FragData[0] = texel;
              }


              What still surprises me is that when I comment out the 'else' part, the disassembly is slightly different (SAMPLE R1, R0.xy0x, t1, s1 becomes SAMPLE R1, R0.xy0x, t0, s0), and the shader once again behaves as expected. It looks like I am really missing something here, unless this really is a bug.

              Thanks once again for any ideas on the matter.

               

              Philippe

                • [GLSL] Wrong behavior when using simultaneously sampler2D and samplerCube (ATI hardware only)
                  frali

                  t1 and s1 are not texture coordinates; they denote texture 1 and sampler 1.

                  When the if/else is present, Texture0 is bound to texture unit 1 and Envmap0 to texture unit 0. That is why the disassembly shows "SAMPLE R1, R0.xy0x, t1, s1" for texture2D; R0 holds the texture coordinate.

                  When the else part is removed, Texture0 is bound to texture unit 0, so the disassembly shows "SAMPLE R1, R0.xy0x, t0, s0".

                  Both are correct and make sense. I hope this is helpful.

                  I don't know whether there is a public guide for understanding the assembly code. I can ask whether one exists.
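
                  Since the unit assigned to each sampler changes with the shader source, it can help to fix the assignment explicitly so the application always knows which unit to bind each texture to. The following is only a minimal sketch, not something proposed in this thread. It assumes GLSL 4.20 (or the ARB_shading_language_420pack extension) for layout(binding = N); the unit numbers, the texCoord input and the fragColor output are illustrative names that do not appear in the original shader.

                  ```glsl
                  #version 420 core

                  // Assumption: unit numbers are illustrative; the application must bind
                  // the 2D texture to unit 0 and the cube map to unit 1 to match.
                  layout(binding = 0) uniform sampler2D   Texture0;
                  layout(binding = 1) uniform samplerCube Envmap0;
                  uniform vec4 is_cube;

                  in  vec3 texCoord;   // hypothetical input standing in for gl_TexCoord[0]
                  out vec4 fragColor;  // hypothetical output standing in for gl_FragData[0]

                  void main()
                  {
                      // is_cube is a uniform, so every fragment takes the same branch.
                      if (is_cube[0] == 0.0)
                          fragColor = texture(Texture0, texCoord.st);
                      else
                          fragColor = texture(Envmap0, texCoord);
                  }
                  ```

                  With GLSL 1.20, as used in the shaders above, the same assignment is made from the application instead, by calling glUniform1i on each sampler's uniform location after linking the program.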