7 Replies Latest reply on Jan 11, 2011 12:08 AM by frali

    GLSL strange behaviour

    fetti
      Position of variables matters where it shouldn't...

      Hi there!

      I have written a GLSL fragment shader and pretty quickly found out that if (performance_needed) branching_is_something_to_avoid

      So I wanted to be smart. My code is a flow simulation that reads its inputs from sampler2Ds. Among other data stored in those textures, there is a float value named "obst" that is 0.0 for "no obstacle" and 1.0 for "obstacle":

      read = texture2D(Texture3, gl_TexCoord[0].st);
      float obst;
      obst = read.x;

      Now, further down in main() there is an if-clause:

      fOut = fIn - omega * (fIn-fEq);     // works perfectly,
      if (obst==1.0) fOut = fIn[opp];        // but is slow!

      So, in order to avoid that if-clause, I made it a sum, abusing the "obst" value to switch off the term to be neglected:

      fOut = (1.0-obst)*(fIn - omega * (fIn-fEq)) + fIn[opp]*obst;

      And here comes the catch: the above line does _not_ work. I had a hard time finding out that I have to switch the position of "obst":

      ... + fIn[opp]*obst;       // original code, won't work, is equal to ... + 0.0
      ... + obst*fIn[opp];       // works very well

      Is this by design or is it a feature?
      I am on Ubuntu 10.04, Intel Core 2 Duo, ATI Mobility HD 2600

       

        • GLSL strange behaviour
          frali

          Could you please tell me what fIn and fOut are? A vec? An array? It's strange that the following are both valid.
          fOut = fIn;
          fOut = fIn[Opp];

            • GLSL strange behaviour
              fetti

              Pheww... somehow those brackets were taken out? That may explain why it's going italic from that position on...

              Anyway, it's all arrays of floats:

              fOut[k] = fIn[k] - omega * (fIn[k]-fEq);   // works perfectly,
              if (obst==1.0) fOut[k] = fIn[opp[k]];       // but slow

              Uh... let me see if this works. I changed the index from "i" to "k". Obviously the "i" in square brackets is the HTML command for italic.

                • GLSL strange behaviour
                  fetti

                  Aight, k seems to be a good index. Here is the rest of the first message's code:

                  fOut[k] = (1.0-obst)*(fIn[k] - omega * (fIn[k]-fEq)) + fIn[opp[k]]*obst;   // bad
                  fOut[k] = (1.0-obst)*(fIn[k] - omega * (fIn[k]-fEq)) + obst*fIn[opp[k]];   // good

                  To be precise: fEq is a local float (within for-loop), while fIn, fOut, opp and omega are declared globally:

                  float[9] fIn, fOut;
                  int[9] opp;
                  const float omega  = ...

                  Hope that makes more sense now.
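For anyone who wants to check the arithmetic outside the driver, here is a minimal CPU-side sketch in C of the two formulations for a single array element. It is not the shader itself: the function names `relax`, `collide_branch` and `collide_blend` are made up for illustration, and `f_opp` stands in for `fIn[opp[k]]`. It assumes, as stated in the first post, that `obst` is exactly 0.0 or 1.0.

```c
/* Relaxation term shared by both variants: fIn - omega * (fIn - fEq). */
float relax(float f_in, float f_eq, float omega)
{
    return f_in - omega * (f_in - f_eq);
}

/* Branching variant, mirroring "if (obst==1.0) fOut[k] = fIn[opp[k]];". */
float collide_branch(float f_in, float f_opp, float f_eq, float omega, float obst)
{
    float f_out = relax(f_in, f_eq, omega);
    if (obst == 1.0f) {
        f_out = f_opp;   /* obstacle cell: take the opposite direction's value */
    }
    return f_out;
}

/* Branchless variant: obst acts as a 0/1 weight between the two results.
 * Multiplication commutes, so obst * f_opp and f_opp * obst must agree. */
float collide_blend(float f_in, float f_opp, float f_eq, float omega, float obst)
{
    return (1.0f - obst) * relax(f_in, f_eq, omega) + obst * f_opp;
}
```

With finite inputs both functions return the same value for `obst` equal to 0.0 or 1.0, whichever way round the last product is written. If `obst` can take values in between (for example through linear texture filtering), the blend variant interpolates while the branch variant does not, so the two are no longer interchangeable.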

                    • GLSL strange behaviour
                      frali

                      When I compile both variants in a simple shader, I can't see any difference between them. It would be best to post the whole shader here or send it to me at frank.li@amd.com.

                      You should also retry it with the latest driver - Cat10.12. We had a bug with nested indices like "fIn[opp[k]]" long ago. Maybe it's not fixed in Cat10.4.

                        • GLSL strange behaviour
                          fetti

                          Ok. Not today tho. Gonna try to boil this down a little - the shader is rather lengthy and doesn't set up the textures (such as the texture that contains the "obst" values).

                          Thx for looking at it for now.

                            • GLSL strange behaviour
                              fetti

                              Ok, I upgraded to 10.12 and the bug is gone... shame on me, should've done that earlier.

                              I've got another rather boring question about how to make my shader faster. I know that you get loads of those, so I would be fully satisfied with some plausible advice, no in-depth analysis needed.

                              My shader takes 5 textures as inputs (texture units 0-4) and renders into 5 textures via FBO (color attachments 0-4). I do flip those outputs to the inputs, which some call "ping pong shader" or so.

                              I noticed a significant speedup from disabling GL_BLEND before running the shader. Makes sense to me, because it may avoid a costly blend function when writing into the textures.

                              Are there any other switches that _may_ speed up reading from/writing to textures? GL_BLEND gave me a 30% speedup, and looking for more I was disabling and enabling switches quite randomly. Without success, tho.

                              Thx in advance,
                              fetti
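The ping-pong arrangement described in the post above (five input textures, five output textures, swapped after every pass) can be sketched on the host side roughly as follows. This is only an illustration of the bookkeeping, not fetti's actual code: the `PingPong` struct and its helper functions are hypothetical, and no OpenGL calls are made. In a real program the handles returned by `pingpong_inputs` would be bound to texture units 0-4 and those from `pingpong_outputs` attached to color attachments 0-4 of the FBO.

```c
#define NUM_TEXTURES 5  /* five inputs and five outputs, as in the post */

/* Two sets of texture handles; one set is read while the other is written. */
typedef struct {
    unsigned int tex[2][NUM_TEXTURES]; /* hypothetical OpenGL texture names */
    int read_set;                      /* 0 or 1: the set currently used as input */
} PingPong;

/* Textures the shader samples from in the current pass. */
const unsigned int *pingpong_inputs(const PingPong *pp)
{
    return pp->tex[pp->read_set];
}

/* Textures the shader renders into in the current pass. */
const unsigned int *pingpong_outputs(const PingPong *pp)
{
    return pp->tex[1 - pp->read_set];
}

/* After a pass, the freshly written set becomes the next pass's input. */
void pingpong_swap(PingPong *pp)
{
    pp->read_set = 1 - pp->read_set;
}
```

Swapping an index rather than copying texture contents keeps the per-pass cost of the flip negligible; only the bindings change between passes.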