1 Reply Latest reply on Aug 12, 2014 1:56 AM by dipak

    nested-if performance penalty if no else

    firespot

      Hi,

       

      Section 6.8.7.4 of the AMD OpenCL Programming Guide for AMD SDK (2.9; that's the version used here) says there are performance penalties for nested if-statements and that "if blocks are nested k levels deep, 2^k nested conditional structures are generated".

      Is this also applicable if an outer if-statement does not have any else-branches, or would this outer-if then not contribute to k?

       

      E.g.:

       

      <code>

      if (someConditionUnknownAtCompileTimeButConstantAtRuntime)

      {

      // complex code here, including loops, nested ifs (these come with else-branches), etc.

      }

      </code>

       

      I further suppose that a goto wouldn't make any difference?

       

      <code>

      if (! someConditionUnknownAtCompileTimeButConstantAtRuntime)

        goto End;

       

      // complex code here, including loops, nested ifs (these come with else-branches), etc.

       

      End:

      </code>

       

      Any other suggestions?

      Thanks!

        • Re: nested-if performance penalty if no else
          dipak

          Hi,

          Branching and wavefront divergence depend on the condition. If all work-items within a wavefront evaluate the condition the same way, there is no divergence at all. Only when they disagree does the wavefront diverge, and in that case each branch path is executed serially.

          During if-else execution, all work-items step through the same instructions, but the hardware masks out the work-items for which the if condition is false. For the else block the same thing happens with the mask inverted. So, if there is no else block, only one branch path is executed.
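          The masking described above can be sketched as a small host-side model in plain C. This is only an illustration, not how any particular compiler or GPU is implemented: it assumes a 64-lane wavefront represented as a 64-bit mask and assumes that a branch body is skipped when no lane needs it. The names `branch_passes`, `cond_mask` and `has_else` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of one wavefront (assumed 64 lanes) reaching
 *     if (cond) { A } else { B }
 * Bit i of cond_mask is set when work-item i evaluates cond as true.
 * has_else is nonzero when the source actually has an else block.
 * Returns how many branch bodies the wavefront has to step through. */
int branch_passes(uint64_t cond_mask, int has_else)
{
    uint64_t exec_if   = cond_mask;   /* lanes enabled for the if body   */
    uint64_t exec_else = ~cond_mask;  /* inverted mask for the else body */
    int passes = 0;

    if (exec_if != 0)
        passes++;                     /* some lane needs the if body */
    if (has_else && exec_else != 0)
        passes++;                     /* some lane needs the else body */

    assert(passes <= 2);              /* a single if/else has at most two paths */
    return passes;
}
```

          In this model a condition that is constant at runtime gives a mask of all ones or all zeros, so the wavefront steps through at most one body whether or not an else block exists. Only a mixed mask combined with an else block costs two serial passes.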

           

          Regarding the "goto": try to avoid goto statements, as they are not efficient on SIMD hardware and can cause irreducible control flow. As per the OpenCL 1.2 spec:

          Whether or not irreducible control flow is illegal is implementation defined.

          So an OpenCL compiler may report an error for "goto" statements.
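          As a minimal sketch in plain C (not OpenCL C, and not a statement about what any compiler generates), the goto form from the question and a plain if without else compute the same result, so the structured form can be used instead. `complex_work`, `with_goto` and `with_if` are hypothetical names; `complex_work` merely stands in for the "complex code" in the question.

```c
#include <assert.h>

/* Hypothetical stand-in for the "complex code" in the question. */
static int complex_work(int x)
{
    return x * 2 + 1;
}

/* Form from the question: jump over the body with goto. */
int with_goto(int enabled, int x)
{
    int result = x;
    if (!enabled)
        goto end;
    result = complex_work(x);
end:
    return result;
}

/* Structured form: a plain if without an else block. */
int with_if(int enabled, int x)
{
    int result = x;
    if (enabled)
        result = complex_work(x);
    return result;
}
```

          Both functions return the same value for every input, and the second one contains no jump that a compiler would have to analyse for reducibility.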

           

          Regards,