3 Replies Latest reply on Aug 22, 2011 10:36 PM by arsenm

    What are these fetches?

    arsenm

      I'm looking at the ISA (Cayman and Cypress) for one of my kernels. Inside a loop I have a single memory read near the beginning, which I can easily identify. After it, a number of additional TEX clauses with VFETCH instructions appear, which I don't understand.

      There are also reads from __constant buffers. As far as I understand, those show up as an ALU clause that locks a range in a CB (with something like KCACHE0(CB5:0-15) at the beginning of the clause, and sources such as KC0[2].z).

      What are these fetches? What's the significance of the fc130, and FETCH_TYPE(NO_INDEX_OFFSET)? How can I prevent these from happening, both to avoid the clause changes and the possible extra fetches that appear?

      // earlier read which I do understand
      08 TEX: ADDR(5780) CNT(1)
          41 VFETCH R13, R0.w, fc175 FORMAT(32_32_32_32_FLOAT) FETCH_TYPE(NO_INDEX_OFFSET)

      // later fetches which I don't understand look like this
      13 TEX: ADDR(5782) CNT(1)
          150 VFETCH R1, R1.z, fc130 FETCH_TYPE(NO_INDEX_OFFSET)

      // On Cypress they look like this
      12 TEX: ADDR(5798) CNT(1)
          146 VFETCH R2, R2.w, fc130 MEGA(16) FETCH_TYPE(NO_INDEX_OFFSET)

        • What are these fetches?
          MicahVillmow
          Dynamic indexing into the constant address space must use the texture unit to load the data. Only static indexes (i.e. known at compile time) use the constant cache. These are the extra loads you are seeing.
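To make the distinction concrete, here is a minimal sketch of the two access patterns described above. It is an illustration, not code from the original kernel: the `SomeStruct` fields, the kernel names, and the host-side shim (`fake_gid` and the empty qualifier macros) are all assumptions, and the shim exists only so the same source can be compiled as plain C. Which ISA each kernel actually produces depends on the compiler version and was not checked here.

```c
/* Sketch only: compiles as OpenCL C, or as plain C through the shim below. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#define __constant const
static int fake_gid = 0;  /* hypothetical stand-in for the work-item id */
static int get_global_id(int dim) { (void)dim; return fake_gid; }
#endif

/* Hypothetical struct; the field names are placeholders. */
typedef struct { float scale; float bias; } SomeStruct;

/* The index is a literal, so it is known at compile time. Per the reply
 * above, this kind of read is eligible for the constant cache (KCACHE). */
__kernel void static_index(__constant SomeStruct *a, __global float *out)
{
    int gid = get_global_id(0);
    out[gid] = a[2].scale * (float)gid + a[2].bias;
}

/* The index arrives as a kernel argument, so it is unknown at compile
 * time. Per the reply above, this kind of read goes through the texture
 * unit and would be expected to show up as a VFETCH in a TEX clause. */
__kernel void dynamic_index(__constant SomeStruct *a, __global float *out,
                            int idx)
{
    int gid = get_global_id(0);
    out[gid] = a[idx].scale * (float)gid + a[idx].bias;
}
```

Both kernels compute the same value when `idx` is 2; the only intended difference is whether the compiler can resolve the index statically.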
            • What are these fetches?
              arsenm

              I don't think I should have any dynamic indexing here. The closest thing I have to dynamic indexing is in an unrolled loop. The number of these clauses actually looks consistent with the number of unrollings * number of uses. Is this an optimization that is not applied to unrolled loops? Alternatively, could this have anything to do with reading from a constant array of structs?

              What I have looks like this:

               

               

              __constant SomeStruct* a (a kernel argument)

              for (...)
              {
                  #pragma unroll N
                  for (i .. N)
                  {
                      // operations involving a[i].structfield1, a[i].structfield2
                  }
              }