9 Replies Latest reply on Jan 22, 2010 7:25 PM by Mikey

    How to get more build info

    Mikey

      Hi. I have CL code that works fine on the CPU but does not work on the GPU. My CL code has gotten quite big by now, so it's useless to post about 1k lines here...

      I'm getting this output:

      Stack dump:
      0.      Program arguments: C:\Program Files (x86)\ATI Stream\bin\x86\llc -mcpu=atir770 -mattr=mwgs-3-256-1-1 -regalloc=linearscan -march=amdil C:\Users\Mikey\AppData\Local\Temp\OCLA304.tmp.bc -f -o C:\Users\Mikey\AppData\Local\Temp\OCLA304.tmp.il
      1.      Running pass 'AMDIL Load Store Setup Pass' on function '@__OpenCL_main_kernel'
      005410A5 (0x01B2E454 0x52DC4749 0x01B2E454 0x01A42F10)
      00541162 (0x0080D8A0 0xFFFFFFFF 0x00000000 0x01B99FC4)
      00598CD1 (0x01A6CBD8 0x0119FD84 0x0079CF29 0xFFFFFFFF)
      00593D5B (0x01B2E400 0x00000000 0x01B2E4A0 0x01A42F10)
      00541B44 (0x01A6FD4C 0x01B2E454 0x00000000 0x012F00B8)
      Compilation log: C:\Users\Mikey\AppData\Local\Temp\OCLA17D.tmp.cl(956): warning
      : null (zero)
                character in input line ignored
        }
         ^

      How can I find out what's wrong? Can I somehow tell the compiler not to delete those files?

       

      --- btw.

      I have a weird problem. At some point in the code I have to insert a meaningless function call (like cos(1)) - without it the kernel returns a wrong result...

        • How to get more build info
          genaganna

           


           

          Please send your code to streamdeveloper@amd.com. Please also send your system configuration (OS, CPU, GPU, SDK version, and driver version).

            • How to get more build info
              Mikey

               


               

              OK.

              I think it could be a problem with memory. I have a struct with uint a[64] and uint b[32], and in both cases it crashes (on the GPU) when I use an index i where i >= 16.

              Is it not allowed to have more than 16 elements in an array?

                • How to get more build info
                  omkaranathan

                   

                  Originally posted by: Mikey

                   

                  Is it not allowed to have more than 16 elements in an array?

                   

                  It is allowed. Could you provide a test case which reproduces your problem?

                    • How to get more build info
                      Mikey

                      I couldn't reproduce the same output with a simpler case. However, I could reproduce something else: 'Link failed'.

                       

                       

                       

                      typedef struct S {
                          uint bitLength[32];
                          ulong8 hash;
                      } S;

                      void f(struct S * const sptr)
                      {
                          int size = 32; // change this to 16 - no crash
                          for (int i = 0; i < size; i++)
                              sptr->bitLength[i] = 0U; // without this line size can be greater than 16
                          sptr->hash = (ulong8)(1UL);
                      }

                      kernel void main()
                      {
                          struct S s;
                          f(&s);
                      }

                        • How to get more build info
                          omkaranathan

                          Hi Mikey,

                          I had a look at the test case you sent. In your code, as shown below, you are trying to use array indexing to access a vector element, which is illegal.

                          ulong8 block; // mu(buffer)
                          ulong8 state; // the cipher state
                          ulong8 L;
                          unsigned int *buffer = sp->buffer;
                          for (int i = 0; i < 8; i++, buffer += 8) {
                              block[i] = (((ulong)buffer[0] ) << 56)
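
One way around the illegal vector indexing can be sketched as follows. This is a minimal illustration in plain C, not the poster's actual code: the words are assembled in an ordinary 8-element array, where `[]` indexing is well defined, instead of in a vector value. The helper name `pack_block`, the use of `uint64_t` in place of OpenCL's `ulong`, the masking to 8 bits, and the byte order beyond the first shift are all assumptions, since the quoted snippet is cut off after `<< 56`. In OpenCL C the same loop could fill a `ulong block[8]` array, which can then be loaded into a `ulong8` with `vload8(0, block)`; alternatively the vector components can be addressed individually as `.s0` through `.s7`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: packs 64 byte-sized values from `buffer` into
 * eight 64-bit words. Each word takes 8 consecutive entries, most
 * significant byte first (assumed; the original snippet is truncated). */
void pack_block(const unsigned int *buffer, uint64_t block[8])
{
    assert(buffer != NULL && block != NULL);
    for (int i = 0; i < 8; i++, buffer += 8) {
        uint64_t word = 0;
        for (int j = 0; j < 8; j++) {
            /* Keep only the low 8 bits of each entry, then shift it in. */
            word = (word << 8) | (uint64_t)(buffer[j] & 0xFFu);
        }
        block[i] = word; /* indexing a scalar array is always legal */
    }
}
```

Building the words in a scalar array keeps the loop index usable and leaves the conversion to a vector type as a single, explicit step.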

                            • How to get more build info
                              Mikey

                              Thank you, omkaranathan.

                              I had already found that mistake - I thought that using [] was OK because it had been working in most cases (especially on the CPU). But now I see that it was just by sheer chance (a struct whose values happened to be allocated right next to one another).

                                • How to get more build info
                                  genaganna

                                   


                                   

                                  Mikey,

                                          Are you facing any more issues after solving that one?

                                    • How to get more build info
                                      Mikey

                                       


                                       

                                      No, I have successfully completed my program. However, it runs rather slowly - up to 10 times slower on the GPU than on the CPU - so I will keep trying to make it faster.

                                      I'm aware that the SDK isn't finished yet, so I'm not giving up easily.

                                      Thanks for your interest!

                                        • How to get more build info
                                          Mikey

                                          Ok, I have to take those words back!

                                          I've just read another thread about optimization and decided to check whether my local group size was OK. Well, it wasn't. But now it is, and that makes my program 24 times faster. I have to admit, that's quite awesome!
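
The thread does not say which local group size was finally chosen, so the following is only a related sketch. In OpenCL 1.x the global work size passed to clEnqueueNDRangeKernel must be a multiple of the local work size in each dimension, so when experimenting with different local sizes it can help to round the global size up on the host. The helper name `round_up_global` is hypothetical, and the kernel would then need its own bounds check (for example comparing `get_global_id(0)` against the real item count) so that the padding work-items do nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest multiple of `local` that is >= `items`. */
size_t round_up_global(size_t items, size_t local)
{
    assert(local > 0);
    size_t remainder = items % local;
    return remainder == 0 ? items : items + (local - remainder);
}
```

For example, 1000 work-items with a local size of 64 would be launched as 1024, with the last 24 work-items skipped by the kernel's bounds check.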