5 Replies Latest reply on Apr 24, 2013 11:44 PM by himanshu.gautam

    ulong16 not implemented fully in libamdocl64??

    kd2

      The following code crashes inside the clBuildProgram() call. Specifically, gdb reports a SIGSEGV in

         SCRegSpill::CreateSplitReload(SCInst*, int, unsigned short, SCBlock*, bitset*, bitset*) () from /usr/lib64/libamdocl64.so


      I'm using AMD-APP-SDK-v2.8-lnx64.tgz with Linux kernel 3.4.4 and Catalyst 13.1 on a CL_DEVICE_NAME=Tahiti card (if it matters, CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG=1).

       

      I seem to be able to get along just fine coding with ulong8s. It's just when I try to jump to using ulong16s that I eventually run into trouble like this.

      Should ulong16s just be avoided for the time being?


       

      -----------------------------------------------------------------------------------------------------------------------

      #include <CL/cl.h>
      #include <stdio.h>
      #include <string.h>

      const char *src = "\n\
      __kernel void kernel_0(__global ulong16 *in, __global int *out, uint N) \n\
      { \n\
          ulong16 A = in[0];          \n\
          ulong16 B0 = (ulong16)0;    \n\
          ulong16 B1 = (ulong16)0;    \n\
          ulong16 B2 = (ulong16)0;    \n\
          ulong16 B3 = (ulong16)0;    \n\
          for(uint i=0; i < N; i++) { \n\
              B0 += A << 0;           \n\
              B1 += A << 1;           \n\
              B2 += A << 2;           \n\
              B3 += A << 3;           \n\
          }                           \n\
          ulong16 C0 = B0 + B1;       \n\
          ulong16 C1 = B2 + B3;       \n\
          double16 D0 = as_double16(C0);  \n\
          double16 D1 = as_double16(C1);  \n\
          long16 D = isgreater(D0,D1);        \n\
          out[4] = any(D);            \n\
      }\n";

      int main(int argc, char *argv[])
      {
          cl_platform_id platform;
          cl_device_id dev;
          cl_uint platforms, devs;
          clGetPlatformIDs(1,&platform,&platforms);
          clGetDeviceIDs(platform,CL_DEVICE_TYPE_ALL,1,&dev,&devs);
          cl_context_properties properties[] = {CL_CONTEXT_PLATFORM,(cl_context_properties)platform,0};
          cl_int err;
          cl_context ctx = clCreateContext(properties,1,&dev,NULL,NULL,&err);
          size_t src_sz(::strlen(src));
          cl_program prog = clCreateProgramWithSource(ctx,1,&src,&src_sz,&err);
          clBuildProgram(prog,0,NULL,"",NULL,NULL);
      }

        • Re: ulong16 not implemented fully in libamdocl64??
          himanshu.gautam

          Hi kd2,

          I could reproduce the crash with the 13.1 driver, but clBuildProgram passed without any error when tested on a Linux machine with an internal driver.

          Testcase reproduced on: Tahiti, 13.1 driver, Linux 64-bit (Ubuntu 12.04)

          Issue not found on: Cape Verde GPU, internal driver, Linux 64-bit (Ubuntu 12.04)

          So hopefully this issue will no longer exist in the next driver. Thanks for reporting it.

            • Re: ulong16 not implemented fully in libamdocl64??
              kd2

              Thanks, but can you also test this second kernel? The 13.3 beta driver builds the first kernel fine, so your internal driver may not be much different from it. Small tweaks to the kernel still crash the compiler: the following kernel crashes clBuildProgram() in the current 13.3 beta.

               

              const char *src = "\n\
              __kernel void kernel_0(__global ulong16 *in, __global int *out, uint N) { \n\
                  ulong16 A = in[0],                      \n\
                      B0 = (ulong16)0, B1 = (ulong16)0,   \n\
                      B2 = (ulong16)0, B3 = (ulong16)0,   \n\
                      B4 = (ulong16)0, B5 = (ulong16)0,   \n\
                      B6 = (ulong16)0, B7 = (ulong16)0;   \n\
                  for(uint i=0; i < N; i++) {             \n\
                      B0 += A << 0; B1 += A << 1;         \n\
                      B2 += A << 2; B3 += A << 3;         \n\
                      B4 += A << 4; B5 += A << 5;         \n\
                      B6 += A << 6; B7 += A << 7;         \n\
                  }                                       \n\
                  out[4] = any(B0+B1+B2+B3==B4+B5+B6+B7); \n\
              }\n";