miktis
Journeyman III

OpenCL compilation for CPU crashes.

I have code that produces correct results when I test it on my Radeon HD7870.
It also compiles for GPUs in KernelAnalyser2.
The kernel myTask is run as a task (clEnqueueTask).
The code computes y=L*R*x, with L and R being sparse matrices.
The code is not efficient, but that is not my point here.

I use AMD APP 2.8 and CAT 13.1 on Windows 7 Ultimate 64-bit with an i7-3820.
When I select the CPU as the device, the compiler crashes inside clBuildProgram (in amdocl64.dll).
KernelAnalyser2 also crashes.

I have made some observations with KernelAnalyser2:
      1. The code compiles if I comment out myKernel.
      2. The code compiles if I uncomment the line i = get_global_id(0)
      3. The code compiles if I don't use the second for loop (for(i = 0; i < rows; i++)),
          even though results are not correct.
The same observations apply when I compile using my own software.
Could anyone try to reproduce my findings with KernelAnalyser2?

The code is as follows:
__kernel void myTask(__global float *R, __global int *RIdx, __global int *RPtr,
      __global float *L, __global int *LIdx, __global int *LPtr,
      __global float *x, __global float *y, int rows, int cols)
{
      __local float z[1024];
      int i = 0;
      //i = get_global_id(0);
      // z = R * x
      for(; i < cols; i++)
      {
           float acc = 0.0f;
           for (int j = RPtr[i]; j < RPtr[i+1]; j++)
           {
                acc += R[j] * x[RIdx[j]];
           }
           z[i] = acc;
      }
      // y = L * z
      for(i = 0; i < rows; i++)
      {
           float acc = 0.0f;
           for (int j = LPtr[i]; j < LPtr[i+1]; j++)
           {
                acc += L[j] * z[LIdx[j]];
           }
           y[i] = acc;
      }
}
__kernel void myKernel(__global float *a, __global float *b, __global float *c)
{
      int i = get_global_id(0);
      a[i] = b[i] + c[i];
}
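
For anyone who wants to check the kernel's output on the host, here is a minimal plain C sketch of what myTask appears to compute. It assumes R and L are stored in CSR form (RPtr/LPtr as row pointers, RIdx/LIdx as column indices), which is inferred from the loops above rather than stated anywhere. The function names csr_matvec and lrx_reference and the caller-provided scratch buffer z are hypothetical, introduced only for this illustration.

```c
/* Assumption: CSR storage. ptr[i]..ptr[i+1]-1 index the nonzeros of row i,
 * val[j] is the value and idx[j] the column of nonzero j. */
static void csr_matvec(const float *val, const int *idx, const int *ptr,
                       const float *x, float *y, int n_rows)
{
    for (int i = 0; i < n_rows; i++) {
        float acc = 0.0f;
        for (int j = ptr[i]; j < ptr[i + 1]; j++)
            acc += val[j] * x[idx[j]];
        y[i] = acc;
    }
}

/* Hypothetical host reference for myTask: y = L * (R * x).
 * As in the kernel, R has `cols` rows and L has `rows` rows;
 * z is scratch space with at least `cols` entries. */
void lrx_reference(const float *R, const int *RIdx, const int *RPtr,
                   const float *L, const int *LIdx, const int *LPtr,
                   const float *x, float *y, float *z, int rows, int cols)
{
    csr_matvec(R, RIdx, RPtr, x, z, cols);  /* z = R * x */
    csr_matvec(L, LIdx, LPtr, z, y, rows);  /* y = L * z */
}
```

Comparing y from this sketch against the GPU result should agree up to floating-point rounding, since the accumulation order is the same as in the kernel.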
2 Replies
himanshu_gautam
Grandmaster

Will look into this. Thanks for reporting.

himanshu_gautam
Grandmaster

Hi miktis,

I was able to reproduce the issues you reported. I will ask someone relevant to work on them.

That said, I would like to point out that the above kernels have a serious bug: multiple work-items write to the same locations. I understand this kernel may only be meant to showcase the CodeXL bug, but be careful if you are actually using it and wondering about correctness failures.

I would recommend posting similar issues in the CodeXL section.
