
Private memory corruption?

Question asked by yoyo on Dec 26, 2013
Latest reply on Mar 4, 2014 by prao



I'm using version 13.9 of the Catalyst drivers on Linux.

I have an OpenCL program that was tested on an Nvidia GPU and works well there, but it behaves strangely when I run it on a Radeon HD 7970. Specifically, when I pass an argument by value to a kernel that uses a private array, the argument's private copy becomes corrupted at some point. I can reproduce it with the following simple code:


#include <iostream>
#include <string>
#include <CL/cl.hpp>

using std::cout;
using std::endl;
using std::string;

typedef struct {
  double x[2];
} DDouble;

int main()
{
  VECTOR_CLASS<cl::Platform> platforms;
  cl::Platform::get(&platforms);
  cl::Platform pl = platforms[0];

  VECTOR_CLASS<cl::Device> devices;
  pl.getDevices(CL_DEVICE_TYPE_ALL, &devices);
  cl::Device dev = devices[0];

  string tmp;
  pl.getInfo(CL_PLATFORM_NAME, &tmp);
  cout << "Platform: " << tmp << endl;

  dev.getInfo(CL_DEVICE_NAME, &tmp);
  cout << "Device: " << tmp << endl;

  cl::Context context = cl::Context(VECTOR_CLASS<cl::Device>(1, dev));
  cl::CommandQueue queue = cl::CommandQueue(context, dev);

  string src_string =
    "#pragma OPENCL EXTENSION cl_khr_fp64 : enable \n"
    "typedef struct { double x[2]; } DDouble;      \n"
    "                                              \n"
    "#define N 12                                  \n"
    "__kernel void A(__global double* p,           \n"
    "                DDouble dd)                   \n"
    "{                                             \n"
    "  double tmp[N];                              \n"
    "                                              \n"
    "  tmp[0] = 3.14159;                           \n"
    "  for(int i = 1; i < N; i++)                  \n"
    "     tmp[i] = tmp[i-1]+1.0/tmp[i-1];          \n"
    "                                              \n"
    "  p[0] = tmp[N-1];                            \n"
    "  p[1] = dd.x[0];                             \n"
    "}                                             \n";

  cl::Program::Sources src(1, std::make_pair(src_string.c_str(),
                                             src_string.size()));
  cl::Program program(context, src);
  program.build(VECTOR_CLASS<cl::Device>(1, dev));

  string buildLog;
  program.getBuildInfo(dev, CL_PROGRAM_BUILD_LOG, &buildLog);
  cout << "Build log:" << endl
       << " ******************** " << endl
       << buildLog << endl
       << " ******************** " << endl;

  cl::Kernel kernel(program, "A");

  DDouble dd;
  dd.x[0] = 1.2345;
  dd.x[1] = 5.4321;

  cl::Buffer buff(context, CL_MEM_READ_WRITE, 2*sizeof(double));

  kernel.setArg(0, buff);
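  kernel.setArg(1, dd);

  // Run the kernel as a single work-item (launch form assumed).
  queue.enqueueTask(kernel);

  double* map = (double*)queue.enqueueMapBuffer(buff, CL_TRUE,
                                                CL_MAP_READ, 0,
                                                2*sizeof(double));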
  cout << "p[0] = " << map[0] << endl
       << "p[1] = " << map[1] << endl;
  queue.enqueueUnmapMemObject(buff, map);

  return 0;
}



Platform: AMD Accelerated Parallel Processing
Device: Tahiti
Build log:
"/tmp/", line 1: warning: OpenCL extension is now part of core
  #pragma OPENCL EXTENSION cl_khr_fp64 : enable

p[0] = 5.69946
p[1] = 5.13595


I was, of course, expecting to get p[1] = 1.2345. With N=11 the result is correct, however.
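To see what the private array should contain, the kernel's recurrence can be evaluated on the host. The following is only a minimal sketch for cross-checking the numbers above; the helper name reference_tmp is hypothetical and not part of the original program.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not in the original program): evaluates the
// kernel's recurrence on the host in double precision.
//   tmp[0] = 3.14159
//   tmp[i] = tmp[i-1] + 1.0/tmp[i-1]
std::vector<double> reference_tmp(std::size_t n)
{
  std::vector<double> tmp(n);
  if (n == 0)
    return tmp;

  tmp[0] = 3.14159;
  for (std::size_t i = 1; i < n; i++)
    tmp[i] = tmp[i-1] + 1.0/tmp[i-1];

  return tmp;
}
```

Printing reference_tmp(12) with the default cout precision gives tmp[11] = 5.69946, which matches p[0], and tmp[8] = 5.13595, which is the unexpected value read back in p[1]. This suggests, though it does not prove, that dd.x[0] is being overwritten by an element of the private array.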


My questions are:

1. Is this a compiler bug or intended behaviour?

2. Is there a way to make the compiler issue a warning when it is going to produce code that overwrites data that should remain unchanged? I can rewrite some of my kernels to use more local memory instead of private memory, but I want to be sure that if a kernel compiles without warnings it will run correctly.