OpenCL

zypo
Journeyman III

Device queues

I don't mean to hijack your thread, but I'm having problems with clCreateCommandQueueWithProperties. Any time I pass ANY properties through, the function returns CL_INVALID_QUEUE_PROPERTIES. I would really like to turn on out-of-order queues.


I am using a Radeon 290.

I'm using the 3.0-0 Beta SDK

I've tried all the newest drivers including beta.

I am also running into kernel errors that aren't supposed to occur in OpenCL 2.0; for example, the compiler doesn't know what the ndrange_1D(...) function is. This might be related.

Any help would be greatly appreciated!

15 Replies
dipak
Staff

Re: Device queues

Please ensure that you are passing the right arguments to the clCreateCommandQueueWithProperties API. To check the usage, you can look at the OpenCL 2.0 samples in APP SDK 3.0 Beta or refer to the clCreateCommandQueueWithProperties documentation.

For example,


// A host command queue
cl_queue_properties props[] = {CL_QUEUE_PROPERTIES, CL_QUEUE_PROFILING_ENABLE, 0};
hostCommandQueue = clCreateCommandQueueWithProperties(context, deviceId, props, &status);

// A device command queue
cl_queue_properties prop[] = {CL_QUEUE_PROPERTIES, CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE | CL_QUEUE_ON_DEVICE | CL_QUEUE_ON_DEVICE_DEFAULT, CL_QUEUE_SIZE, maxQueueSize, 0};
deviceCommandQueue = clCreateCommandQueueWithProperties(context, deviceId, prop, &status);
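Since CL_INVALID_QUEUE_PROPERTIES is often a symptom of a malformed property list, here is a rough host-side sanity check, offered as a sketch rather than anything from the SDK. It encodes the OpenCL 2.0 rules as I understand them: the list is a zero-terminated sequence of name/value pairs, CL_QUEUE_ON_DEVICE requires CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, CL_QUEUE_ON_DEVICE_DEFAULT requires CL_QUEUE_ON_DEVICE, and CL_QUEUE_SIZE is only valid for device queues. The helper name check_queue_props, the Q_* macros and the return codes are made up for illustration; the numeric values are copied from the Khronos cl.h header so the snippet compiles without the SDK. Only the core 2.0 property names are recognized, so extension properties would be reported as unknown.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the real definitions in <CL/cl.h> (assumption: values match
   the Khronos header). In real host code, include <CL/cl.h> instead. */
typedef uint64_t queue_prop_t;                       /* cl_queue_properties */
#define Q_PROPERTIES        ((queue_prop_t)0x1093)   /* CL_QUEUE_PROPERTIES */
#define Q_SIZE              ((queue_prop_t)0x1094)   /* CL_QUEUE_SIZE */
#define Q_OUT_OF_ORDER      (((queue_prop_t)1) << 0) /* CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE */
#define Q_PROFILING         (((queue_prop_t)1) << 1) /* CL_QUEUE_PROFILING_ENABLE */
#define Q_ON_DEVICE         (((queue_prop_t)1) << 2) /* CL_QUEUE_ON_DEVICE */
#define Q_ON_DEVICE_DEFAULT (((queue_prop_t)1) << 3) /* CL_QUEUE_ON_DEVICE_DEFAULT */

/* Hypothetical helper: returns 0 if the zero-terminated name/value list looks
   valid, otherwise a small positive code identifying the first broken rule. */
int check_queue_props(const queue_prop_t *props)
{
    const queue_prop_t known =
        Q_OUT_OF_ORDER | Q_PROFILING | Q_ON_DEVICE | Q_ON_DEVICE_DEFAULT;
    queue_prop_t flags = 0;
    int has_size = 0;

    if (props == NULL)
        return 0;                 /* NULL requests a default in-order host queue */

    /* Walk the list two entries at a time: name, then value. */
    for (size_t i = 0; props[i] != 0; i += 2) {
        if (props[i] == Q_PROPERTIES)
            flags = props[i + 1];
        else if (props[i] == Q_SIZE)
            has_size = 1;
        else
            return 1;             /* not a property name (e.g. a bare bitfield) */
    }

    if (flags & ~known)
        return 2;                 /* unknown bits in the bitfield */
    if ((flags & Q_ON_DEVICE) && !(flags & Q_OUT_OF_ORDER))
        return 3;                 /* device queues must be out-of-order */
    if ((flags & Q_ON_DEVICE_DEFAULT) && !(flags & Q_ON_DEVICE))
        return 4;                 /* "default" only applies to device queues */
    if (has_size && !(flags & Q_ON_DEVICE))
        return 5;                 /* queue size only applies to device queues */
    return 0;
}
```

Code 1 covers one plausible mistake: passing the bitfield directly, as one would with the older clCreateCommandQueue, instead of a name/value list that starts with CL_QUEUE_PROPERTIES.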




Now, coming to your kernel-related problem. As per the clBuildProgram API, applications must specify the -cl-std=CL2.0 option to compile or build their programs with OpenCL C 2.0. If the -cl-std build option is not specified, the highest OpenCL C 1.x language version supported by each device is used when compiling the program for that device. So please ensure that you have used that flag correctly.
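To make the build-option point concrete, here is a small sketch of how a host program might assemble the options string that is later passed as the options argument of clBuildProgram. The helper name make_build_options and its return convention are hypothetical; the only thing taken from the post above is the -cl-std=CL2.0 flag itself. Note that the flag starts with a plain ASCII hyphen.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies user_opts into out and appends "-cl-std=CL2.0"
   unless the caller already selected a -cl-std value.
   Returns 0 on success, -1 if out is too small to hold the result. */
int make_build_options(const char *user_opts, char *out, size_t out_size)
{
    const char *std_flag = "-cl-std=CL2.0";
    int n;

    if (user_opts == NULL)
        user_opts = "";

    if (strstr(user_opts, "-cl-std=") != NULL)
        n = snprintf(out, out_size, "%s", user_opts);            /* respect caller's choice */
    else if (user_opts[0] == '\0')
        n = snprintf(out, out_size, "%s", std_flag);             /* flag on its own */
    else
        n = snprintf(out, out_size, "%s %s", user_opts, std_flag);

    /* snprintf returns the length it wanted to write; detect truncation. */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

For example, calling it with "-g" as the existing options yields "-g -cl-std=CL2.0", which is the string a program compiled for OpenCL C 2.0 with debug information would hand to clBuildProgram.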

If you still face the problem, please provide more details such as the OS, the exact driver version, the type of command queue, etc. Sample code would be very helpful for us.

Regards,

zypo
Journeyman III

Re: Device queues

Thanks Dipak... both of your recommendations worked and got things started. I have another issue involving enqueue_kernel.

There are 2 blocks of code below. They are supposed to do the same thing, but they don't. The iterative method works, whereas the parallel enqueued method doesn't fully work (I can still see that SOME of the kernel instances are running via flickers on the screen, but different kernel instances show up in each frame).

I'm thinking it may have to do with a limitation on the device queue, but I'm a noob so what do I know.

Maybe there is an OpenCLism that I don't know about that may help get these kernels working.

In my C++ host code, I am enqueuing via the hostCommandQueue, but in order to prevent a kernel crash, I also create a default device queue (from C++) that just sits there doing nothing.

Here is my kernel code:

  // Parallel method that doesn't work.
  enqueue_kernel(get_default_queue(), CLK_ENQUEUE_FLAGS_WAIT_WORK_GROUP, ndrange_1D(numInstances),
  ^{
      int gid = get_global_id(0);
      runKernel(gid, var1, var2, var3);
  });

  // Iterative method that works just like it should.
  for (int index = 0; index < numInstances; index++)
      runKernel(index, var1, var2, var3);

Thanks!

dipak
Staff

Re: Device queues

It is difficult to follow from your post what your code is trying to do. It would be helpful if you could clarify it.

zypo
Journeyman III

Re: Device queues

This is my kernel. I know it is not optimized and doesn't really do anything, but it illustrates my problem in a small amount of code. After the kernel runs, I display the buffer on the screen. It should be solid white (as it is with the iterative method), but when I use the device enqueue, I get a bunch of random white lines, which is not supposed to happen.

Thanks!

void plotPoint(uchar* graph, int bufferX, int bufferY, unsigned int bufferWidth, unsigned int bufferHeight)
{
  // Convert the coordinates to a memory location and write the color to the screen.
  // Reject anything outside the buffer (>= because the last valid index is size - 1).
  if (bufferX < 0 || bufferY < 0 || bufferX >= bufferWidth || bufferY >= bufferHeight)
    return;

  long graphIndex = (bufferY * bufferWidth + bufferX) * 4;
  for (int index = 0; index < 4; index++)
    graph[graphIndex + index] = 255;
}

// graph is a buffer of size 4*bufferWidth*bufferHeight.
// There are bufferWidth*bufferHeight instances running.
__kernel void testKernel(__global uchar *graph, uint bufferWidth, uint bufferHeight)
{
  // thread index and total
  int gid = get_global_id(0);
  int x = gid % bufferWidth;
  int y = gid / bufferWidth;

  if (x == 0) // only allow 1/bufferWidth of the work-items to run
  {
    // this doesn't work
    enqueue_kernel(get_default_queue(), CLK_ENQUEUE_FLAGS_WAIT_WORK_GROUP, ndrange_1D(bufferWidth),
    ^{
      int gid = get_global_id(0);
      plotPoint(graph, gid, y, bufferWidth, bufferHeight);
    });

    // this does
    //for (int index = 0; index < bufferWidth; index++)
    //  plotPoint(graph, index, y, bufferWidth, bufferHeight);
  }
}
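One way to turn "random white lines" into a number is to compare the buffer read back from the device against a CPU reference. The sketch below is plain C with no OpenCL dependency. It mirrors the iterative path of the kernel above (one parent per row plotting every column) and counts the pixels that differ. The function names are hypothetical, and the bounds check uses >= so that indices stay inside the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Host-side equivalent of plotPoint: writes one white RGBA pixel. */
static void plot_point_ref(unsigned char *graph, int x, int y,
                           unsigned int width, unsigned int height)
{
    if (x < 0 || y < 0 || (unsigned int)x >= width || (unsigned int)y >= height)
        return;
    size_t idx = ((size_t)y * width + (size_t)x) * 4;
    for (int c = 0; c < 4; c++)
        graph[idx + c] = 255;
}

/* Hypothetical helper: fills `expected` (4*width*height bytes) the way the
   iterative kernel path would, i.e. each row's x == 0 parent plots every
   column of that row. */
void reference_fill(unsigned char *expected, unsigned int width, unsigned int height)
{
    memset(expected, 0, (size_t)width * height * 4);
    for (unsigned int y = 0; y < height; y++)
        for (unsigned int x = 0; x < width; x++)
            plot_point_ref(expected, (int)x, (int)y, width, height);
}

/* Counts pixels (4-byte groups) where the device buffer differs from the
   reference; 0 means the enqueued children wrote everything they should. */
size_t count_mismatched_pixels(const unsigned char *device,
                               const unsigned char *expected,
                               unsigned int width, unsigned int height)
{
    size_t bad = 0;
    for (size_t p = 0; p < (size_t)width * height; p++)
        if (memcmp(device + p * 4, expected + p * 4, 4) != 0)
            bad++;
    return bad;
}
```

In a real test, `device` would be the result of reading the graph buffer back after the kernel finishes. A nonzero count that changes from run to run would be consistent with child kernels not being launched reliably.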

dipak
Staff

Re: Device queues

As I checked with a wrapper code, your kernel code with enqueue_kernel worked fine on my Kaveri machine. I've attached the sample wrapper code here. Please check it at your end. If you still face the problem, please provide your host code and share your setup details.

Regards,

zypo
Journeyman III

Re: Device queues

Dipak,

First and foremost, I REALLY appreciate your help with this issue of mine. I see that you went out of your way to create test files for me, and it does not go unnoticed... so thank you very much!

On the other hand, I compiled and ran your software, and your code gives me ##########FAILED########, so I now assume that it is not my code giving the issue.  I'm thinking about re-installing drivers (again). 

I'm using a discrete R9 290 on Windows 7 with the 3.0 beta OpenCL install. I do have an A10-5700 CPU, but it won't run OpenCL 2.0 code, so my GPU has to work.

I looked more into my kernel and the documentation and noticed that supplying '-g' in the build options for your kernel is supposed to give more verbose errors. enqueue_kernel is returning CLK_ENQUEUE_FAILURE, but even with the -g option, it does not give me any more information.

Other than driver changes, do you know of anything else I can try?

Thanks!

dipak
Staff

Re: Device queues

Thanks for your appreciation. We always try our best to provide support to our users and customers.

I don't know whether it's a driver- or hardware-specific issue. I'll try the same on an R9 290 card and share my observations with you.

Regards,

dipak
Staff

Re: Device queues

This seems to be a driver issue. Last time, I used an internal driver package (a higher version than the public one), where it worked fine. When I tried the public Catalyst driver (15.4 or 14.502) on an R9 290X card, I was able to reproduce the issue. I'll check further and get back to you shortly.

Regards,

zypo
Journeyman III

Re: Device queues

So this is good news! I'll be standing by.
