
mineral007
Journeyman III

Concurrent execution on APU

Hi folks,

    I am using my A8-3850 for OpenCL programming. The GPU seems to return control to the CPU right away if clFinish(command_queue) is not called after clEnqueueNDRangeKernel(). I am wondering whether the CPU part of the APU can work concurrently with a kernel running on the 6550D of the same APU.

   Thanks in advance.

0 Likes
7 Replies

Hi,

I don't think there should be any problem with that. As far as I know, current APUs have separate areas of RAM reserved for the CPU and the GPU, and a copy is made when buffers are transferred from the CPU's RAM area to the GPU's.

It would be interesting to know what problem you are working on. I am especially interested in how you are doing load balancing.

0 Likes

Suppose I have GPU kernelA (0.4 sec) and CPU functionB (0.3sec), I tried:

(1) clEnqueueNDRangeKernel(kernelA);

     functionB();

     total time: 0.3 sec                                //actually, kernelA didn't execute

(2) clEnqueueNDRangeKernel(kernelA);

     clFinish(command_queue);

     functionB();

     total time: 0.7 sec

(3) clEnqueueNDRangeKernel(kernelA);

     clEnqueueReadBuffer();                                     // as nou mentioned, I think this does an implicit clFlush()

     functionB();

     total time: 0.7 sec                 // ignored short readbuffer time

Whether or not functionB accesses only its own private memory, the three cases behave the same. I would like to know whether the APU allows its CPU and GPU to execute concurrently, which would mean I could get 0.3 sec.
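
As a rough sanity check on the numbers in this post, here is a minimal sketch of an idealized timing model. It assumes perfect overlap with no launch, flush, or transfer overhead, and the function names serial_time and overlapped_time are made up for illustration. Under those assumptions the best case for a 0.4 sec kernel and a 0.3 sec CPU function would be the longer of the two (0.4 sec), not 0.3 sec.

```c
#include <assert.h>

/* Idealized estimate when the host waits for the kernel before running
   its own work (cases 2 and 3 above): the two durations add up. */
double serial_time(double kernel_s, double cpu_s)
{
    assert(kernel_s >= 0.0 && cpu_s >= 0.0);
    return kernel_s + cpu_s;
}

/* Idealized estimate when the kernel and the CPU function overlap
   completely: the longer of the two determines the total. */
double overlapped_time(double kernel_s, double cpu_s)
{
    assert(kernel_s >= 0.0 && cpu_s >= 0.0);
    return kernel_s > cpu_s ? kernel_s : cpu_s;
}
```

With the figures from the post, serial_time(0.4, 0.3) gives about 0.7 and overlapped_time(0.4, 0.3) gives 0.4. Real measurements would likely come out somewhat higher because of the overheads this model ignores.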

0 Likes

clEnqueueNDRangeKernel(kernelA);

clFlush(command_queue);//start execution of kernel

functionB();//your code on CPU

clFinish(command_queue);//ensure that kernel finished or wait for it.

0 Likes
nou
Exemplar

AMD's OpenCL implementation is lazy. That means it doesn't start executing an enqueued operation until you call clFlush() or another function that calls it implicitly.

Are you really sure?

I never use clFlush or clFinish. In my case, I read the data back from the GPU with a blocking read once the calculation is finished, so the system waits until the kernel has finished and the read is done.

So if you use only one queue, there is no need for the two functions, as far as I can tell.

0 Likes

A blocking operation triggers a flush anyway.
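
To make the flush discussion concrete, here is a toy model in plain C of the lazy behavior described in this thread. It is only a sketch of the semantics, not the AMD runtime and not OpenCL API: toy_queue and the toy_* functions are hypothetical names, and a real implementation may submit work earlier than this model does.

```c
#include <assert.h>

/* Toy model only: counts commands in each stage of an in-order queue. */
typedef struct {
    int pending;    /* enqueued, not yet submitted to the device */
    int submitted;  /* handed to the device, possibly still running */
    int completed;  /* finished */
} toy_queue;

/* Like an enqueue call: the command is only recorded. */
void toy_enqueue(toy_queue *q)
{
    q->pending++;
}

/* Like clFlush(): everything recorded so far is submitted; does not wait. */
void toy_flush(toy_queue *q)
{
    q->submitted += q->pending;
    q->pending = 0;
}

/* Like clFinish(): submits anything still pending, then waits for all of it. */
void toy_finish(toy_queue *q)
{
    toy_flush(q);
    q->completed += q->submitted;
    q->submitted = 0;
}

/* Like a blocking read on an in-order queue: the read is enqueued,
   flushed implicitly, and the call returns only when it is done. */
void toy_blocking_read(toy_queue *q)
{
    toy_enqueue(q);
    toy_finish(q);
}
```

In this model, a kernel that is enqueued but never flushed stays in pending, which matches case (1) earlier in the thread, while a blocking read after the kernel drains the queue without an explicit clFlush or clFinish.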

0 Likes
paladice
Journeyman III

I saw this question before when I was searching for a good library, "cassoulet.h". I think you can find it easily with Google or Yahoo, but 😕 I can't find it now. Good luck, and don't forget "cassoulet.h" or "ravioli.h".

I don't remember. Good luck.

0 Likes