2 Replies Latest reply on Jun 3, 2013 12:31 AM by himanshu.gautam

    Handling event objects for synchronization points

    studintern29

      Hello everyone, I'm quite new to OpenCL and there are some things I don't understand. I hope you can make them clearer.

      I would like to use event objects as synchronization points in my program. However, I am having a hard time avoiding memory leaks.

      In the function cl_int clEnqueueNDRangeKernel (cl_command_queue command_queue, cl_kernel kernel, cl_uint work_dim, const size_t *global_work_offset, const size_t *global_work_size, const size_t *local_work_size, cl_uint num_events_in_wait_list, const cl_event *event_wait_list, cl_event *event), I think the event object event should increment its status when the function is called and decrement it after the command has completed. Am I right?

       

      First, I want kernel2 to wait for kernel1, so this is how I see it:

      clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);

      clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event2);

      In this case, how will event1 and event2 behave? Is it dangerous to generate event2 even though I'm not sure it will be used?

       

      Now, I would like to call my first two kernels again, and each should start as soon as the previous one is done. I mean that when kernel1 is done, kernel2 should start and, at the same time, another kernel1. And when kernel2 is done, kernel2 should be called again. This is how I see it, but I don't think it's right.

      clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);

      clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event2);

      clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event1);

      clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event2, &event2);

       

      Can I do things like that, or do you have something better in mind? Do I need to call clReleaseEvent() to free the memory?

      I also thought about this one, but I still have no idea whether it's better:

      clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);

      wait_event1=event1;

      clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event2);

      wait_event2=event2;

      clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  1, &wait_event1, &event1);

      clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &wait_event2, &event2);

       

      Now imagine that we would like to call the same kernel many times, and each launch should wait for the previous one. How should we do it without memory leaks?

      Is it something like this:

       

      clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);

      for(i = 0; i < 1000; i++)

      {

           clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event1);

      }

      or maybe this one:

      clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);

      for(i = 0; i < 1000; i++)

      {

           wait_event1 = event1;

           clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  1, &wait_event1, &event1);

      }

       

      Or do I also need to add some calls to release the events, and if so, how can I do that while being certain that the kernel has completed?

       

      Thank you for your consideration. I'm sure I am not the only one who has difficulties with event objects and memory leaks.

        • Re: Handling event objects for synchronization points
          nou

          You can release an event right after you enqueue the command that uses it. Also, unless you created an out-of-order queue (which is not supported at the moment), all kernel executions and data transfers are implicitly synchronized. That means kernel1, enqueued before kernel2, will finish before kernel2 starts, and any earlier data transfer will also have finished.

          • Re: Handling event objects for synchronization points
            himanshu.gautam

            studintern29 wrote:

             

            Hello everyone, I'm quite new to OpenCL and there are some things I don't understand. I hope you can make them clearer.

            I would like to use event objects as synchronization points in my program. However, I am having a hard time avoiding memory leaks.

            In the function cl_int clEnqueueNDRangeKernel (cl_command_queue command_queue, cl_kernel kernel, cl_uint work_dim, const size_t *global_work_offset, const size_t *global_work_size, const size_t *local_work_size, cl_uint num_events_in_wait_list, const cl_event *event_wait_list, cl_event *event), I think the event object event should increment its status when the function is called and decrement it after the command has completed. Am I right?

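            It is not about incrementing or decrementing. The command associated with an event object is in one of four execution states: CL_QUEUED, CL_SUBMITTED, CL_RUNNING, or CL_COMPLETE. You can query the current state with clGetEventInfo and CL_EVENT_COMMAND_EXECUTION_STATUS. clGetEventProfilingInfo reports the timestamps of the matching transitions (queued, submit, start, end), provided the queue was created with profiling enabled. See the OpenCL 1.2 specification.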

             

            First, I want kernel2 to wait for kernel1, so this is how I see it:

            clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);

            clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event2);

            In this case, how will event1 and event2 behave? Is it dangerous to generate event2 even though I'm not sure it will be used?

            This is safe and correct. That is how it should be done. You can pass NULL instead of &event2 if you do not need it.

             

            Now, I would like to call my first two kernels again, and each should start as soon as the previous one is done. I mean that when kernel1 is done, kernel2 should start and, at the same time, another kernel1. And when kernel2 is done, kernel2 should be called again. This is how I see it, but I don't think it's right.

            clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);

            clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event2);

            clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event1);

            clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event2, &event2);

             

            Can I do things like that, or do you have something better in mind? Do I need to call clReleaseEvent() to free the memory?

            I also thought about this one, but I still have no idea whether it's better:

            clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);

            wait_event1=event1;

            clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event2);

            wait_event2=event2;

            clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  1, &wait_event1, &event1);

            clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &wait_event2, &event2);

            This is a bit scary. You wait on an event and then reuse the same variable for the event of a different command. It may work, but I would avoid it because it is confusing. Better to use different event objects: keep an array of cl_events.