3 Replies Latest reply on May 21, 2010 2:19 PM by LeeHowes

    Asynchronous transfers on OpenCL CPU devices

    hsyl20
      How to make asynchronous transfers asynchronous?

      Hi,

      When using the ATI OpenCL implementation on x86 CPUs, asynchronous data transfers are not actually asynchronous. That is, when you set the "blocking_*" parameter to CL_FALSE, you have to explicitly wait for the associated event (with clFinish or clWaitForEvents, for instance); you cannot poll using clGetEventInfo(CL_EVENT_COMMAND_EXECUTION_STATUS).

      I think the implementation should use a thread to perform data transfers asynchronously in order to be compliant with the OpenCL specification.

      Will this be corrected in a coming release?

      Thanks

      Sylvain

        • Asynchronous transfers on OpenCL CPU devices
          LeeHowes

          I don't think the specification says that the copy must be asynchronous, only that if the flag is set the other way (CL_TRUE), the copy must be complete by the time the call returns.

            • Asynchronous transfers on OpenCL CPU devices
              hsyl20

              Nowhere in the spec is it mentioned that we need to explicitly wait for each transfer to complete, either.

              I still think it is the implementation's role to make transfers progress. Otherwise, what would be the recommended solution? We can't use a new thread to wait for each transfer to complete, as that would be highly inefficient. Moreover, it would totally defeat the purpose of using OpenCL for concurrency and parallelism.

              IMO, the correct behavior would be for the OpenCL CPU implementation to have a single data transfer thread that would perform data transfers in sequence. (Btw, technologies like I/OAT QuickData could even be used if available).
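
              To make the idea concrete, here is a minimal sketch in plain C with POSIX threads of what a single transfer thread with pollable events could look like. It is not the ATI runtime's code and uses no OpenCL calls: every name in it (xfer_queue, xfer_event, xfer_enqueue, xfer_poll, and so on) is hypothetical, memcpy stands in for the actual transfer, and the fixed-size ring buffer is an arbitrary simplification.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; this only illustrates the design. */
enum { XFER_QUEUED, XFER_COMPLETE };
#define XFER_QUEUE_CAP 16

typedef struct {
    void *dst;
    const void *src;
    size_t size;
    int status;                           /* guarded by the queue lock */
} xfer_event;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t wake;
    xfer_event *pending[XFER_QUEUE_CAP];  /* ring buffer of queued transfers */
    size_t head, count;
    int shutting_down;
    pthread_t worker;
} xfer_queue;

/* The single transfer thread: performs queued copies one after another. */
static void *xfer_worker(void *arg)
{
    xfer_queue *q = arg;
    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->count == 0 && !q->shutting_down)
            pthread_cond_wait(&q->wake, &q->lock);
        if (q->count == 0)
            break;                        /* shutting down and fully drained */
        xfer_event *ev = q->pending[q->head];
        q->head = (q->head + 1) % XFER_QUEUE_CAP;
        q->count--;
        pthread_mutex_unlock(&q->lock);
        memcpy(ev->dst, ev->src, ev->size);  /* copy without holding the lock */
        pthread_mutex_lock(&q->lock);
        ev->status = XFER_COMPLETE;       /* published under the lock */
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Initializes the queue and starts the worker; returns 0 on success. */
int xfer_queue_start(xfer_queue *q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->wake, NULL);
    q->head = q->count = 0;
    q->shutting_down = 0;
    return pthread_create(&q->worker, NULL, xfer_worker, q);
}

/* Non-blocking enqueue: returns 0 at once, or -1 if the queue is full. */
int xfer_enqueue(xfer_queue *q, xfer_event *ev,
                 void *dst, const void *src, size_t size)
{
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < XFER_QUEUE_CAP && !q->shutting_down) {
        ev->dst = dst;
        ev->src = src;
        ev->size = size;
        ev->status = XFER_QUEUED;
        q->pending[(q->head + q->count) % XFER_QUEUE_CAP] = ev;
        q->count++;
        pthread_cond_signal(&q->wake);
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Polls the status of a transfer without waiting for it. */
int xfer_poll(xfer_queue *q, xfer_event *ev)
{
    pthread_mutex_lock(&q->lock);
    int status = ev->status;
    pthread_mutex_unlock(&q->lock);
    return status;
}

/* Lets the worker drain remaining transfers, then joins it. */
void xfer_queue_stop(xfer_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->shutting_down = 1;
    pthread_cond_signal(&q->wake);
    pthread_mutex_unlock(&q->lock);
    pthread_join(q->worker, NULL);
    pthread_cond_destroy(&q->wake);
    pthread_mutex_destroy(&q->lock);
}
```

              With something like this, the caller enqueues a transfer, keeps working, and checks xfer_poll from time to time; the copy progresses on the worker thread whether or not anyone waits on it, which is the behavior I would expect from a non-blocking enqueue.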

              Regards,

              Sylvain

                • Asynchronous transfers on OpenCL CPU devices
                  LeeHowes

                  Right, it might be a good thing to farm the transfers off to a separate thread, though I don't think it's definitive. My point was just that the current implementation doesn't fall short of spec compliance; at worst, it falls short of the intent of the spec.

                  I couldn't comment on OpenCL implementation decisions, though; that's not my area.