8 Replies Latest reply on Jul 14, 2011 9:29 AM by sambani

    What's faster: clEnqueueReadBuffer/clEnqueueMapBuffer ?

    bubu

      What's faster? To map a CL buffer and read from it or to call clEnqueueReadBuffer explicitly?

       

      thx

        • What's faster: clEnqueueReadBuffer/clEnqueueMapBuffer ?
          n0thing

          I have observed that mapping/unmapping is faster: about 2.5 GB/s, compared to about 1.5 GB/s for enqueue read/write.

           

          • What's faster: clEnqueueReadBuffer/clEnqueueMapBuffer ?
            sambani

             


            Mate, check out this page: click here

            Yes, you're missing the CL_TRUE in the clEnqueueWriteBuffer call. This makes the write operation blocking, which stalls the CPU while the copy is made. Using the host pointer, the OpenCL implementation can "optimize" the copy by making it asynchronous, so overall the performance is better...

             

             

             


              • What's faster: clEnqueueReadBuffer/clEnqueueMapBuffer ?
                jeff_golds

                 

                If you make the copy blocking, then making the copy asynchronous would not help as you have to wait for the copy to finish anyway.

                If you use clEnqueueRead/WriteBuffer with generic system memory as the host ptr, then the driver will have to pin that memory in order for the device to copy from/to it.  If you use clEnqueueMapBuffer, the driver can optimize the operation by using pre-pinned memory.

                Jeff

                  • What's faster: clEnqueueReadBuffer/clEnqueueMapBuffer ?
                    sambani

                     

                    Thanks Jeff, I didn't think about it that way, though...

                     

                     

