1 Reply Latest reply on Sep 18, 2011 3:02 AM by genaganna

    USWC and discrete GPU

    Raistmer
      Why are only APUs mentioned?

      USWC - Host memory from the Uncached Speculative Write Combine heap can be accessed by the GPU without causing CPU cache coherency traffic. Due to the uncached WC access path, CPU streamed writes are fast, while CPU reads are very slow. On Fusion devices, this memory provides the fastest possible route for CPU writes followed by GPU reads.


      And what will be fastest for a discrete GPU, provided the CPU can perform streamed writes to the data buffer?

        • USWC and discrete GPU
          genaganna

           

          Originally posted by: Raistmer
          USWC - Host memory from the Uncached Speculative Write Combine heap can be accessed by the GPU without causing CPU cache coherency traffic. Due to the uncached WC access path, CPU streamed writes are fast, while CPU reads are very slow. On Fusion devices, this memory provides the fastest possible route for CPU writes followed by GPU reads.
          And what will be fastest for a discrete GPU, provided the CPU can perform streamed writes to the data buffer?


          Best paths for discrete GPUs

          1. Read-only input buffers of the kernel should be created with CL_MEM_USE_PERSISTENT_MEM_AMD.

          2. Write-only output buffers of the kernel should be created with CL_MEM_ALLOC_HOST_PTR.

          Note that kernel execution time is increased if the output buffer is created with CL_MEM_ALLOC_HOST_PTR.
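
          A minimal sketch of how the two recommendations above might translate into buffer creation flags. The buffer_role enum and the discrete_gpu_buffer_flags helper are hypothetical names made up for illustration. The flag values are redefined here only so the snippet compiles without the OpenCL SDK; they are assumed to match CL/cl.h and AMD's CL/cl_ext.h, which a real program should include instead.

```c
#include <stdint.h>

/* Assumption: these values mirror <CL/cl.h> and AMD's <CL/cl_ext.h>.
   In a real program, include those headers rather than redefining. */
#ifndef CL_MEM_WRITE_ONLY
#define CL_MEM_WRITE_ONLY (1 << 1)
#endif
#ifndef CL_MEM_READ_ONLY
#define CL_MEM_READ_ONLY (1 << 2)
#endif
#ifndef CL_MEM_ALLOC_HOST_PTR
#define CL_MEM_ALLOC_HOST_PTR (1 << 4)
#endif
#ifndef CL_MEM_USE_PERSISTENT_MEM_AMD
#define CL_MEM_USE_PERSISTENT_MEM_AMD (1 << 6)
#endif

/* Hypothetical: how the kernel uses a given buffer. */
typedef enum { KERNEL_INPUT, KERNEL_OUTPUT } buffer_role;

/* Hypothetical helper: returns the creation flags suggested in this
   thread for a discrete GPU, as a cl_mem_flags-compatible bitfield. */
static uint64_t discrete_gpu_buffer_flags(buffer_role role)
{
    if (role == KERNEL_INPUT) {
        /* The kernel only reads: device-resident, CPU-visible memory. */
        return CL_MEM_READ_ONLY | CL_MEM_USE_PERSISTENT_MEM_AMD;
    }
    /* The kernel only writes: host-resident memory. */
    return CL_MEM_WRITE_ONLY | CL_MEM_ALLOC_HOST_PTR;
}
```

          The result would be passed as the flags argument, e.g. clCreateBuffer(context, flags, size, NULL, &err), with the CPU then filling or reading the buffer through clEnqueueMapBuffer. Whether the slower kernel noted above is outweighed by the saved copy depends on the workload, so it is worth timing both variants.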