2 Replies Latest reply on Jun 19, 2008 6:14 AM by delgadom

    Performance of reduction kernel

    delgadom

      I am benchmarking the AMD Stream SDK v1.1 on my Radeon HD2400 using the samples shipped with Brook+. Performance is awesome except for the tests that use reduction kernels. For example, setting

      BRT_RUNTIME=cpu

      and running the compiled example reduction.br as

      ./reduction -t -x 1024 -y 1024 -i 100

      results in an execution time of 0.024000s. However, with the same call after setting

      BRT_RUNTIME=cal

      the time increases to 1.218000s, and the CPU load is not negligible.
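
      To put the two timings side by side, here is a small sketch of the arithmetic. It assumes that -i 100 means 100 iterations and that the reported time covers all of them; I have not verified either point, so the per-iteration figures are only indicative.

```python
# Timings reported by the reduction sample, in seconds.
cpu_time_s = 0.024  # BRT_RUNTIME=cpu
cal_time_s = 1.218  # BRT_RUNTIME=cal

# Assumption: "-i 100" requests 100 iterations and the reported
# time is the total over all of them.
iterations = 100

# How many times slower the CAL backend is than the CPU backend.
slowdown = cal_time_s / cpu_time_s

# Average time per iteration in milliseconds (under the assumption above).
cpu_ms_per_iter = cpu_time_s / iterations * 1000.0
cal_ms_per_iter = cal_time_s / iterations * 1000.0

print(f"CAL vs CPU slowdown: {slowdown:.2f}x")
print(f"CPU backend: {cpu_ms_per_iter:.2f} ms per iteration")
print(f"CAL backend: {cal_ms_per_iter:.2f} ms per iteration")
```

      So the CAL backend is roughly 50 times slower than the CPU backend for this test.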

      Is the reduction buffer implemented for my Radeon? If so, do you have any idea how to improve the performance of the reduction?

      Carlos