2 Replies Latest reply on Apr 29, 2012 8:42 PM by laoshi

    gradual performance degradation of clEnqueueReadImage()

    laoshi

      Hello,

       

      I'm ray casting some fairly large volumes (examples from http://www.volvis.org) for medical applications.

       

      I'm generating a "DRR" (digitally reconstructed radiograph), that is, a digital X-ray of the objects obtained by integrating along each ray.

       

      The destination image is 1024 by 1024 pixels with 16-bit intensity.

       

      I rotate around the volume from 0 to 90 or 180 degrees.

       

      I notice, both in my own timings and in the AMD profiler, that the read of the 2D projected image slows down over time:

       

      For instance

      elapsed time = 363.08

      elapsed time = 359.41

      elapsed time = 353.417

      elapsed time = 359

      elapsed time = 365.704

      elapsed time = 364.322

      elapsed time = 364.974

      elapsed time = 364.162

      elapsed time = 360.681

      elapsed time = 358.457

      elapsed time = 366.828

      elapsed time = 366.772

      elapsed time = 369.182

      elapsed time = 365.62

      elapsed time = 371.733

      elapsed time = 373.001

      elapsed time = 373.148

      elapsed time = 373.764

      elapsed time = 376.527

      elapsed time = 378.569

      elapsed time = 379.176

      elapsed time = 380.202

      elapsed time = 383.006

      elapsed time = 384.96

      elapsed time = 385.866

      elapsed time = 387.969

      elapsed time = 386.392

      elapsed time = 386.947

      elapsed time = 395.193

      elapsed time = 398.699

      elapsed time = 400.417

      elapsed time = 403.66

      elapsed time = 406.639

      elapsed time = 410.729

      elapsed time = 414.783

      elapsed time = 417.085

      elapsed time = 420.945

      elapsed time = 424.348

      elapsed time = 430.082

      elapsed time = 434.506

      elapsed time = 438.68

      elapsed time = 445.329

      elapsed time = 441.212

      elapsed time = 446.529

       

      And this gets much slower as time goes on.
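To put a number on the drift, the readings above can be summarized with a short script. This is only a sketch: the values are copied from the list in this post, the post does not state their units, and the helper names (`relative_increase`, `least_squares_slope`) are made up for illustration.

```python
# Elapsed times copied from the post, in the order reported (units not stated there).
ELAPSED = [
    363.08, 359.41, 353.417, 359, 365.704, 364.322, 364.974, 364.162,
    360.681, 358.457, 366.828, 366.772, 369.182, 365.62, 371.733, 373.001,
    373.148, 373.764, 376.527, 378.569, 379.176, 380.202, 383.006, 384.96,
    385.866, 387.969, 386.392, 386.947, 395.193, 398.699, 400.417, 403.66,
    406.639, 410.729, 414.783, 417.085, 420.945, 424.348, 430.082, 434.506,
    438.68, 445.329, 441.212, 446.529,
]


def relative_increase(samples):
    """Fractional change from the first sample to the last one."""
    return samples[-1] / samples[0] - 1.0


def least_squares_slope(samples):
    """Slope of the best-fit line through (index, sample) pairs."""
    n = len(samples)
    mean_i = (n - 1) / 2
    mean_t = sum(samples) / n
    num = sum((i - mean_i) * (t - mean_t) for i, t in enumerate(samples))
    den = sum((i - mean_i) ** 2 for i in range(n))
    return num / den


if __name__ == "__main__":
    half = len(ELAPSED) // 2
    print("relative increase:", relative_increase(ELAPSED))
    print("slope, first half:", least_squares_slope(ELAPSED[:half]))
    print("slope, second half:", least_squares_slope(ELAPSED[half:]))
```

Comparing the slope of the first and second halves shows whether the slowdown is steady or accelerating, which is useful to know before looking for a cause.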

       

      I don't see any "hints" from the profiler as to what the problem may be. Has anyone seen this type of problem, and does anyone know what the cause might be?

       

      Thanks,

       

      Rick

        • Re: gradual performance degradation of clEnqueueReadImage()
          drallan

          Hi Rick,

           

          I often see a similar effect when projecting a 3D volume onto a 2D image, where I sum through the volume to get each 2D pixel.

           

          The time variation (at least mine) occurs while rotating through different angles. As you change angles, the step between sequential 3D volume memory addresses slowly changes from small to large (or vice versa). This can affect the GPU's memory cache, the access pattern of the GPU's memory controllers, or both, and so varies the time it takes to access memory.

           

          There is no way around this, but you may be able to optimize a bit. If you consider a 3D volume of dimensions DX, DY, and DZ, the memory step can go from 1 to DX along one rotation axis, or maybe from DX to DX*DY along another rotation axis, etc. Depending on the problem, you might be able to choose a better set of angles.
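The stepping described above can be illustrated with a small sketch. It assumes a plain row-major volume with x varying fastest and nearest-voxel sampling at unit spacing. Real OpenCL image objects may use a tiled layout, so the numbers only show the trend. All function names here are hypothetical.

```python
import math


def linear_index(x, y, z, dx, dy):
    """Offset of voxel (x, y, z) in a row-major volume with x varying fastest."""
    return x + dx * (y + dy * z)


def mean_address_step(direction, dims, n_samples=64):
    """Mean absolute offset change between consecutive nearest-voxel samples
    taken at unit spacing along a ray that starts at voxel (0, 0, 0)."""
    dx, dy, _ = dims
    offsets = []
    for t in range(n_samples):
        x, y, z = (round(t * c) for c in direction)
        offsets.append(linear_index(x, y, z, dx, dy))
    steps = [abs(b - a) for a, b in zip(offsets, offsets[1:])]
    return sum(steps) / len(steps)


def ray_about_z(angle_deg):
    """Ray direction in the x-y plane (rotation about the z axis)."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a), 0.0)


def ray_about_x(angle_deg):
    """Ray direction in the y-z plane (rotation about the x axis)."""
    a = math.radians(angle_deg)
    return (0.0, math.cos(a), math.sin(a))


if __name__ == "__main__":
    dims = (256, 256, 256)  # assumed DX, DY, DZ for illustration
    for angle in (0, 30, 60, 90):
        print(angle,
              mean_address_step(ray_about_z(angle), dims),
              mean_address_step(ray_about_x(angle), dims))
```

Under these assumptions, rotating about z moves the average step from 1 toward DX, and rotating about x moves it from DX toward DX*DY, which matches the ranges mentioned above.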

           

          Allan