2 Replies Latest reply on Apr 20, 2017 12:49 PM by maikelsz

    Does OpenCV Farneback optical flow (OpenCL) destroy AMD cards? Or is it my code?

    maikelsz

      I have a serious problem with code that I have been developing for a few months. It is written in OpenCL, and I use only one OpenCV function, calcOpticalFlowFarneback, which is also OpenCL-accelerated; the rest of the code is several not very complex kernels.

      The issue is that this has destroyed two graphics cards in five months: an XFX R9 270 and an MSI RX 470. The first ended up in a state where it could only be "used" without the drivers installed, and pink dots appeared in the text during PC boot. The second remained usable for a while, with sporadic hangs and black, white, or pink screens, until it failed completely.

      The interesting thing is that this only happened with 1280x720 videos (I did not test higher resolutions); 320x240 and 640x480 videos never caused a failure. The second time, I had forgotten how the earlier problem came about. I just ran the code on a video of this resolution and, a second later, after it had shown some results, the system failed. The first time the result was a black screen; the second time, a hang.
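
      For scale, a minimal host-side sketch of how the per-frame buffer sizes grow across the three resolutions mentioned above, assuming the uchar4 (4 bytes per pixel) and float4 (16 bytes per pixel) layouts used by the kernels further down. The helper names are hypothetical, and frame_count is a placeholder because the post does not state the real value.

```c
#include <stdint.h>

/* Number of pixels in one frame. */
uint64_t pixel_count(uint32_t width, uint32_t height)
{
    return (uint64_t)width * height;
}

/* Bytes in one uchar4 image (4 bytes per pixel), e.g. srcImage or dstImage. */
uint64_t uchar4_frame_bytes(uint32_t width, uint32_t height)
{
    return pixel_count(width, height) * 4u;
}

/* Bytes in one float4 image (16 bytes per pixel), e.g. cumulImage. */
uint64_t float4_frame_bytes(uint32_t width, uint32_t height)
{
    return pixel_count(width, height) * 16u;
}

/* Bytes in a framesBuffer holding frame_count uchar4 frames.
   frame_count is an assumption: the post does not give the actual value. */
uint64_t frames_buffer_bytes(uint32_t width, uint32_t height, uint32_t frame_count)
{
    return uchar4_frame_bytes(width, height) * frame_count;
}
```

      A 1280x720 frame has 921600 pixels, which is 12 times a 320x240 frame and 3 times a 640x480 frame, so every buffer and every per-pixel loop scales by the same factors.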

      The OpenCV I used was first 3.0, then 3.2.

      At the moment I do not have any card to test with, and I do not want to risk another one either.

      Does anyone have any ideas?

      Is the problem calcOpticalFlowFarneback at these resolutions?

      Or my kernels?

       

      The host code is mostly flow control. With OpenCV 3.0 there are copies/writes; with 3.2 I avoid them, because the UMats use my buffers.

      The most complex kernels are below (the rest are only grayscale conversions, type conversions, etc.):

       

      // Adds imageIn to the running sum in imageSum and writes the mean over count frames.
      __kernel void kernel_SumAndDiv(__global uchar4 *imageIn, __global float4 *imageSum, __global uchar4 *imageOut, uint count)
      {
          const uint x = get_global_id(0);
          const uint y = get_global_id(1);
          const uint width = get_global_size(0);
          const uint pos = x + y * width;
          float4 color = imageSum[pos];
          color =  color + convert_float4(imageIn[pos]);
          imageSum[pos] = color;
          imageOut[pos] = convert_uchar4(clamp(color / (float)count, 0.0f, 255.0f));
      }

       

      // Stores srcImage in slot framePos of framesBuffer, adds it to cumulImage,
      // and writes the mean over the framePos+1 frames seen so far.
      __kernel void kernel_Add(__global uchar4* srcImage, __global uchar4* framesBuffer, __global float4* cumulImage,  __global uchar4* dstImage, const int framePos)
      {
          const uint x = get_global_id(0);
          const uint y = get_global_id(1);
          const uint width = get_global_size(0);
          const uint height = get_global_size(1);
          const uint pixelPos = x + y * width;
          const uint bufferFramePos = width * height * framePos;
          const uint bufferPixelPos = bufferFramePos + pixelPos;
          uchar4 srcColor = srcImage[pixelPos];
          float4 cumulColor = cumulImage[pixelPos];
          cumulColor = cumulColor + convert_float4(srcColor);
          framesBuffer[bufferPixelPos] = srcColor;
          cumulImage[pixelPos] = cumulColor;
          dstImage[pixelPos] = convert_uchar4(clamp(cumulColor / (float)(framePos+1), 0.0f, 255.0f));
      }

       

      // Replaces the frame in slot framePos of framesBuffer with srcImage, updates cumulImage
      // (subtract old frame, add new one), and writes the mean over frameCount frames.
      __kernel void kernel_InsertAndUpdate(__global uchar4* srcImage, __global uchar4* framesBuffer, __global float4* cumulImage, __global uchar4* dstImage,
                                                const int framePos, const int frameCount)
      {
          const uint x = get_global_id(0);
          const uint y = get_global_id(1);
          const uint width = get_global_size(0);
          const uint height = get_global_size(1);
          const uint pixelPos = x + y * width;
          const uint bufferFramePos = width * height * framePos;
          const uint bufferPixelPos = bufferFramePos + pixelPos;
          uchar4 srcColor = srcImage[pixelPos];
          float4 srcColorf = convert_float4(srcColor);
          float4 cumulColor = cumulImage[pixelPos];
          float4 bufferColor = convert_float4(framesBuffer[bufferPixelPos]);
          cumulColor = cumulColor - bufferColor + srcColorf;
          framesBuffer[bufferPixelPos] = srcColor;
          cumulImage[pixelPos] = cumulColor;
          dstImage[pixelPos] = convert_uchar4(clamp(cumulColor / (float)frameCount, 0.0f, 255.0f));
      }
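
      Both kernel_Add and kernel_InsertAndUpdate index framesBuffer at width * height * framePos + pixelPos with no bounds check, so a framePos outside the allocation would read or write past the buffer. A minimal host-side sketch of a check that could run before enqueueing, assuming the host knows how many uchar4 elements it allocated (the function name and the buffer_elems parameter are hypothetical, not part of the original code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the largest index the kernels compute for slot frame_pos
   (width * height * frame_pos + last pixel) fits in the 32-bit uint used by
   the kernels and lies inside a framesBuffer of buffer_elems uchar4 elements. */
bool frames_buffer_index_ok(uint32_t width, uint32_t height,
                            uint32_t frame_pos, uint64_t buffer_elems)
{
    uint64_t frame_elems = (uint64_t)width * height;

    /* An empty frame, or one too large for 32-bit indexing, is rejected. */
    if (frame_elems == 0 || frame_elems > UINT32_MAX)
        return false;

    /* Guard the multiplication below against 64-bit overflow. */
    if (frame_pos > UINT32_MAX / frame_elems)
        return false;

    uint64_t last_index = frame_elems * frame_pos + (frame_elems - 1);
    return last_index <= UINT32_MAX && last_index < buffer_elems;
}
```

      Note that framePos is a signed int in the kernels; a negative value would wrap when multiplied into the unsigned index, so the host should reject it before casting.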

       

      // Per pixel and channel: weighted mean of the frameCount buffered frames,
      // with weight R = 1 / sqrt(e2 + (mean - sample)^2), so samples far from the mean count less.
      __kernel void kernel_GM(__global uchar4* meanImage, __global uchar4* framesBuffer, __global uchar4* dstImage, const int frameCount)
      {
          const uint x = get_global_id(0);
          const uint y = get_global_id(1);
          const uint width = get_global_size(0);
          const uint height = get_global_size(1);
          const float e2 =  0.001f * 0.001f;
          const uint pixelPos = x + y * width;
          float4 meanColor = convert_float4(meanImage[pixelPos]);
          float4 A = (float4)0;
          float4 B = (float4)0;
          for (int i = 0; i < frameCount; i++)
          {
              uint bufferFramePos = width * height * i;
              uint bufferPixelPos = bufferFramePos + pixelPos;
              float4 bufferColor = convert_float4(framesBuffer[bufferPixelPos]);
              float4 R = 1.0f / sqrt(e2 + (meanColor-bufferColor)*(meanColor-bufferColor));
              A += bufferColor * R;
              B += R;
          }
          dstImage[pixelPos] = convert_uchar4(clamp(A / B, 0.0f, 255.0f));
      }

       

      The rest of the system:

      XFX R9 270 the first time, MSI RX 470 Armor OC the second time

      i7-920 + 12 GB DDR3 the first time, i5-4460 + 16 GB the second time

      Windows 10 x64 Anniversary Update, both times

      Crimson 16.9.x the first time (I can't remember the exact number), 17.3.2 the second time