    Why is write_imagef so slow?


      My code uses OpenCL with D3D9 interop.

      In the kernel, the write_imagef call takes 4 ms to write the image data. Why is it so slow?

      The code:

            AllVal = (mul24(val, iInvWeight) + mul24(v2Val, iWeight)) >> 11;

            write_imagef(image, id, convert_float4(AllVal) / 255.0f);
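
      For readers checking the arithmetic, here is a minimal host-side C sketch of what the two kernel lines appear to compute for a single channel. It is not the kernel itself. The function name blend_channel is hypothetical, and it assumes that iWeight + iInvWeight == 2048 (1 << 11), which the >> 11 suggests but the post does not state. It also assumes 8-bit channel values, for which mul24 behaves like an ordinary integer multiply.

```c
/* Hypothetical host-side reference for one channel of the blend.
   Assumption: iWeight + iInvWeight == 2048 (1 << 11); the post does not say so. */
static float blend_channel(int val, int v2Val, int iWeight)
{
    int iInvWeight = 2048 - iWeight;   /* assumed complementary weight */

    /* Same fixed-point blend as the kernel line: weighted sum, then >> 11. */
    int allVal = (val * iInvWeight + v2Val * iWeight) >> 11;

    /* Scale from 0..255 to 0..1, as done before the write_imagef call. */
    return (float)allVal / 255.0f;
}
```

      With equal weights (iWeight = 1024), inputs 100 and 200 blend to 150, which is about 0.588 after scaling by 1/255.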

      How can I optimize this code? Thank you!