7 Replies Latest reply on Sep 6, 2011 2:35 PM by MicahVillmow

    [XF] Float16 vs 16 float

    xfaure
      Benefit of float16

      Hello everybody,

      I'm new to OpenCL. I am trying to illustrate the benefit of float16, but so far without success.
      I built a program that adds two float arrays of 1024*1024*16 elements each. On the GPU, the float16 version takes 0.03 seconds, while the 16 * float version takes 0.006 seconds. On the CPU, the computation takes 2 seconds. Why is the float16 version slower than the 16 * float version?

      Thanks for your help.

      Part of my code:
      [code]
      File Main.cpp:

      // Define an index space (global work size) of threads for execution.
      // A workgroup size (local work size) is not required, but can be used.
      size_t globalWorkSize[1];
      size_t localWorkSize[1];
      // Each work-item processes one float16, so nbKernel/16 work-items are launched
      globalWorkSize[0] = nbKernel/16;
      localWorkSize[0] = 512;

      // Execute the kernel.
      // 'globalWorkSize' is the 1D dimension of the work-items
      status = clEnqueueNDRangeKernel(cmdQueue, kernel, 1, NULL, globalWorkSize,
      localWorkSize, 0, NULL, NULL);

      clFinish(cmdQueue);

      File .cl:

      __kernel void vecadd(__global float16 const * const A, __global float16 const * const B, __global float16 * const C)
      {
          // Each work-item adds one float16 (16 floats) from A and B.
          unsigned int const i = get_global_id(0);

          C[i] = A[i] + B[i];
      }
      [/code]
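
      To make the comparison concrete, here is a minimal CPU-side sketch in C++ of the two variants being timed. It assumes the "16 * float" version means one work-item per float, which is not shown in the code above. The function names addVector16 and addScalar are hypothetical and are not part of the OpenCL API. The sketch only shows that both variants compute the same result and differ in how many work-items cover the data; it says nothing about GPU timings.

```cpp
#include <cstddef>
#include <vector>

// Emulates the float16 kernel: one "work-item" per group of 16 floats.
// Assumes the array length is a multiple of 16.
// Returns the number of emulated work-items.
std::size_t addVector16(const std::vector<float>& a,
                        const std::vector<float>& b,
                        std::vector<float>& c) {
    const std::size_t workItems = a.size() / 16;
    for (std::size_t i = 0; i < workItems; ++i) {  // i plays the role of get_global_id(0)
        for (std::size_t lane = 0; lane < 16; ++lane) {
            c[16 * i + lane] = a[16 * i + lane] + b[16 * i + lane];
        }
    }
    return workItems;
}

// Emulates the scalar variant (assumption: one work-item per float).
// Returns the number of emulated work-items.
std::size_t addScalar(const std::vector<float>& a,
                      const std::vector<float>& b,
                      std::vector<float>& c) {
    const std::size_t workItems = a.size();
    for (std::size_t i = 0; i < workItems; ++i) {  // i plays the role of get_global_id(0)
        c[i] = a[i] + b[i];
    }
    return workItems;
}
```

      With arrays of 1024*1024*16 floats, the first variant would correspond to 1024*1024 work-items and the second to 16 times as many, each doing one sixteenth of the work.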

      Thanks

      Xavier Faure