4 Replies Latest reply on Jul 28, 2016 5:17 AM by shailu1995

    invalid_command_queue when using barrier

    shailu1995

      Hi,

      I am getting a CL_INVALID_COMMAND_QUEUE error when I use barrier(CLK_LOCAL_MEM_FENCE) in my kernel. Without the barrier the code runs successfully, but the barrier is not redundant, so I need to fix this.

      Since it's a large project, I have narrowed the error down to a small case. My OpenCL kernel has been reduced to just a few lines of code, and I still get this error as soon as I use the barrier.

      Any clues on what is causing this error?

      Thanks in advance

      Shailesh  Tripathi

        • Re: invalid_command_queue when using barrier
          dipak

          Hi Shailesh,

           

          Could you please share a test program (host and kernel) that manifests the issue?

           

          Regards,

            • Re: invalid_command_queue when using barrier
              shailu1995

              Hi dipak,

              Since it's a large project, sharing the complete code makes no sense. Still, I can share a few parts (the steps followed for one of the variables):

              Create buffer:

              d_glbSpkCntInput = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR, 7 * sizeof(unsigned int), NULL, &ret);

               

              Mapping buffer memory:

              glbSpkCntInput = (unsigned int *) clEnqueueMapBuffer(command_queue, d_glbSpkCntInput, CL_TRUE, CL_MAP_READ | CL_MAP_WRITE, 0, 7 * sizeof(unsigned int), 0, NULL, NULL, &ret);

               

              Initializing memory:

              for (int i = 0; i < 7; i++) {
                  glbSpkCntInput[i] = 0;
              }

               

              Unmap memory:

              clEnqueueUnmapMemObject(command_queue, d_glbSpkCntInput, glbSpkCntInput, 0, NULL, NULL);

               

              Launch kernel:

              size_t sGlobalSize = 1536;
              size_t sLocalSize = 32;
              CHECK_OPENCL_ERRORS(clEnqueueNDRangeKernel(command_queue, calcSynapses, 1, NULL, &sGlobalSize, &sLocalSize, 0, NULL, NULL /* &synapseevent */));
              CHECK_OPENCL_ERRORS(clFinish(command_queue));

               

              OpenCL kernel:

              __kernel void calcSynapses(__global float *t,
                                         __global unsigned int *dd_glbSpkCntInput,
                                         __global unsigned int *dd_glbSpkInput,
                                         __global unsigned int *dd_spkQuePtrInput,
                                         __global unsigned int *dd_glbSpkCntInter,
                                         __global unsigned int *dd_glbSpkInter,
                                         __global unsigned int *dd_glbSpkCntOutput,
                                         __global unsigned int *dd_glbSpkOutput,
                                         __global float *dd_inSynInputInter,
                                         __global float *dd_inSynInputOutput,
                                         __global float *dd_inSynInterOutput,
                                         __global unsigned int *d_done,
                                         __global int *dd_test)
              {
                  // BLOCKSZ_SYN is a compile-time constant defined elsewhere in the project (not shown)
                  unsigned int id = get_global_id(0);
                  unsigned int lmax, j, r;
                  float addtoinSyn;
                  volatile __local float shLg[BLOCKSZ_SYN];
                  float linSyn;
                  unsigned int ipost;
                  __local unsigned int shSpk[BLOCKSZ_SYN];
                  unsigned int lscnt, numSpikeSubsets;

                  if (id == 0)
                      dd_test[0] = 65;

                  // synapse group InputInter
                  if (id < 512) {
                      unsigned int delaySlot = (*dd_spkQuePtrInput + 4) % 7;
                      // only do this for existing neurons
                      if (id < 500) {
                          linSyn = dd_inSynInputInter[id];
                      }

                      lscnt = dd_glbSpkCntInput[delaySlot];
                      numSpikeSubsets = (lscnt + BLOCKSZ_SYN - 1) / BLOCKSZ_SYN;

                      // process presynaptic events: True Spikes
                      for (r = 0; r < numSpikeSubsets; r++) {
                          if (r == numSpikeSubsets - 1) lmax = ((lscnt - 1) % BLOCKSZ_SYN) + 1;
                          else lmax = BLOCKSZ_SYN;

                          // the error appears only when this barrier is present
                          barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);

                          if (get_local_id(0) < lmax) {
                              shSpk[get_local_id(0)] = dd_glbSpkInput[(delaySlot * 500) + (r * BLOCKSZ_SYN) + get_local_id(0)];
                          }
                      }
                  }
              }
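For readers following the loop bounds, here is a minimal plain-C sketch of the chunking arithmetic the kernel uses (numSpikeSubsets and lmax). The helper names num_subsets and subset_len are hypothetical and not part of the project, and the block size of 32 used in the note below is only an assumed value, since BLOCKSZ_SYN is not shown in the post.

```c
#include <assert.h>

/* Number of passes the kernel's outer loop makes for `count` spikes:
   ceil(count / block) in integer arithmetic (numSpikeSubsets in the kernel). */
static unsigned int num_subsets(unsigned int count, unsigned int block)
{
    assert(block > 0);
    return (count + block - 1) / block;
}

/* Number of valid entries in pass r (lmax in the kernel): a full block for
   every pass except the last, which holds whatever remains. */
static unsigned int subset_len(unsigned int count, unsigned int block, unsigned int r)
{
    unsigned int n = num_subsets(count, block);
    assert(r < n); /* also guarantees count > 0, so count - 1 cannot wrap */
    if (r == n - 1)
        return ((count - 1) % block) + 1;
    return block;
}
```

With an assumed block size of 32, a count of 70 gives three passes of 32, 32 and 6 entries, and a count of 0 gives zero passes, in which case the loop body, and therefore the barrier, is never reached.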

               

              clFinish() returns the error CL_INVALID_COMMAND_QUEUE.

              When I remove the barrier call the code runs fine, but with it the error appears.

              Please tell me what the possible source of this error could be.
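One thing that can be checked on the host side: OpenCL requires a barrier to be reached by every work-item of a work-group, so a barrier inside `if (id < 512)` is only valid if that condition evaluates the same way for all work-items of each group. The plain-C sketch below tests this for a 1-D launch. The function name branch_is_group_uniform is hypothetical, and the sketch assumes no global offset and a global size that is a multiple of the local size, as in the launch shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if `get_global_id(0) < limit` has the same value for every
   work-item of each work-group, 0 if some group is split by the condition.
   Assumes a 1-D launch with no global offset. */
static int branch_is_group_uniform(size_t global_size, size_t local_size, size_t limit)
{
    assert(local_size > 0 && global_size % local_size == 0);
    for (size_t first = 0; first < global_size; first += local_size) {
        size_t last = first + local_size - 1;
        /* The condition is monotone in the id, so comparing the first and
           last work-item of the group is sufficient. */
        if ((first < limit) != (last < limit))
            return 0;
    }
    return 1;
}
```

For the posted sizes (global 1536, local 32, limit 512) the check passes, so it only rules out this one possibility rather than explaining the error. The same call with a limit of 500 fails, which suggests the barrier would not be safe inside the `if (id < 500)` block.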

               

              Thanks in advance

              Shailesh