3 Replies Latest reply on Apr 22, 2014 8:55 AM by avillegas

    Issues testing HSA driver

    avillegas

      Hello,

       

      I am trying to make the CPU and GPU synchronize via HSA using a shared variable on an A10-7850, so I tried to modify the examples provided with the HSA driver. This is my kernel code:

       

      #pragma OPENCL EXTENSION cl_amd_c11_atomics : enable
      #define NULL 0

      __kernel void consumer(global volatile atomic_int * data)
      {
           int id = get_global_id(0);
           int counter = 2;

           /* Spin until the host sets the shared flag to a non-zero value. */
           while(data[0] == 0)
           {
               counter = (counter + 1);
           }

           data[0] = counter;
      }

      The host simply sets data[0] = 0, sleeps for 2 ms, and then sets data[0] = 1.

       

      First issue:

      This example works fine, but whenever I update counter with a different operation (I tested counter = (counter + 1); counter = (counter + 1) % 10; and counter = (counter * 3);), the system freezes for a moment and the computer restarts.

      Before restarting, Windows 8.1 reports an unhandled exception (sometimes SYSTEM_THREAD_EXCEPTION_NOT_HANDLED and sometimes SYSTEM_SERVICE_EXCEPTION) in the file amdkfd.sys.

      Is there any restriction in the beta driver, or am I doing something wrong?

       

      Second issue:

      I might want to launch this kernel in the CPU, but when building the program, it does not recognize the #pragma. Is there any way to force the CPU to recognize said pragma?

       

      Thank you.

        • Re: Issues testing HSA driver
          avillegas

          I solved the first issue myself:

           

          Because of the weakly consistent memory model, accesses to counter caused unexpected behavior.

          It was fixed by using a mem_fence:

           

          counter = (counter + 1) % 10;
          mem_fence(CLK_LOCAL_MEM_FENCE);

           

          In any case, omitting the mem_fence caused instability in the driver, which the development team may want to review.

           

          I have found no way to solve the second issue. Would it be possible to force the use of C11 atomics in an OpenCL kernel launched on the CPU?

           

          Thanks.

          • Re: Issues testing HSA driver
            gopal

            Hi

             

            I hope you have fixed your second issue by now.

             

            @"I might want to launch this kernel in the CPU, but when building the program, it does not recognize the #pragma."

            The reason for the "unrecognized OpenCL extension" error while building the kernel is that the cl_amd_c11_atomics extension may not be supported by the CPU. This is an AMD vendor-specific extension that AMD supports for its GPUs. To check whether AMD supports this extension for its CPUs, look at the output of clinfo.

            You can also query the supported extensions for a platform using clGetPlatformInfo(), with the param_name parameter set to enumerated value CL_PLATFORM_EXTENSIONS, or for a specific device using clGetDeviceInfo() with param_name set to enumerated value CL_DEVICE_EXTENSIONS.

             

            The better way to approach this case is to:

            1. Query the specific extension on the host side using the clGetDeviceInfo() function and, if it is not supported, return a custom message.

            2. As per the OpenCL spec, each extension that affects kernel code compilation must add a defined macro with the name of the extension. This allows the kernel code to be compiled differently depending on whether the extension is supported and enabled. For example, the following check could be used to test whether a specific extension is supported. On the kernel side, add:

            #ifdef cl_amd_c11_atomics
              #pragma OPENCL EXTENSION cl_amd_c11_atomics : enable
            #else
              #error "cl_amd_c11_atomics extension not supported"
            #endif
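
For step 1, the host-side check comes down to looking for the extension name in the space-separated string that clGetDeviceInfo() returns for CL_DEVICE_EXTENSIONS. A minimal sketch of that lookup is given here. The helper name `has_extension` is hypothetical (it is not part of the OpenCL API), and the sketch assumes the extension string has already been copied into a std::string.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not an OpenCL API call): returns true if `name`
// appears as a whole, whitespace-separated token in `extensions`, which is
// the format of the CL_DEVICE_EXTENSIONS string.
bool has_extension(const std::string& extensions, const std::string& name) {
    assert(!name.empty());  // an empty name can never match a token

    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        // Compare whole tokens so that a longer extension name that merely
        // starts with `name` is not reported as a match.
        if (token == name) {
            return true;
        }
    }
    return false;
}
```

With the string obtained from clGetDeviceInfo(), a call such as has_extension(ext, "cl_amd_c11_atomics") would tell the host whether to build the kernel for that device or to report a custom message instead. Matching whole tokens rather than substrings avoids false positives from extensions whose names share a prefix.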

              • Re: Issues testing HSA driver
                avillegas

                Hi Gopal,

                 

                Thank you for your answer.
                As you suggested, I checked and confirmed that the extension is not available on the CPU.

                In the end, I implemented a plain (not OpenCL) C++11 version of the algorithm to run on the CPU.
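
A minimal sketch of what such a plain C++11 version might look like follows. It is illustrative only and not the code actually used in this thread: the function names `producer` and `consumer` and the use of std::atomic<int> for the shared variable are assumptions, chosen to mirror the kernel and host behavior described in the first post.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Mirrors the host side of the original post (assumed shape): clear the
// flag, wait 2 ms, then set it.
void producer(std::atomic<int>& data) {
    data.store(0);
    std::this_thread::sleep_for(std::chrono::milliseconds(2));
    data.store(1);
}

// Mirrors the `consumer` kernel: spin while the flag is zero, then write the
// counter back through the same variable and return it.
int consumer(std::atomic<int>& data) {
    int counter = 2;
    while (data.load() == 0) {
        // One of the update variants tested in the original post; the
        // modulo also keeps the counter from overflowing.
        counter = (counter + 1) % 10;
    }
    data.store(counter);
    return counter;
}
```

In a real run the two functions would execute concurrently, for example on two std::thread objects sharing one std::atomic<int> initialized to 0. The default sequentially consistent load() and store() make the flag update visible to the spinning thread without an explicit fence.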

                I appreciate your answer, as it could be useful in another context in the future.

                 

                Regards,

                Alejandro.