1 Reply Latest reply on Aug 12, 2017 4:28 AM by jpsollie

    help needed with BUILD_PROGRAM_FAILURE

    jpsollie

      To make sure my kernel would run on all kinds of devices, I added an old Radeon HD 7700 to my system, next to my R9 Nano.

      Both are GCN cards, so I thought it would run without too many problems.

      But it seems more difficult than that:

      When initializing all devices, the program hit a segmentation fault while creating a single context with multiple GPUs.

      So I thought: not a problem, I have 128 GB of RAM on this machine and I don't care if it takes a bit more memory, so I'll just create a separate context for each GPU (tell me if this idea is stupid).

      After that the segmentation fault was gone, but this was the output of my program:

      --------------------------------------------------------

      CL_SUCCESS

      Context creation for device 0 : CL_SUCCESS

      Program creation for device 0 : CL_SUCCESS

      Compile Program for device 0 : CL_SUCCESS

       

      attention! cl device Fiji is working in the mines from now ... RIP CL_SUCCESS

      Context creation for device 1 : CL_SUCCESS

      Program creation for device 1 : CL_SUCCESS

      Compile Program for device 1 : CL_BUILD_PROGRAM_FAILURE

      Internal error: Link failed.

      Make sure the system setup is correct.

      CL_SUCCESS

      Context creation for device 1 : CL_SUCCESS

      Program creation for device 1 : CL_SUCCESS

      Compile Program for device 1 : CL_SUCCESS

       

      attention! cl device pthread-AMD Opteron(TM) Processor 6276 is working in the mines from now ... RIP CL_SUCCESS

      WriteBuffer success

      WriteBuffer success

      Kernel success

      Kernel success

      ^C
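For readers following along: the status names in the log above are presumably printed from the numeric `cl_int` return codes of the OpenCL calls. A minimal sketch of such a mapping is shown below. The helper names `cl_status_name` and `format_step` are hypothetical and not taken from the poster's program; the numeric values are the ones defined in the Khronos `CL/cl.h` header, and only a few codes relevant to this thread are listed.

```c
#include <stdio.h>

/* Hypothetical helper: map a numeric OpenCL status code to its symbolic name.
   The values match the definitions in the Khronos CL/cl.h header; this is a
   partial list, not the full set of OpenCL error codes. */
static const char *cl_status_name(int status)
{
    switch (status) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -2:  return "CL_DEVICE_NOT_AVAILABLE";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -17: return "CL_LINK_PROGRAM_FAILURE";
    default:  return "UNKNOWN_CL_STATUS";
    }
}

/* Hypothetical helper: format one log line in the style of the output above,
   e.g. "Compile Program for device 1 : CL_BUILD_PROGRAM_FAILURE".
   Returns the snprintf result (number of characters that were needed). */
static int format_step(char *buf, size_t len, const char *step,
                       int device, int status)
{
    return snprintf(buf, len, "%s for device %d : %s",
                    step, device, cl_status_name(status));
}
```

The status code alone only says that the build failed. Text such as "Internal error: Link failed." normally comes from the per-device build log, which can be queried with `clGetProgramBuildInfo` and `CL_PROGRAM_BUILD_LOG` after `clBuildProgram` returns an error.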

       

       

       

      I am guessing I am missing some GCN 1.0 libraries. Do the Linux 17.30 drivers support the Radeon HD 7700 cards?

      Thanks

       

      *edit:

       

      Sorry, I just saw I forgot the program source; here it is.

      I have already asked a lot of questions here, so tell me if it gets annoying.

       

      *edit2:

      Confirmed to be a problem with the 4.13-rc4 kernel rather than the previously tested versions (4.9.20, 4.10.17), but I don't think this is a kernel-related problem.

        • Re: help needed with BUILD_PROGRAM_FAILURE
          jpsollie

          OK, for people reading this, I have proof that it is a compiler problem:

          While my code is still dirty, it actually compiles on Mesa 17.2-rc3!

          On amdgpu-pro the build still fails with "Link failed".

          Hopefully the AMD engineers will fix this soon, because my program causes GPU faults and doesn't run correctly on any GPU (while on amdgpu-pro it does run correctly on the R9 Nano).

           

          *edit:

          As always, cleaning up the code removed the link failure!

          While I have no idea what exactly went wrong (and the output is still wrong), it now compiles.

          So no blame for AMD; I should be more careful when reporting bugs.