9 Replies Latest reply on Jun 20, 2012 2:54 PM by Neverhood

    Does CAL runtime support HD7970?

    Neverhood

      Hello, everybody!

       

      I use CAL/IL for GPGPU calculations on AMD GPUs. But it seems that neither APP SDK v2.6 nor v2.7 allows using the CAL runtime with the Radeon HD7970.

      I can't yet tell you exactly where the problem is (my customer reported it with an HD7970), but most likely it is in kernel compilation. The "cal.h" header file has no CAL target for Tahiti GPUs. I thought the new APP SDK v2.7 would update this header file, but no, the CAL Interface Header still has version "1.00.0 Beta".

       

      Am I doomed to rewrite all my kernels in OpenCL to get HD7970 support?

       

      Best regards, Dmitry.

        • Does CAL runtime support HD7970?
          nou

          Well, CAL is deprecated now, so even if you manage to access 7xxx via CAL, it is an unsupported way.

            • Re: Does CAL runtime support HD7970?
              Neverhood

              Hello, nou.

               

              Where can I read about CAL being deprecated? Is there some official document?

              It's a pity to have two pieces of code: one for old GPUs and one for modern GPUs, especially since the CAL API was enough for my needs.

               

              Best regards, Dmitry.

                • Re: Does CAL runtime support HD7970?
                  afo

                  Hi,

                   

                  I believe it was in the Developer Release Notes for SDK 2.4, but they are not available.

                   

                  best regards,

                  Afo

                  • Re: Does CAL runtime support HD7970?
                    realhet

                    CAL + 79xx still works, but with some changes:

                    In Catalyst 11.12 the global buffer is broken, so you have to use IL_CS and UAV. This is the best driver for CAL.

                    In later Catalyst versions, any attempt to read/write a UAV will freeze the GPU, but you can write data into CB0. You can do this on 7xxx because it has regular caches instead of read-only constant memory; it only needs a little patching of the microcode.

                    Catalyst 12.6 beta: launching and/or querying an event's completion became a blocking operation, so you can't overlap kernels anymore.

                     

                    So it's pretty much dead, if you ask me.

                     

                    I've switched to OpenCL recently, here are some good things in it:

                    - It has meaningful error messages, not just operational_errors or access violations. If you made mistakes in your code, it will tell you.

                    - You can still use the good old CAL compiler, the AMD_IL language imho is not deprecated, only the frontend is.

                    - You have better multi-GPU scalability. (CAL seems to have a 1-2% penalty when using multiple GPUs)

                      • Re: Does CAL runtime support HD7970?
                        Neverhood

                        realhet wrote:

                         

                        I've switched to OpenCL recently, here are some good things in it:

                        - You can still use the good old CAL compiler, the AMD_IL language imho is not deprecated, only the frontend is.

                        - You have better multi-GPU scalability. (CAL seems to have a 1-2% penalty when using multiple GPUs)

                         

                        I know how to link memory in CAL to kernel parameters, e.g. link memory to "cb0". But I can't understand how to do this with OpenCL if I continue to write kernels in IL.

                          • Re: Does CAL runtime support HD7970?
                            realhet

                            >"how to do this with OpenCL, if I will continue to write kernels in IL"

                             

                            1. Build a simple helloworld.cl application with your kernel parameters:

                            __kernel void F( __global uint *dst, __constant uint *cb)  {  dst[get_global_id(0)] = get_global_id(0) ^ cb[0]; }

                            It will use a UAV and a CB, and that's all we need.

                             

                            2. Use the following build parameters:

                            -save-temps=c:\cl -fno-bin-source -fno-bin-llvmir -fno-bin-amdil

                            This will generate an OpenCL .elf image that contains only the CAL-style .elf image. (That's actually an .elf inside an .elf file.)

                            When you load it, the OpenCL API can't use anything other than the cal.elf image. (If it contains an llvm image, then it will recompile it to amd_il, but that's not what we need.)

                            You will also find the source .il and .isa files in the C:\ directory.

                             

                            3. Check C:\ for new files! You will find two .il files; select the smaller one, the one without the debug info:

                            dcl_typeless_uav_id(10)_stride(4)_length(4)_access(read_write)   //this will be the UAV  (it's named UAV10, not UAV0)

                            dcl_cb cb1[2]     //this contains kernel arg offsets

                            ; Kernel arg setup: dst

                            mov r1.x, cb1[0].x    //this is the offset of the 'dst' parameter. 'dst' starts at UAV10[r1.x]

                            ; Kernel arg setup: cb

                            mov r1.y, l5.x  //this is the offset of the __constant uint *cb  parameter. It's constant zero in this example.

                            //cb0 will contain kernel parameters, for example the xyz domain ranges are stored here.

                            //cb2 will be the constant buffer

                            //r1021.xyz will contain the global thread Ids

                             

                            4. Take the .il file (the non-debug one), replace the example code in func 1027 with your own, and DON'T touch the comments: the OpenCL runtime will use them.  Compile it with calcl.

                             

                            5. Replace the .text section in the opencl.elf with the newly compiled cal.elf.

                             

                            6. In the .symtab there is an entry named __OpenCl_F_Kernel; update its size field so that it equals the cal.elf size.
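
Steps 5 and 6 come down to byte-level edits of the outer ELF file. The following is only a minimal Python sketch of two pieces of that work, under stated assumptions: the outer image is a 32-bit little-endian ELF, and the inner cal.elf can be spotted by scanning for the ELF magic number (a heuristic, not necessarily how the runtime locates it). The function names are hypothetical. It does not resize the .text section; if the new cal.elf differs in size, the section header and the offsets of everything stored after .text have to be adjusted as well. The symbol name is matched exactly, so check its spelling against the actual binary.

```python
import struct

ELF_MAGIC = b"\x7fELF"
SHT_SYMTAB = 2                              # section type of a symbol table
EHDR = struct.Struct("<16sHHIIIIIHHHHHH")   # ELF32 file header, little-endian
SHDR = struct.Struct("<10I")                # ELF32 section header
SYM = struct.Struct("<IIIBBH")              # ELF32 symbol table entry


def find_elf_offsets(blob):
    """Return every offset in `blob` where the ELF magic appears.

    Offset 0 is the outer image; later hits are candidates for the
    embedded cal.elf (heuristic: the magic could also occur in data).
    """
    offsets = []
    pos = blob.find(ELF_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(ELF_MAGIC, pos + 1)
    return offsets


def set_symbol_size(image, name, new_size):
    """Set st_size of symbol `name` (bytes) in the bytearray `image`.

    Returns the previous size; raises KeyError if the symbol is missing.
    """
    ident = bytes(image[:6])
    if ident[:4] != ELF_MAGIC or ident[4] != 1 or ident[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF image")
    hdr = EHDR.unpack_from(image, 0)
    e_shoff, e_shentsize, e_shnum = hdr[6], hdr[11], hdr[12]
    sections = [SHDR.unpack_from(image, e_shoff + i * e_shentsize)
                for i in range(e_shnum)]
    for sh in sections:
        if sh[1] != SHT_SYMTAB:
            continue
        sym_off, sym_size, link = sh[4], sh[5], sh[6]
        str_off = sections[link][4]         # sh_link names the string table
        for pos in range(sym_off, sym_off + sym_size, SYM.size):
            st_name, _value, st_size, _info, _other, _shndx = SYM.unpack_from(image, pos)
            start = str_off + st_name
            end = image.index(b"\x00", start)
            if bytes(image[start:end]) == name:
                struct.pack_into("<I", image, pos + 8, new_size)  # st_size field
                return st_size
    raise KeyError(name)
```

A typical use would be to read the whole binary into a `bytearray`, call `set_symbol_size(image, b"__OpenCl_F_Kernel", len(new_cal_elf))`, and write the result back to a new file, keeping the original for comparison.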

                             

                            Unfortunately the OpenCL-generated amd_il code is hardware dependent, so you have to repeat (or automate) this process on every card, at least to get all the hw-specific amd_il comments. For example, on the hd4xxx it uses raw_uav0, and on 7xxx it uses typeless_uav10, so there are some differences.
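
If the process is automated per card, two small checks are easy to script: picking the smaller of the generated .il files and reporting which UAV declarations it contains. The sketch below is a hedged illustration in Python, not part of any AMD tool. The function names are hypothetical, the "smallest file is the non-debug one" rule is taken from step 3, and the regular expression assumes declarations spelled like `dcl_typeless_uav_id(10)`; the `dcl_raw_uav_id(0)` spelling for hd4xxx is an assumption derived from the "raw_uav0" name above.

```python
import re
from pathlib import Path

# Matches e.g. "dcl_typeless_uav_id(10)"; captures the kind and the id.
UAV_DECL = re.compile(r"dcl_(\w+?)_uav_id\((\d+)\)")


def pick_non_debug_il(temp_dir):
    """Return the smallest .il file in temp_dir, or None if there is none.

    Assumes, as in step 3, that the smaller file is the one without
    debug info.
    """
    candidates = sorted(Path(temp_dir).glob("*.il"),
                        key=lambda p: p.stat().st_size)
    return candidates[0] if candidates else None


def uav_declarations(il_text):
    """Return (kind, id) pairs for every UAV declaration in the IL text."""
    return [(m.group(1), int(m.group(2))) for m in UAV_DECL.finditer(il_text)]
```

Running `uav_declarations(pick_non_debug_il(temp_dir).read_text())` on each card's output gives a quick way to see which UAV kind and id the hand-written IL has to target there.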