9 Replies Latest reply on Jan 5, 2011 8:51 AM by lrog

    R7xx chipset OpenCL support



      I am working on my MS thesis on GPGPU computing with OpenCL. I am using a Radeon HD 4870 as my compute device, but unfortunately I have run into some OpenCL compatibility problems:

      - no local memory support -> using global memory to emulate local

      - no texture support

      I know these problems were mentioned on the forum some time ago, but are there any plans to improve OpenCL support on RV770 (R7xx) chipsets in the near future?


        • R7xx chipset OpenCL support
          There are no plans to add new features for the R7XX chipsets. These devices are only compatible with OpenCL 1.0, with no extensions or image support. The hardware also does not support generic read/write in local memory. Future improvements for these chips will come from general improvements in our software and not from device-specific feature improvements.
            • R7xx chipset OpenCL support

              Thanks for the reply.

              By the lack of generic local memory r/w support, did you mean the more sophisticated memory access and management model (described in the IL specification, quoted below) of the R700 chipset family, which isn't supported by CAL (as the OpenCL API sits above the CAL API)?

              But when programming with raw CAL C calls and kernels written directly in IL, can I access local memory on the stream processor ("SIMD Engine") units and/or texture memory?

              1.5 Access Model for Local Shared Memory

              Each processor has an amount of local memory that can be shared across the threads in a thread group. IL provides two models of memory access to local data store (LDS).

              The first memory access model, called owner-computes, is supported by the HD4000-family of devices. In owner-computes, each thread in a thread group owns an area of LDS memory. The size of the area is declared in the shader. Each thread in a group can write only to the area of memory it owns; however, a thread can read any chunk of memory that is owned by either itself or other threads. An LDS shared memory read is specified by (owner_thread_ID, offset): read the memory area owned by that thread_ID with an offset within the area.
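              The owner-computes rules above can be modeled on the CPU. The following is a minimal sketch in plain C, not CAL or IL code: the names `OwnerComputesLds`, `lds_write`, `lds_read`, and `group_sum`, and the sizes `GROUP_SIZE` and `AREA_SIZE`, are hypothetical and chosen only for illustration. It shows a group-wide sum in which every thread publishes a value into its own area and one thread then gathers from all owners.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes, chosen for illustration only. */
#define GROUP_SIZE 4   /* threads in the thread group */
#define AREA_SIZE  2   /* LDS words owned by each thread */

/* CPU-side model of an LDS partitioned into one area per thread. */
typedef struct {
    unsigned int words[GROUP_SIZE][AREA_SIZE];
} OwnerComputesLds;

/* A thread may write only into its own area, so the writer's ID
 * is also the owner ID: there is no way to name another area. */
static void lds_write(OwnerComputesLds *lds, size_t self_tid,
                      size_t offset, unsigned int value)
{
    assert(self_tid < GROUP_SIZE && offset < AREA_SIZE);
    lds->words[self_tid][offset] = value;
}

/* Any thread may read any area; reads are addressed as
 * (owner_thread_ID, offset). */
static unsigned int lds_read(const OwnerComputesLds *lds,
                             size_t owner_tid, size_t offset)
{
    assert(owner_tid < GROUP_SIZE && offset < AREA_SIZE);
    return lds->words[owner_tid][offset];
}

/* Group-wide sum: each thread publishes one value into its own
 * area, then a single thread gathers from every owner. */
static unsigned int group_sum(const unsigned int inputs[GROUP_SIZE])
{
    OwnerComputesLds lds = {{{0}}};

    for (size_t tid = 0; tid < GROUP_SIZE; ++tid)
        lds_write(&lds, tid, 0, inputs[tid]);

    /* On real hardware the group would have to synchronize here;
     * this sequential loop makes the ordering implicit. */
    unsigned int sum = 0;
    for (size_t owner = 0; owner < GROUP_SIZE; ++owner)
        sum += lds_read(&lds, owner, 0);
    return sum;
}
```

              Calling `group_sum` with the inputs 1, 2, 3, 4 returns 10. The point of the sketch is that the write path takes no owner argument at all, which is what rules out scatter-style writes under this model.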

              Different from the access model for threads within a wavefront, the access mode for different wavefronts (within a thread group) is specified by the sharing mode, which is either relative or absolute. If it is relative, new and consecutive space is allocated for each wavefront; if it is absolute, all wavefronts are mapped to the same set of memory starting at address 0. In this mode, wavefronts can overwrite each other’s data.
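              One way to picture the two sharing modes is as an address mapping. The sketch below is plain C and only an interpretation of the paragraph above: it assumes that "consecutive space" means each wavefront gets a fixed-size span placed directly after the previous one. The names `SharingMode`, `lds_physical_address`, and `wavefront_span` are hypothetical and do not come from the IL specification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the two sharing modes. */
typedef enum { SHARING_RELATIVE, SHARING_ABSOLUTE } SharingMode;

/* Map a wavefront-local LDS address to a physical LDS address.
 * Assumption: in relative mode every wavefront gets its own span
 * of `wavefront_span` words, laid out back to back. In absolute
 * mode every wavefront starts at address 0. */
static size_t lds_physical_address(SharingMode mode,
                                   size_t wavefront_index,
                                   size_t wavefront_span,
                                   size_t local_address)
{
    assert(local_address < wavefront_span);
    if (mode == SHARING_RELATIVE)
        return wavefront_index * wavefront_span + local_address;
    return local_address; /* absolute: shared by all wavefronts */
}
```

              With a span of 64 words, local address 5 in wavefront 2 maps to 133 in relative mode and to 5 in absolute mode. In absolute mode wavefronts 0 and 1 map to the same address, which is how they can overwrite each other's data.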

              The second memory access model is a general read write: each thread can read or write any address in the LDS. This model is supported on HD5XXX series graphics cards.
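              The practical difference between the two models shows up with scatter-style writes. The sketch below is a CPU model in plain C, not IL; the function names and `GROUP_SIZE` are hypothetical. It reverses an array across a thread group twice: once with a scatter write, which needs general read/write, and once restructured as a gather, which fits owner-computes because each thread writes only its own slot.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_SIZE 4 /* hypothetical thread-group size */

/* General read/write: thread `tid` writes to an arbitrary LDS
 * address, here the mirrored slot. */
static void reverse_scatter(const int in[GROUP_SIZE], int out[GROUP_SIZE])
{
    int lds[GROUP_SIZE];
    for (size_t tid = 0; tid < GROUP_SIZE; ++tid)
        lds[GROUP_SIZE - 1 - tid] = in[tid]; /* slot of another thread */
    for (size_t tid = 0; tid < GROUP_SIZE; ++tid)
        out[tid] = lds[tid];
}

/* Owner-computes: thread `tid` writes only slot `tid`, then reads
 * the slot owned by the mirrored thread. */
static void reverse_gather(const int in[GROUP_SIZE], int out[GROUP_SIZE])
{
    int lds[GROUP_SIZE];
    for (size_t tid = 0; tid < GROUP_SIZE; ++tid)
        lds[tid] = in[tid];                   /* own slot only */
    for (size_t tid = 0; tid < GROUP_SIZE; ++tid)
        out[tid] = lds[GROUP_SIZE - 1 - tid]; /* read any owner */
}
```

              Both functions produce the same output, so an algorithm written as a scatter can sometimes be ported to owner-computes hardware by turning it into a gather. That is not always possible, for example when the destination address depends on data only the writer knows.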

              Both models allow threads to read or write memory (video or system), but do not provide synchronization to memory.

              Supported inter-thread communication includes:
              • SR – Globally shared registers.
              • Sharing between all wavefronts in a SIMD.
              • Column sharing on the SIMD.
              • Persistent registers.
              • LDS – local data store - read/write. These are read/write registers that
              support sharing between all threads in a group.
              • Data sharing between all threads in a group.
              • Required synchronization.
              • Memory - read/write.
              • Constant buffers
              • Texture cache

            • R7xx chipset OpenCL support
              Yes, the CAL API and IL allow access to almost everything in the hardware. OpenCL, however, has higher-level constraints that don't always map to the hardware.
                • R7xx chipset OpenCL support

                  Does this mean that in newer versions of the OpenCL standard (>=1.1) there will be a mixed API<->HW layer, where the OpenCL runtime is at the same level as CAL?

                  In the "AMD CAL Programming Guide" there's a block diagram describing the API layers, where OpenCL is above CAL.

                    • R7xx chipset OpenCL support



                      Originally posted by: lrog Does this mean that in newer versions of the OpenCL standard (>=1.1) there will be a mixed API<->HW layer, where the OpenCL runtime is at the same level as CAL?


                      MicahVillmow answered your question. "These devices are only compatible with OpenCL 1.0 with no extensions or image support"

                      If you need more than this, either program in CAL and lose the ability to support other vendors' hardware, or upgrade to a 5xxx or 6xxx series card.


                        • R7xx chipset OpenCL support

                          Strictly speaking, I don't believe he _did_ answer the question.

                          lrog asked, "Does it mean, that in newer versions of OpenCL standard", not whether newer versions would be supported by the card.

                          Presumably, he's wondering whether an upgrade to 5xxx/6xxx will solve his problem.

                            • R7xx chipset OpenCL support

                              What I understood from Micah's reply is that the R7xx chipsets will stick with the current OpenCL support (1.0, with no LDS and TEX). I also asked about the CAL/OpenCL API architecture, as I thought the OpenCL API layer sits above CAL (per the CAL/OpenCL programming guide) and uses the runtime provided by CAL. As I understand it, CAL gives me access to LDS and TEX on the R7xx family while OpenCL does not, which is why the question about the layer structure came up...

                              Either the OpenCL layer is at the same level as CAL, with a need for direct bindings to the HW, or it just doesn't/can't use what CAL provides.

                                • R7xx chipset OpenCL support


                                  The local memory requirements in the OpenCL spec impose some conditions that are not met by the LDS present in RV7xx chips. So the actual Local Data Share has not been made accessible from OpenCL, although it is available from CAL (no restrictions here). In the 5xxx series and above, the LDS was modified to meet the OpenCL specification and was made accessible.

                                  On AMD GPUs, OpenCL is implemented on top of CAL and not directly on the hardware. So everything in OpenCL is mapped to CAL/IL before being converted into a binary executable. Other vendors' hardware maps OpenCL code to its own IL, which is different from AMD IL. So cross-vendor coding is only possible at the OpenCL level and not below it.