2 Replies Latest reply on Nov 24, 2010 5:59 PM by oscarbarenys1

    Suggestions of features for enhanced CPU backend!



      Reading the slides on the 69xx compute features, it seems the hardware still will not support everything Fermi does, such as real function calls (with a stack), which would allow recursion and function pointers. I was thinking AMD could add these to the CPU backend while waiting for them to be implemented in hardware.

      This would also be similar in idea to CUDA-x86, which will have to support such CUDA features.

      Specifically I would add to CPU backend:

      *Real function calls (supporting recursion) and function pointer support: it seems this can be implemented pretty efficiently right now, as Ocelot does when translating CUDA 3.x PTX files to LLVM.

      *Similar to printf, add malloc and free. This can be efficient too, as Nvidia implements them even on GPUs. Do your 69xx cards support calling malloc and free in GPU code, as Fermi does?

      *Add an asm("") function so that x86 assembly code can be inserted in kernels. CUDA allows asm inside CUDA device functions.

      *Image support and autovectorization: Intel supports both!
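To make the first two requests in the list above concrete, here is a minimal sketch of what they look like in ordinary C compiled for the host CPU. This is not OpenCL C: OpenCL C 1.x forbids recursion and function pointers and has no malloc/free in kernels. All names (`factorial`, `binary_op`, `reduce_with`) are hypothetical and only illustrate the features being asked for.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Recursion: each call needs its own stack frame, i.e. a real call stack. */
unsigned long factorial(unsigned n) {
    return n <= 1 ? 1UL : n * factorial(n - 1);
}

/* Function pointer type: the callee is chosen at run time. */
typedef float (*binary_op)(float, float);

float add_op(float a, float b) { return a + b; }
float mul_op(float a, float b) { return a * b; }

/* Hypothetical "kernel body": copies the input into a heap scratch buffer
 * (malloc/free), then folds it with whatever operation the pointer selects. */
float reduce_with(binary_op op, const float *data, size_t n, float init) {
    float *scratch = malloc(n * sizeof *scratch);
    if (scratch == NULL) {
        return init; /* allocation failed (or n == 0): nothing to fold */
    }
    memcpy(scratch, data, n * sizeof *scratch);

    float acc = init;
    for (size_t i = 0; i < n; ++i) {
        acc = op(acc, scratch[i]); /* indirect call through the pointer */
    }

    free(scratch);
    return acc;
}
```

For example, `reduce_with(add_op, data, 4, 0.0f)` sums four floats, and passing `mul_op` with an initial value of `1.0f` multiplies them instead, without changing `reduce_with` itself.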

        • Suggestions of features for enhanced CPU backend!
          The CPU backend supports real function calls, inline asm, and function pointers; however, OpenCL does not allow them. The CPU backend is the LLVM x86 backend with some modifications for OpenCL, so anything it supports we can support, but the OpenCL language does not yet support many of your requests.
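Since the restriction is in the language rather than the backend, code that wants to stay within what OpenCL C allows usually avoids both features. The sketch below, written in plain C using only constructs OpenCL C also permits, shows one common pattern under that assumption: an enum plus `switch` in place of a function pointer, and a loop in place of recursion. The names (`op_code`, `apply_op`, `factorial_iter`, `reduce_switch`) are hypothetical.

```c
#include <stddef.h>

/* Operation selected by a plain integer tag instead of a function pointer. */
enum op_code { OP_ADD, OP_MUL };

float apply_op(enum op_code op, float a, float b) {
    switch (op) {
    case OP_ADD: return a + b;
    case OP_MUL: return a * b;
    }
    return a; /* unknown tag: leave the accumulator unchanged */
}

/* Recursion rewritten as a loop, so no call stack is required. */
unsigned long factorial_iter(unsigned n) {
    unsigned long result = 1UL;
    for (unsigned i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}

/* Fold over the data with a statically known set of operations. */
float reduce_switch(enum op_code op, const float *data, size_t n, float init) {
    float acc = init;
    for (size_t i = 0; i < n; ++i) {
        acc = apply_op(op, acc, data[i]);
    }
    return acc;
}
```

The cost of this style is that every supported operation must be listed in the `switch` ahead of time, which is exactly the limitation the requested extensions would remove.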

            • Suggestions of features for enhanced CPU backend!

              Thanks, Micah, for your deep insight. I was only thinking about what features could come next to the OpenCL world.

              AMD seems to think it is good to expose its fast SAD and GDS hardware implementations as soon as it can, even if that means creating proprietary extensions. In the same way, since your backend already supports these features, and "if" it's not a really big effort, I think AMD could create a similar extension, or several, supporting them. That could benefit users who are already playing with these features on CUDA hardware and who will later want to port to OpenCL for "some" portability, if your backend supports it.

              Anyway thanks for answering.