8 Replies Latest reply on Oct 16, 2009 9:25 PM by edward_yang

    Multitasking ?

    kos
      Can I run different kernels on the same card in parallel ?


        • Multitasking ?
          Methylene

          Literally parallel?  I think that depends on the card and the size of the kernel.  The kernels are placed in a queue of sorts and I'm sure if you tried to call 2 small kernels at the same time they shouldn't have any issues running side by side on the card.

          But there's never any certainty about the order in which things are done unless you assure such things yourself.

          Maybe we can get some more commentary about the parallelism going.

          • Multitasking ?
            MicahVillmow
            Kos,
            A graphics card runs data-parallel code, not task-parallel code, so there is no way to run two tasks at the same time. However, since kernels are run asynchronously, it can appear to the user program that they are running at the same time.
              • Multitasking ?
                eduardoschardong
                Micah,

                RV770 is composed of 10 SIMD processors, each of them 16 elements wide and running two 64-wide wavefronts in 8 cycles, so 20 different wavefronts per chip.

                It's clear to me why I can't use 8 ALUs for one program and the other 8 for another (SIMD), but why not have 10 wavefronts for one program and 10 for another?

                I may have multiple parallel programs that could benefit from a GPU, but they may not be big enough to use all 20 wavefronts (1280 elements). The ability for a program that uses fewer than all resources not to block another would be appreciated.
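
The arithmetic in the post above can be checked with a short sketch. It uses only the figures quoted there (10 SIMDs, 16 lanes each, two 64-wide wavefronts per SIMD); the constant and function names are illustrative and not part of any AMD API.

```python
import math

# Figures quoted in the post above for RV770. Names are illustrative only.
SIMD_COUNT = 10           # SIMD processors per chip
SIMD_LANES = 16           # elements a SIMD processes per cycle
WAVEFRONTS_PER_SIMD = 2   # wavefronts interleaved on each SIMD
WAVEFRONT_WIDTH = 64      # elements (threads) per wavefront


def wavefronts_in_flight():
    """Wavefronts per chip, as counted in the post."""
    return SIMD_COUNT * WAVEFRONTS_PER_SIMD


def elements_in_flight():
    """Elements covered when every wavefront slot is used."""
    return wavefronts_in_flight() * WAVEFRONT_WIDTH


def cycles_for_wavefront_pair():
    """Cycles for one SIMD to issue an instruction for both of its wavefronts."""
    return WAVEFRONTS_PER_SIMD * WAVEFRONT_WIDTH // SIMD_LANES


def wavefronts_needed(domain_size):
    """Wavefronts a program with `domain_size` elements would occupy."""
    return math.ceil(domain_size / WAVEFRONT_WIDTH)
```

Under these numbers, a program with a 256-element domain would occupy `wavefronts_needed(256)` wavefront slots out of `wavefronts_in_flight()`, which is the under-utilisation the post is concerned about.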
              • Multitasking ?
                MicahVillmow
                Eduardo,
                Different kernel types can be run in parallel; this is how the unified shader architecture works. It can load-balance between pixel, vertex and geometry shaders, but not between shaders of the same type. Since CAL only runs kernels in pixel shader mode, there is no way for it to run multiple pixel shaders at once. So it is mainly hardware support that is stopping this from being implemented; if the hardware supported it, we would probably expose it, as it is very useful.
                • Multitasking ?
                  MicahVillmow
                  kos,
                  Vertex and geometry shaders are graphics-only and are not accessible from CAL. Also, compute shader is not part of the graphics pipeline and thus cannot be run in parallel with pixel shader. The only way to do parallel execution is to combine multiple kernels into a single kernel and do conditional execution based on the thread's position in the execution domain.

                  i.e.

                  if (threadID < 1024) {
                      call 0; // kernel 0
                  } else {
                      call 1; // kernel 1
                  }
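
The merged-kernel idea above can be emulated on the CPU to make the dispatch concrete. This is a minimal sketch in Python rather than IL. `kernel0`, `kernel1`, `SPLIT` and `run_merged` are hypothetical stand-ins: the post does not say what the two kernels compute, and a real version would be written in IL or Brook+ and launched through CAL.

```python
SPLIT = 1024  # threshold from the post: threads below it run kernel 0


def kernel0(x):
    """Hypothetical first kernel: doubles its input."""
    return x * 2.0


def kernel1(x):
    """Hypothetical second kernel: adds a constant offset."""
    return x + 100.0


def merged_kernel(thread_id, data):
    """One combined kernel: pick the sub-kernel from the thread's position."""
    if thread_id < SPLIT:
        return kernel0(data[thread_id])
    return kernel1(data[thread_id])


def run_merged(domain_size):
    """Emulate launching the merged kernel over a 1D execution domain."""
    data = [float(i) for i in range(domain_size)]
    return [merged_kernel(t, data) for t in range(domain_size)]
```

With a 2048-element domain, the first 1024 outputs come from `kernel0` and the rest from `kernel1`, so both workloads share a single launch.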
                    • Multitasking ?
                      Nikolay_Mikhalev

                      Micah,

                      Will vertex and geometry shaders be supported in the future? Otherwise, what is the point of complicating the IL specification with descriptions of registers and instructions for those shader types?

                      • Multitasking ?
                        edward_yang

                         

                        Originally posted by: MicahVillmow
                        The only way to do parallel execution is to combine multiple kernels into a single kernel and do conditional execution based on the thread's position in the execution domain. i.e. if (threadID < 1024) { call 0; // kernel 0 } else { call 1; // kernel 1 }


                        Hi MicahVillmow, thanks for the explanation. I'm new to GPGPU acceleration and still learning the ropes.

                        I wonder in doing what you suggested above, wouldn't it (1) cause the wavefront to serially execute both branches, and (2) reduce the effective number of registers available per thread? Wouldn't it still limit the performance benefit of combining multiple kernels?
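
Both concerns can be explored with a small model. The sketch below rests on two assumptions that the thread does not confirm: thread IDs are packed linearly into 64-wide wavefronts (in pixel shader mode the real mapping may be tiled instead), and registers are allocated once for the whole merged kernel, so the larger branch sets the budget. The function names are illustrative.

```python
WAVEFRONT_WIDTH = 64  # wavefront size quoted earlier in the thread


def divergent_wavefronts(domain_size, split, width=WAVEFRONT_WIDTH):
    """Count wavefronts whose threads fall on both sides of `thread_id < split`.

    Assumes a linear mapping: wavefront k holds thread IDs
    [k * width, (k + 1) * width). Only these wavefronts would have to
    execute both branches one after the other.
    """
    count = 0
    for start in range(0, domain_size, width):
        end = min(start + width, domain_size)  # exclusive upper bound
        has_first_branch = start < split       # some thread has id < split
        has_second_branch = end > split        # some thread has id >= split
        if has_first_branch and has_second_branch:
            count += 1
    return count


def merged_register_budget(regs_kernel0, regs_kernel1):
    """Per-thread registers for the merged kernel, assuming static allocation."""
    return max(regs_kernel0, regs_kernel1)
```

Under these assumptions, a split at 1024 is a multiple of 64, so no wavefront straddles the boundary and `divergent_wavefronts(2048, 1024)` is zero, whereas a split at 1000 leaves one mixed wavefront. The register concern still applies in this model: every thread is sized for the more demanding of the two kernels.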

                        Will the ability to run concurrent compute shader programs be implemented or enabled in the next version of Brook+/CAL (or OpenCL)? Since the GPUs already allow multiple types of shaders to run in a pipelined fashion, there shouldn't be much of a problem running multiple compute shaders in a pipelined fashion as well. Or is there?

                        From a high-level (algorithm-architecture) point of view, this seems to be very useful for writing more sophisticated parallel programs (than e.g. just matrix multiplication) and the sensible way to do GPGPU acceleration.

                        Thanks a lot in advance!