7 Replies Latest reply on Feb 14, 2010 3:55 PM by ionel

    Multiple IL (compute) kernels to execute.

      How to run multiple compute kernels in a chain from CAL without reloading modules

      Reposting from the "Context, Compiler, Linker etc." topic to attract attention.

      1. To execute multiple images without reloading them into CAL: if multiple contexts are created, each context has its own image to execute. Will the context switch be noticeable when one context is started after another? Is this a better solution than loading the module every time before execution starts? (This assumes only one image can be loaded per context, so the second IL image has to be reloaded in order to run.)

      2. Can the same resource be attached to different contexts for I/O (context1 for input, context2 for output)?

      3. What is an example of running multiple functions with calCtxRunProgramGridArray if only one image can be loaded? If multiple ILs are linked into the same image, what function names must be specified to run them one after another, given that the sample mentions only "main" as the function to execute?

      4. Micah, in one of your posts you mentioned using an input parameter in the kernel to switch between different kernel function calls (code paths) from "main", so that all functionality is linked together. In that case, will the non-executing paths (code branches that are not taken) still be scheduled, degrading performance?


        • Multiple IL (compute) kernels to execute.
          Since IL by design can only have a single entry point, the way to do this is to pass an identifier for the function you want in the constant buffer and then branch on it. The compiler will generate ISA sized to the resource requirements of the largest branch.
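The single-entry-point pattern described above can be modeled on the host in plain C. This is only a sketch of the control flow, not IL and not CAL code: the `ConstBuffer` struct, the `kernel_id` field, and the two example code paths are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a constant buffer: one slot holds the selector. */
typedef struct {
    unsigned int kernel_id;   /* which code path "main" should take */
} ConstBuffer;

/* Two illustrative code paths that would otherwise be separate kernels. */
static float kernel_scale(float x)  { return x * 2.0f; }
static float kernel_offset(float x) { return x + 1.0f; }

/* The single entry point, playing the role of "main" in the IL image:
 * it reads the selector from the constant buffer and branches on it. */
static float main_entry(const ConstBuffer *cb, float x)
{
    assert(cb != NULL);
    switch (cb->kernel_id) {
    case 0:  return kernel_scale(x);
    case 1:  return kernel_offset(x);
    default: return x;        /* unknown selector: pass the value through */
    }
}
```

In this model, switching kernels between runs means writing a different `kernel_id` into the buffer before the next launch; every path is still compiled into the one image.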
          • Multiple IL (compute) kernels to execute.


            Does this mean kernels cannot be lined up in one submission, and there is no way for LDS to be shared between kernels? Correct?


              • Multiple IL (compute) kernels to execute.
                This is incorrect: they can be shared if they fit within a single command buffer via the calCtxRunProgramGridArray API call. The array version is similar to running multiple calCtxRunProgramGrid calls, except that it guarantees that LDS/SR data is persistent between kernel executions.
                  • Multiple IL (compute) kernels to execute.






                    calCtxRunProgramGridArray is declared as
                    typedef CALresult (CALAPIENTRYP PFNCALCTXRUNPROGRAMGRIDARRAY)(CALevent* event,
                                                                      CALcontext ctx,
                                                                      CALprogramGridArray* pGridArray);

                    where CALprogramGridArray is defined as:

                    typedef struct CALprogramGridArrayRec {
                        CALprogramGrid* gridArray; /**< array of programGrid structures */
                        CALuint     num;           /**< number of entries in the grid array */
                        CALuint     flags;         /**< misc grid array flags */
                    } CALprogramGridArray;

                    and the struct that contains the entry point of the kernel to execute is:
                    /** CAL computational grid */
                    typedef struct CALprogramGridRec {
                        CALfunc     func;          /**< CALfunc to execute */
                        CALdomain3D gridBlock;     /**< size of a block of data */
                        CALdomain3D gridSize;      /**< size of 'blocks' to execute. */
                        CALuint     flags;         /**< misc grid flags */
                    } CALprogramGrid;

                    func is the "main",
                    gridBlock == dcl_num_thread_block
                    gridSize == pixels_to_handle/gridBlock.
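A minimal sketch of how two grid entries might be filled in following the field meanings above. Because cal.h is not assumed to be available, the `Sketch*` types below are local stand-ins that only mirror the fields quoted in this post; the `width/height/depth` layout of the 3D domain, the 1-D launch shape, the rounding up of the block count, and the function-handle values are all assumptions made for illustration.

```c
#include <assert.h>

typedef unsigned int SketchUint;

/* Assumed layout for a 3D domain (stand-in for CALdomain3D). */
typedef struct { SketchUint width, height, depth; } SketchDomain3D;

/* Mirrors the CALprogramGrid fields quoted above. */
typedef struct {
    SketchUint     func;       /* handle of the function to execute */
    SketchDomain3D gridBlock;  /* threads per block */
    SketchDomain3D gridSize;   /* number of blocks */
    SketchUint     flags;
} SketchProgramGrid;

/* Mirrors the CALprogramGridArray fields quoted above. */
typedef struct {
    SketchProgramGrid *gridArray;
    SketchUint         num;
    SketchUint         flags;
} SketchProgramGridArray;

/* Blocks needed to cover `elements`, rounding up so a partial block is kept. */
static SketchUint blocks_needed(SketchUint elements, SketchUint block)
{
    assert(block > 0);
    return (elements + block - 1) / block;
}

/* Describe a 1-D launch of `elements` work items for one function handle. */
static SketchProgramGrid make_grid(SketchUint func, SketchUint elements,
                                   SketchUint threads_per_block)
{
    SketchProgramGrid g;
    g.func = func;
    g.gridBlock.width  = threads_per_block;
    g.gridBlock.height = 1;
    g.gridBlock.depth  = 1;
    g.gridSize.width   = blocks_needed(elements, threads_per_block);
    g.gridSize.height  = 1;
    g.gridSize.depth   = 1;
    g.flags = 0;
    return g;
}

/* Bundle several grid descriptions into one array-style submission record. */
static SketchProgramGridArray make_grid_array(SketchProgramGrid *grids,
                                              SketchUint count)
{
    SketchProgramGridArray a;
    a.gridArray = grids;
    a.num       = count;
    a.flags     = 0;
    return a;
}
```

With the real headers, the equivalent structures would be passed to calCtxRunProgramGridArray; where the two distinct `func` handles would come from is exactly the question raised in this thread.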

                    To run 2 kernels within one calCtxRunProgramGridArray call, the func names must be different, which is not possible as stated in the previous message.
                    If the recommendation is to update the const buffer and use a const value as the branch condition, I cannot see how the const buffer can be updated: CALprogramGridArray has a pointer to 2 grids, with no way of updating const buffers between them.
                    How can the const buffer be updated, then?









                    • Multiple IL (compute) kernels to execute.

                      Dear Micah,

                      Please confirm that calModuleLoad can load more than one image into a single context, so that more than one module is associated with the same context.

                      This would explain how it works with calCtxRunProgramGridArray, as the "main"s from different modules could be set in the func parameter.