2 Replies Latest reply on Jul 15, 2009 3:25 PM by ryta1203

    input&output question


      Hi everyone:

      I have a question about the maximum number of kernel inputs and outputs. The user guide states that a kernel can have at most 128 inputs and 8 outputs. My program, however, needs 9 outputs. The kernel is defined like this:

      kernel void
      float t,
      float nx, float ny, float nz4,
      float dx, float dy, float dz,
      float K_Gx_a[],  float K_Gx_b[],  float K_Ex_a[],  float K_Ex_b[],  float  K_Ex_c[],  float K_Ex_d[],
      float K_Gy_a[],  float K_Gy_b[],  float K_Ey_a[],  float K_Ey_b[],  float K_Ey_c[],  float K_Ey_d[],
      float K_Gz_a[],  float K_Gz_b[],  float K_Ez_a[],  float K_Ez_b[],  float K_Ez_c[],  float K_Ez_d[],
      float4 HX[][], float4 HY[][], float4 HZ[][],
      float4 FX0<>, float4 FY0<>, float4 FZ0<>,
      float4 GX0<>, float4 GY0<>, float4 GZ0<>,
      float4 EX0<>, float4 EY0<>, float4 EZ0<>,
      float4 IND[][], float K_A[], float K_B[],
      out float4 EX<>, out float4 EY<>, out float4 EZ<>,
      out float4 FX<>, out float4 FY<>, out float4 FZ<>, out float4 GX<>, out float4 GY<>, out float4 GZ<>;

      Is there any way to solve this problem? Is there any global memory that could be used, so that FX..., GX... (those 6 outputs) do not have to be declared as outputs but can just be global variables? Or any other tips?

      My card is an AMD HD4850, 635/1986 MHz.

      THX for your help!

        • input&output question

          A kernel with more than 8 output streams still runs fine, just with some performance overhead.


          The Brook compiler generates multipass code, and the runtime handles it properly when a kernel has more than 8 output streams.


          Example: if a kernel has 10 outputs, the compiler generates two kernels, the first with 8 outputs and the second with 2. The kernel code is copied as-is into both kernels, which is an overhead if your kernel does a lot of computation.
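
          To make the splitting concrete, here is a small plain-C sketch of the arithmetic. It is not Brook code and not the compiler's actual logic. It assumes that each pass is filled up to the 8-output limit before the next one starts, which matches the 8 + 2 example above. The names `MAX_OUTPUTS_PER_PASS`, `passes_needed` and `outputs_in_pass` are made up for illustration.

```c
#include <assert.h>

/* Output-stream limit per generated kernel, as quoted in this thread. */
#define MAX_OUTPUTS_PER_PASS 8

/* Hypothetical helper: how many kernels (passes) a kernel with
 * `outputs` output streams would be split into (ceiling division). */
int passes_needed(int outputs)
{
    assert(outputs >= 0);
    return (outputs + MAX_OUTPUTS_PER_PASS - 1) / MAX_OUTPUTS_PER_PASS;
}

/* Hypothetical helper: number of output streams written by pass
 * `pass` (0-based), assuming passes are filled in order. */
int outputs_in_pass(int outputs, int pass)
{
    int remaining = outputs - pass * MAX_OUTPUTS_PER_PASS;
    if (remaining <= 0)
        return 0;   /* this pass is not needed */
    return remaining < MAX_OUTPUTS_PER_PASS ? remaining : MAX_OUTPUTS_PER_PASS;
}
```

          For the 9-output kernel in the question, `passes_needed(9)` gives 2, with 8 outputs in the first pass and 1 in the second. Since the kernel body is copied into every pass, the computation would run twice per element under this assumption.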