2 Replies Latest reply on Nov 21, 2008 1:16 PM by jfkong

    A question about the IL code in the hellocal sample


      const CALchar* ILKernel =
      "dcl_input_position_interp(linear_noperspective) vWinCoord0.xy__\n"
      "dcl_output_generic o0\n"
      "dcl_cb cb0[1]\n"
      "sample_resource(0)_sampler(0) r0, vWinCoord0.xyxx\n"
      "div r0, r0, cb0[0].x\n"
      "mul o0, r0, cb0[0]\n"


      What I can't understand about this code is the following:

      There are four components in r0, and therefore four values in o0.

      But the hellocal example works on a 256x256 buffer of CAL_FORMAT_FLOAT_1.

      The memory resource declared is 2D FORMAT_FLOAT_4.

      There are 256x256 threads, and each thread writes 4 floats.

      Why is the final result 256x256 floats instead of 256x256x4 floats?
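
      To make the counting concrete, here is a small C sketch of the arithmetic behind the question. It is not taken from the sample: `float_count` and `store_element` are hypothetical helpers, and `store_element` encodes one possible reading (an assumption, not something the sample confirms), namely that only as many components of the four-component register as the element format has channels end up in memory.

```c
#include <stddef.h>

/* A register such as r0 or o0 always carries four components (x, y, z, w). */
enum { REG_COMPONENTS = 4 };

/* Hypothetical helper: number of floats in a width x height domain
 * when each element (or thread) contributes `components` floats. */
size_t float_count(size_t width, size_t height, size_t components)
{
    return width * height * components;
}

/* Hypothetical helper, illustrating an ASSUMPTION about the behaviour:
 * store a four-component register into one buffer element that has
 * `channels` channels (1 for a FLOAT_1 format, 4 for a FLOAT_4 format).
 * Only the first `channels` components are copied; the rest are dropped.
 * Returns the number of floats actually stored. */
size_t store_element(const float reg[REG_COMPONENTS], float *element,
                     size_t channels)
{
    size_t n = channels < REG_COMPONENTS ? channels : REG_COMPONENTS;
    for (size_t i = 0; i < n; ++i) {
        element[i] = reg[i];
    }
    return n;
}
```

      With these helpers, `float_count(256, 256, 4)` is the 256x256x4 figure I expected (262144 floats), while `float_count(256, 256, 1)` is the 256x256 figure that actually comes back (65536 floats). Under the assumption above, `store_element` with `channels` set to 1 would keep only the x component of o0, which would account for the difference.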