2 Replies Latest reply on Jul 9, 2008 8:43 AM by xiaohema

    Test the instruction ushr

    xiaohema
      The result of ushr may be incorrect; I don't know why.

      Hi all,

           I am studying AMD Compute Abstraction Layer (CAL) programming, and I plan to develop some general-purpose programs for ATI GPUs. The graphics card I am using is a Sapphire Toxic HD 3870. The OS is 32-bit Windows XP. The CAL package is cal_brook_1.00.2_beta_xp32.zip, which I downloaded from the AMD website. I have tested many instructions against the Intermediate Language (IL) Reference Manual.

          When I tested the ushr instruction, I found:
      1. The usage of the ushr instruction is "ushr dst, src0, src1" (page 92 of il.pdf).
      2. When the most significant bit of src0 is zero, the result is correct.
      3. When the most significant bit of src0 is one, the result is incorrect.
      Example: dst = r0
      src0 = 0x88000000, 0x78000000, 0x80000000, 0x70000000
      src1 = 1, 1, 1, 1
      The result is: 0x04000000, 0x3c000000, 0x00000000, 0x38000000
      I think the correct result is:
      0x44000000, 0x3c000000, 0x40000000, 0x38000000
      4. I tested many times with other formats and found the same problem.
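      For reference, the expected values can be checked on the CPU. The following is only a minimal Python sketch of a 32-bit logical (unsigned) right shift, not the GPU implementation; the helper name ushr32 and the sign_cleared comparison are illustrative assumptions, not anything from the IL manual. It also shows that the four observed values happen to equal the result of clearing bit 31 before shifting. That is only an observation about these numbers and does not identify the cause.

```python
MASK32 = 0xFFFFFFFF  # keep values within 32 bits


def ushr32(value, shift):
    """Logical right shift of a 32-bit value: zeros are shifted in from the left.

    Hypothetical CPU-side helper for comparison; not part of CAL or IL.
    """
    return (value & MASK32) >> shift


# Inputs and GPU results as reported in the post above.
src0 = [0x88000000, 0x78000000, 0x80000000, 0x70000000]
src1 = [1, 1, 1, 1]
observed = [0x04000000, 0x3C000000, 0x00000000, 0x38000000]

# What an unsigned shift should produce for each component.
expected = [ushr32(v, s) for v, s in zip(src0, src1)]

# Assumption for comparison only: the same shift applied after clearing bit 31.
sign_cleared = [ushr32(v & 0x7FFFFFFF, s) for v, s in zip(src0, src1)]

for v, e, o, c in zip(src0, expected, observed, sign_cleared):
    print(f"{v:#010x}: expected {e:#010x}, observed {o:#010x}, "
          f"bit 31 cleared first {c:#010x}")
```

      Running this on a few more inputs with bit 31 set would show whether the pattern holds beyond these four values.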
          I contacted the development center a month ago, but I was not satisfied with their answer. I hope somebody can help me test the ushr instruction. Thanks a lot.

          There may be a bug in the ushr instruction, or maybe I have not found the correct way to use it. I want to resolve this problem.

        • Test the instruction ushr
          sgratton

          Hi there,

          I've just tested the ushr instruction on my system (64-bit Linux, package cal_brook_1.01.0_beta_lnx64.zip, calGetVersion returning 1.1.1) and it seems to work fine, returning the values you hope for with the inputs you suggest. To keep the instruction from being optimized away (which I checked in the output of calclDisassembleObject), I had to pass the values through constant buffers rather than use literal registers in the IL code. I don't know how the optimized values get generated in the latter case: perhaps there could be an issue there if that is what you've been trying (perhaps involving your C++ compiler?).

          Best,
          Steven.