1 Reply Latest reply on Dec 16, 2009 10:00 AM by empty_knapsack

    Where are the new 5870 (Evergreen) instructions documented?

    neuralll
      CF_POPCNT gds_write sad4

      Hi guys. I started out working with NVIDIA cards, but I bought an ATI HD 5970 after reading on Beyond3D that it sports some interesting new instructions,

      POPCNT and SAD, and mainly because ATI has a 64 KB buffer for sharing data between all SIMDs, not just within a workgroup. It's called the Global Data Store (GDS), but now that I have bought the card I can't find anything about it in the docs or samples. My plan is to load a short string into the GDS just once and then have all SIMD units compare against that string in parallel. And no, I don't care that the GDS doesn't have a working locking mechanism, since I will only read, and only once per kernel. The idea is to save a zillion loads of the same data from slow memory when all threads compare against the same strings (chosen at runtime), and also to prevent cache pollution. I know there is constant memory for this, but the strings I compare against are calculated by the kernel itself. Or is there a way to output to constant memory, so that with 10,000 reads but only 1 write the constant-memory caching would still be more efficient than global memory?

      I also do something like a histogram, and the popcnt instruction is very useful there for quickly spotting the difference between two histograms.
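      To make the histogram use case concrete, here is a minimal CPU-side sketch in plain C of what I mean. It assumes each histogram is stored as a bitmask with one bit per bin (set if the bin is occupied), which is only one possible layout. The names popcount32 and bitmask_histogram_distance are made up for this illustration and are not from any ATI header; on the GPU the inner loop would be replaced by the hardware popcnt.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable population count: number of set bits in x.
   Hypothetical helper, standing in for a hardware popcnt instruction. */
static unsigned popcount32(uint32_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Assumed layout: one bit per histogram bin, packed into 32-bit words.
   Returns how many bins differ between the two bitmask histograms. */
static unsigned bitmask_histogram_distance(const uint32_t *a,
                                           const uint32_t *b,
                                           size_t words)
{
    unsigned diff = 0;
    for (size_t i = 0; i < words; i++) {
        diff += popcount32(a[i] ^ b[i]);   /* XOR marks differing bins */
    }
    return diff;
}
```

      With a single-instruction popcnt, each 32-bin word costs one XOR and one popcnt instead of a bit-clearing loop, which is why the instruction matters to me.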

      The only place I have found these instructions is as strings in the compiler, aticaldd.dll:

      CF_POPCNT

      gds_atomic_ordered_alloc
      gds_ushort_read_ret
      gds_short_read_ret
      gds_ubyte_read_ret
      gds_byte_read_ret
      gds_readwrite_ret
      gds_read2_ret
      gds_read_rel_ret
      gds_read_ret
      gds_cmp_xchg_spf_ret
      gds_cmp_xchg_ret
      gds_xchg2_ret
      gds_xchg_rel_ret
      gds_xchg_ret
      gds_mskor_ret
      gds_xor_ret
      gds_or_ret
      gds_and_ret
      gds_max_uint_ret
      gds_min_uint_ret
      gds_max_int_ret
      gds_min_int_ret
      gds_dec_ret
      gds_inc_ret
      gds_rsub_ret
      gds_sub_ret
      gds_add_ret
      gds_short_write
      gds_byte_write
      gds_cmp_store_spf
      gds_cmp_store
      gds_write2
      gds_write_rel
      gds_write
      gds_mskor
      gds_xor
      gds_or
      gds_and
      gds_max_uint
      gds_min_uint
      gds_max_int
      gds_min_int
      gds_dec
      gds_inc
      gds_rsub
      gds_sub
      gds_add

      GDS OP is for R800 up only\n
      gds_atomic_ordered_alloc

      sadhi
      sad4

      sad_accum_prev_uint
      sad_accum_uint
      sad_accum_hi_uint

      IL_OP_SAD
      IL_OP_SAD_HI
      IL_OP_SAD_4

      They look like ATI Close To Metal (CTM) things. Unlike the SAD instruction, the GDS CTM operations do not seem to be mapped to the IL language yet, so is the only way to use them right now to program everything in CTM? I know everything will be released at some point, but in the meantime the NVIDIA guys are pushing their Tesla 1070 at our marketing quite aggressively.
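      For anyone unsure what I expect sad4 to do, here is a reference sketch in plain C. It is an assumption on my part, since I have no documentation: I assume sad4 treats each 32-bit operand as four packed unsigned bytes, sums the absolute differences of the corresponding bytes, and adds the result to an accumulator. The function name sad4_ref and the three-operand form are hypothetical, not taken from the IL spec.

```c
#include <stdint.h>

/* Assumed semantics of a packed-byte SAD (not confirmed by any ATI doc):
   acc + sum over the four bytes of |a.byte[i] - b.byte[i]|. */
static uint32_t sad4_ref(uint32_t a, uint32_t b, uint32_t acc)
{
    for (int i = 0; i < 4; i++) {
        int x = (int)((a >> (8 * i)) & 0xFFu);   /* i-th byte of a */
        int y = (int)((b >> (8 * i)) & 0xFFu);   /* i-th byte of b */
        acc += (uint32_t)(x > y ? x - y : y - x);
    }
    return acc;
}
```

      If the real instruction differs (for example in how the accumulator or the "hi" variants work), I would like to see that spelled out in the docs too.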