1 Reply Latest reply on Oct 14, 2015 5:29 AM by mah

    Modifying MSR to disable the prefetcher

    mah

      Hello,

      On my AMD Opteron 6274 (family 15h), I modified an MSR to disable the HW prefetcher. According to the BKDG, I have to set bit 13 of MSRC001_1022 to 1, so I ran:

       

      [root@tiger exe]# wrmsr -a 0xc0011022 0x2000

      [root@tiger exe]# rdmsr -a -x -0 0xc0011022

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

      0000000000002000

       

       

       

       

       

       

      As you can see, bit 13 is now set on all 32 cores, so the prefetcher should be disabled. However, when I run the ocount command, I still see non-zero prefetcher counts.
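      One side note on the `wrmsr` call above: it writes the full 64-bit value 0x2000, so any other bits that were set in the register beforehand are cleared. A more cautious approach is read-modify-write. The following is only a minimal sketch, assuming Linux with the `msr` kernel module loaded (which exposes `/dev/cpu/N/msr`) and root privileges; the helper names `set_bit`, `read_msr`, `write_msr`, and `set_prefetch_disable` are hypothetical and not part of msr-tools.

```python
import os
import struct

# Register and bit taken from the post above; the names are illustrative only.
MSR_C001_1022 = 0xC0011022
HW_PREFETCH_DISABLE_BIT = 13  # 1 << 13 == 0x2000


def set_bit(value, bit):
    """Return value with the given bit set, leaving all other bits untouched."""
    return value | (1 << bit)


def read_msr(cpu, reg):
    """Read a 64-bit MSR through the Linux msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr device uses the register number as the file offset.
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def write_msr(cpu, reg, value):
    """Write a 64-bit MSR through the Linux msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_WRONLY)
    try:
        os.pwrite(fd, struct.pack("<Q", value), reg)
    finally:
        os.close(fd)


def set_prefetch_disable(cpu):
    """Set bit 13 on one core without clobbering the other bits."""
    old = read_msr(cpu, MSR_C001_1022)
    new = set_bit(old, HW_PREFETCH_DISABLE_BIT)
    if new != old:
        write_msr(cpu, MSR_C001_1022, new)
    return old, new
```

      Calling `set_prefetch_disable(cpu)` for each core from 0 to 31 would mirror the `-a` flag used above, while preserving whatever else was already configured in the register.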

       

      [root@tiger exe]# ocount -e CPU_CLK_UNHALTED,RETIRED_INSTRUCTIONS,DATA_CACHE_ACCESSES,DATA_CACHE_MISSES,DATA_PREFETCHER,PREFETCH_INSTRUCTIONS_DISPATCHED,REQUESTS_TO_L2,L2_CACHE_MISS,L2_PREFETCHER_TRIGGER ./bzip2_base.amd64-m64-gcc44-nn

      spec_init
      Loading Input Data
      Duplicating 13329296 bytes
      Input data 67108864 bytes in length
      Compressing Input Data, level 5
      Compressed data 15115419 bytes in length
      Uncompressing Data
      Uncompressed data 67108864 bytes in length
      Uncompressed data compared correctly
      Compressing Input Data, level 7
      Compressed data 14615506 bytes in length
      Uncompressing Data
      Uncompressed data 67108864 bytes in length
      Uncompressed data compared correctly
      Compressing Input Data, level 9
      Compressed data 14448493 bytes in length
      Uncompressing Data
      Uncompressed data 67108864 bytes in length
      Uncompressed data compared correctly
      Tested 64MB buffer: OK!

       

      Events were actively counted for 35.7 seconds.

      Event counts (scaled) for /home/mahmood/spec-cpu2006-x86_64/exe/bzip2_base.amd64-m64-gcc44-nn:

              Event                                    Count                    % time counted
              CPU_CLK_UNHALTED                         107,255,826,409          55.54
              DATA_CACHE_ACCESSES                      55,980,588,486           66.67
              DATA_CACHE_MISSES                        1,713,225,386            66.65
              DATA_PREFETCHER                          1,069,985,468            66.66
              L2_CACHE_MISS                            243,353,144              66.67
              L2_PREFETCHER_TRIGGER                    239,087,660              55.57
              PREFETCH_INSTRUCTIONS_DISPATCHED         78,746                   66.68
              REQUESTS_TO_L2                           2,793,780,634            44.45
              RETIRED_INSTRUCTIONS                     108,770,366,847          55.56
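      To put these counts in context, the table can be parsed and a few ratios derived. This is just a rough sketch: the function name `parse_ocount` and the regular expression are my own assumptions about the layout shown above, not an official ocount format. Since the counts are scaled (each event was counted for only part of the run), the ratios are estimates.

```python
import re

# The event table exactly as printed by ocount above.
OCOUNT_TABLE = """\
Event                                    Count                    % time counted
CPU_CLK_UNHALTED                         107,255,826,409          55.54
DATA_CACHE_ACCESSES                      55,980,588,486           66.67
DATA_CACHE_MISSES                        1,713,225,386            66.65
DATA_PREFETCHER                          1,069,985,468            66.66
L2_CACHE_MISS                            243,353,144              66.67
L2_PREFETCHER_TRIGGER                    239,087,660              55.57
PREFETCH_INSTRUCTIONS_DISPATCHED         78,746                   66.68
REQUESTS_TO_L2                           2,793,780,634            44.45
RETIRED_INSTRUCTIONS                     108,770,366,847          55.56
"""

# Upper-case event name, comma-grouped count, percentage of time counted.
ROW = re.compile(r"^\s*([A-Z][A-Z0-9_]*)\s+([\d,]+)\s+([\d.]+)\s*$")


def parse_ocount(text):
    """Map event name -> integer count; the header line is skipped."""
    counts = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2).replace(",", ""))
    return counts


counts = parse_ocount(OCOUNT_TABLE)
ipc = counts["RETIRED_INSTRUCTIONS"] / counts["CPU_CLK_UNHALTED"]
l1_miss_rate = counts["DATA_CACHE_MISSES"] / counts["DATA_CACHE_ACCESSES"]
prefetch_ratio = counts["DATA_PREFETCHER"] / counts["DATA_CACHE_ACCESSES"]

if __name__ == "__main__":
    print(f"IPC (estimate):                      {ipc:.3f}")
    print(f"L1 data cache miss rate (estimate):  {l1_miss_rate:.4f}")
    print(f"DATA_PREFETCHER per cache access:    {prefetch_ratio:.4f}")
```

      The DATA_PREFETCHER count is of the same order as DATA_CACHE_MISSES, so it is clearly not just noise.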

       

       

       

       

      Why are the prefetcher stats non-zero?

        • Re: Modifying MSR to disable the prefetcher
          mah

          Here are my findings on the AMD prefetcher.

           

          According to the BKDG, there are two MSRs for that:

           

          1) The MSR on page 591, MSRC001_102B Combined Unit Configuration 3 (CU_CFG3). Bit 18 is described as “PfcDis. Read-write. Reset: 0. 1=Prefetcher disabled”.
          The reset value is 0, so I set the bit to 1. Now the prefetch counters read zero:

           

           

          Performance counter stats for './bzip2_base.amd64-m64-gcc44-nn':

           

              55,860,447,518 L1-dcache-loads:uk
                           0 L1-dcache-prefetches:uk
                           0 L1-dcache-prefetch-misses:uk

           

            36.372604375 seconds time elapsed

           

           

           

           

           

          2) MSRC001_1022 which I described in the previous post.
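          For reference, the two bits mentioned above correspond to the masks 0x2000 (bit 13) and 0x40000 (bit 18). The small sketch below computes those masks and checks whether a value read back with `rdmsr -x` has the bit set. The dictionary `DISABLE_BITS` and the helpers `mask_for` and `bit_is_set` are hypothetical names introduced only for illustration; the register and bit numbers are the ones quoted in this thread.

```python
# Register -> prefetcher-disable bit, as quoted from the BKDG in this thread.
DISABLE_BITS = {
    "MSRC001_1022": 13,
    "MSRC001_102B": 18,  # CU_CFG3, PfcDis
}


def mask_for(register):
    """Return the bit mask that sets the disable bit of the given register."""
    return 1 << DISABLE_BITS[register]


def bit_is_set(rdmsr_hex, bit):
    """Check one bit in a hex string such as a line printed by `rdmsr -x`."""
    return (int(rdmsr_hex, 16) >> bit) & 1 == 1


def all_cores_set(rdmsr_lines, bit):
    """True if every non-empty readback line (one per core) has the bit set."""
    values = [line.strip() for line in rdmsr_lines if line.strip()]
    return bool(values) and all(bit_is_set(value, bit) for value in values)


if __name__ == "__main__":
    for register in DISABLE_BITS:
        print(register, hex(mask_for(register)))
    # Readback from the first post: 32 identical lines of 0000000000002000.
    readback = ["0000000000002000"] * 32
    print(all_cores_set(readback, DISABLE_BITS["MSRC001_1022"]))
```

          Note that a value of 0x2000 has bit 13 set but not bit 18, so the readback in the first post says nothing about the state of MSRC001_102B.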

           

          I still have one question: what is the difference between these two MSRs? I think the first one (MSRC001_102B) is related to the L2 prefetcher, but I am not sure. Thanks for any reply.