1,858 Replies. Latest reply on Dec 10, 2017 4:00 PM by kertp
      • 720. Re: gcc segmentation faults on Ryzen / Linux
        shmerl

        Can anyone with MCE freezes / reboots post a typical thermal graph? You can do it, for example, using Ksysguard in KDE (by adding a sensor for CPU temperature).

         

        Mine looks like this:

        [Image: CPU temperature graph, vH4Ha3F.png]

        Note the periodic spikes in CPU temperature to around +40°C, which the fans gradually smooth out. I'm not sure whether this should affect MCEs though; the system looks pretty well cooled to me.

        • 721. Re: gcc segmentation faults on Ryzen / Linux
          xtronom

          The 4.12.5 kernel still crashes for me on Slackware. One machine was on BIOS defaults running the mesa test, it finished with 8/10 successful builds. Second machine was running libdrm test on 2 cores and first build crashed after 34 minutes. The numbers look typical.

          • 722. Re: gcc segmentation faults on Ryzen / Linux
            in2tactics

            After reading all 49 pages of this thread, I figured I should add my own experience. I have a Ryzen 7 1700 running at stock frequency that has been exhibiting erratic behavior. I have been getting MCE errors and random reboots while using Windows 10. Additionally, when I installed Debian 9.1, I was able to produce segfaults using the Phoronix Test Suite with the same settings Michael Larabel used over at Phoronix to confirm the segfault issue. I have also noticed that my particular processor is very sensitive to memory speed and timings, as I am unable to run memory at the speeds specified by my motherboard's memory QVL. I was able to find a stable memory configuration that consistently passes MemTest86 overnight and Stressful App Test with no errors, but the memory runs below what I expected based on the QVL. At this point, I am reasonably certain that my remaining issues are simply due to a particularly flaky piece of silicon, as adjusting the CPU and SoC voltages in accordance with some of the previous posts had no effect. I have not yet checked which production week my processor is from, but I will document that at some point.

             

            System Specifications:

            Ryzen 7 1700 @ stock frequency

            Cooler Master Hyper 212 EVO

            ASUS PRIME X370-PRO with BIOS 0805 AGESA 1.0.0.6a

            G.SKILL TridentZ F4-3000C15D-32GTZ @ 2800MHz

            • 723. Re: gcc segmentation faults on Ryzen / Linux
              fujii

              I got the second RMA'd processor. As mcl00 already reported in #642, it has stepping 1 and UA 1725SUS printed on the processor while my original one has UA 1707PGT and the first RMA'd one has UA 1716PGT.

              I tested for more than 13 hours last night with stock BIOS settings and no workarounds (that is, SMT enabled, uOP cache enabled, ASLR enabled), and got no segfaults.

              • 724. Re: gcc segmentation faults on Ryzen / Linux
                udamanfunks

                UA1725SUS

                 

                17 is supposed to be the YEAR (2017)

                25 is supposed to be the WEEK (so this means end of June).
                SUS is supposed to be where it was packaged

                 

                Please post the UA numbers that the rest of you guys get.  Hopefully, we can find out what the cutover date is when this was fixed in the fabs.
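For comparing the codes people post, the reading described above (two digits of year, two digits of week, three trailing letters) can be sketched as a small parser. This is only an illustration of that reading, not an official AMD decoding: the function name `parse_date_code` and the `DateCode` type are made up for this example, and mapping the week number to a calendar date assumes ISO 8601 week numbering, which nobody in the thread has confirmed.

```python
import re
from datetime import date
from typing import NamedTuple


class DateCode(NamedTuple):
    year: int         # four-digit year, e.g. 2017
    week: int         # week number within that year
    suffix: str       # trailing letters, e.g. "SUS" or "PGT"
    week_start: date  # Monday of that week (assumes ISO week numbering)


# "UA", optional whitespace, two-digit year, two-digit week, three letters.
_PATTERN = re.compile(r"^UA\s*(\d{2})(\d{2})([A-Z]{3})$")


def parse_date_code(text: str) -> DateCode:
    """Split a marking such as 'UA 1725SUS' into year, week and suffix."""
    match = _PATTERN.match(text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized date code: {text!r}")
    year = 2000 + int(match.group(1))
    week = int(match.group(2))
    # Assumption: ISO 8601 weeks. Raises ValueError for an impossible week.
    week_start = date.fromisocalendar(year, week, 1)
    return DateCode(year, week, match.group(3), week_start)


if __name__ == "__main__":
    # Codes reported earlier in this thread.
    for code in ["UA 1707PGT", "UA 1716PGT", "UA 1725SUS"]:
        print(code, "->", parse_date_code(code))
```

Sorting the parsed codes by `(year, week)` and marking which ones segfault would give a rough bound on the cutover week, under the same assumptions.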

                • 725. Re: gcc segmentation faults on Ryzen / Linux
                  supercom32

                  @fujii: That's great news. Congrats on getting a working CPU! If this is indeed a silicon issue, then there's no need to keep poking around randomly!

                   

                   

                  • 726. Re: gcc segmentation faults on Ryzen / Linux
                    knutjbj

                    Is there any update on this problem?

                    • 727. Re: gcc segmentation faults on Ryzen / Linux
                      apache14

                      Hi all,

                       

                      How long does AMD normally take to reply to service requests? It's been 2 days for me (I got the confirmation email saying that I had raised an issue) but nothing since. Is there a site I can log in to to track the status?

                       

                      Also, my current R7 1700 is UA 1707SUT and suffers from the segfaults and the kernel-logged MCEs.

                      • 728. Re: gcc segmentation faults on Ryzen / Linux
                        Deluxe

                        It took 6 days between posting service request and first reply from tech support guy in my case.

                        • 729. Re: gcc segmentation faults on Ryzen / Linux
                          apache14

                          Ahh cheers. At least that gives AMD time to sort out a generic solution for everyone who is having these issues.

                           

                          I'd imagine that it will just be a replacement chip (as long as they know the newer chips are not affected)

                          • 730. Re: gcc segmentation faults on Ryzen / Linux
                            oleyska

                            Do you have to send in the CPU first?

                            I'd rather not lose my personal rig for x amount of time.

                            • 731. Re: gcc segmentation faults on Ryzen / Linux
                              zombie28

                              This sounds interesting. Could you repeat your tests while underclocking your CPU (if there is such an option in your BIOS)?

                              • 732. Re: gcc segmentation faults on Ryzen / Linux
                                bradc

                                ryzennewbie wrote:

                                 

                                I've created and uploaded the USB image to:

                                https://ufile.io/h1r14

                                 

                                Thanks for providing this. I have no experience with *BSD, so I downloaded this and I'm running it at the moment.

                                ryzen_provoke_freeze just reboots the machine when it gets to 0x...40, same as everyone else's.

                                 

                                This might be an interesting data point though :

                                I've been running ryzen_stress_test now for over 11 hours and not had a failure. A looping kernel compile in Linux will fail at least once an hour on the same box.

                                 

                                I'll let it run for another day or so and see if it breaks.
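For anyone who wants to collect comparable numbers, a looping build like the one mentioned above can be sketched as a small harness that reruns a command and records how each run ended. This is a minimal sketch, not the script used in this thread: the name `run_build_loop` is invented for the example, and the build command itself (for instance a kernel `make` invocation) is left to the caller.

```python
import subprocess
import sys
import time


def run_build_loop(command, iterations, log=print):
    """Run `command` `iterations` times and return the list of return codes.

    On POSIX systems subprocess reports a negative return code -N when the
    child was terminated by signal N, which is how a crash in the top-level
    process shows up here.
    """
    codes = []
    for _ in range(iterations):
        stamp = time.strftime("%a %b %d %H:%M:%S %Y")
        completed = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        code = completed.returncode
        if code == 0:
            outcome = "success"
        elif code < 0:
            outcome = f"killed by signal {-code}"
        else:
            outcome = f"failed (exit status {code})"
        log(f"[{stamp}] building... {outcome}")
        codes.append(code)
    return codes


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace with a real build.
    run_build_loop([sys.executable, "-c", "pass"], iterations=3)
```

Note that a compiler segfault deep inside a `make` tree normally surfaces as a non-zero exit status of `make` itself rather than as a signal on the top-level process, so the build log still has to be checked for the actual error.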

                                • 733. Re: gcc segmentation faults on Ryzen / Linux
                                  ryzennewbie

                                  Thank you very much for trying out that image.

                                   

                                  That freeze during "ryzen_provoke_freeze.sh" is expected - that script pins the program "ryzen_provoke_freeze" to core 0, which seems to be mainly responsible for interrupt handling and therefore "dies" first. If you run the program "ryzen_provoke_freeze" directly, so that it rotates through all cores, you may get lucky and it will run through. This behaviour is now circumvented by increasing the "safe zone" towards the top of the memory - see [base] Revision 321899
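For Linux users who want to try the same "pin to one core" experiment, the idea can be sketched with Python's `os.sched_setaffinity`, which is only available on Linux. This is not how the FreeBSD script does it; `run_pinned` is a hypothetical helper written for this illustration, and the command to run is whatever test program you want to confine to one core.

```python
import os
import subprocess
import sys


def run_pinned(command, cpu):
    """Run `command` restricted to a single CPU and return its exit code.

    Linux only: relies on os.sched_setaffinity. The affinity mask is
    inherited by child processes, so it is set on this process, the child
    is started, and the original mask is restored afterwards.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is not available here")
    original = os.sched_getaffinity(0)
    os.sched_setaffinity(0, {cpu})
    try:
        return subprocess.run(command).returncode
    finally:
        os.sched_setaffinity(0, original)


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    # Pick a CPU this process is actually allowed to run on.
    first_cpu = min(os.sched_getaffinity(0))
    show_mask = "import os; print(sorted(os.sched_getaffinity(0)))"
    run_pinned([sys.executable, "-c", show_mask], first_cpu)
```

Looping over every CPU in `os.sched_getaffinity(0)` and running the same test on each would show whether one particular core fails more often than the others.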

                                   

                                  Getting failures during "ryzen_stress_test" is a hard one, I know - for the FreeBSD devs as well; at the moment, I cannot reproduce that myself even after running for 24h. Furthermore, you won't get any segfaults there, only "unable to rename" errors, where some object files suddenly disappear. It's still not clear what causes this.

                                   

                                  I had good results compiling "ghc" from the ports tree: first eight failures, then successes, as if it needs some warm-up time to succeed:

                                  -------------------------------------------------------------------------------------------------------------------------------

                                  root@capetown2:/root/#cat nohup.out

                                  umount: /tmp/ports.ghc: not a file system root directory

                                  [Wed Aug  9 13:09:09 CEST 2017] building... failed

                                  [Wed Aug  9 13:09:41 CEST 2017] building... failed

                                  [Wed Aug  9 13:10:11 CEST 2017] building... failed

                                  [Wed Aug  9 13:10:41 CEST 2017] building... failed

                                  [Wed Aug  9 13:11:11 CEST 2017] building... failed

                                  [Wed Aug  9 13:11:41 CEST 2017] building... failed

                                  [Wed Aug  9 13:12:29 CEST 2017] building... failed

                                  [Wed Aug  9 13:13:21 CEST 2017] building... failed

                                  [Wed Aug  9 13:14:38 CEST 2017] building... success

                                  [Wed Aug  9 13:43:26 CEST 2017] building... success

                                  [Wed Aug  9 14:12:19 CEST 2017] building... success

                                  [Wed Aug  9 14:41:16 CEST 2017] building... success

                                  [Wed Aug  9 15:10:08 CEST 2017] building...

                                   

                                  root@capetown2:/root/work/src/#grep exited /var/log/messages

                                  Aug  9 09:21:00 capetown kernel: pid 59222 (doxygen), uid 0: exited on signal 6 (core dumped)

                                  Aug  9 09:40:32 capetown kernel: pid 60176 (doxygen), uid 0: exited on signal 6 (core dumped)

                                  Aug  9 13:09:41 capetown kernel: pid 6871 (ghc), uid 0: exited on signal 10

                                  Aug  9 13:10:11 capetown kernel: pid 11481 (ghc), uid 0: exited on signal 10

                                  Aug  9 13:10:41 capetown kernel: pid 16079 (ghc), uid 0: exited on signal 10

                                  Aug  9 13:11:11 capetown kernel: pid 20689 (ghc), uid 0: exited on signal 10

                                  Aug  9 13:11:41 capetown kernel: pid 25287 (ghc), uid 0: exited on signal 10

                                  Aug  9 13:12:29 capetown kernel: pid 29885 (ghc), uid 0: exited on signal 10

                                  Aug  9 13:13:22 capetown kernel: pid 34539 (ghc), uid 0: exited on signal 10

                                  Aug  9 13:14:38 capetown kernel: pid 39195 (ghc), uid 0: exited on signal 10

                                  -------------------------------------------------------------------------------------------------------------------------------

                                  but that requires a full-fledged FreeBSD installation that cannot be done on a USB drive easily - at least, I can't do that easily.
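To compare runs, the "exited on signal" lines from /var/log/messages can be tallied per program and signal. The sketch below does that for lines in the format shown above; the function name `count_signal_exits` is invented for this example. For reference, on FreeBSD signal 10 is SIGBUS and signal 6 is SIGABRT (on Linux the number 10 means SIGUSR1 instead), so the ghc crashes above are bus errors rather than segmentation faults.

```python
import re
from collections import Counter

# Matches FreeBSD kernel log lines such as:
#   "... kernel: pid 6871 (ghc), uid 0: exited on signal 10"
_EXIT_LINE = re.compile(r"pid \d+ \(([^)]+)\), uid \d+: exited on signal (\d+)")


def count_signal_exits(lines):
    """Return a Counter keyed by (program name, signal number)."""
    counts = Counter()
    for line in lines:
        match = _EXIT_LINE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    # A few lines copied from the log excerpt above.
    sample = [
        "Aug  9 09:21:00 capetown kernel: pid 59222 (doxygen), uid 0: exited on signal 6 (core dumped)",
        "Aug  9 13:09:41 capetown kernel: pid 6871 (ghc), uid 0: exited on signal 10",
        "Aug  9 13:10:11 capetown kernel: pid 11481 (ghc), uid 0: exited on signal 10",
    ]
    for (program, signum), n in sorted(count_signal_exits(sample).items()):
        print(f"{program}: signal {signum} x {n}")
```

To run it against a real log, pass an open file object instead of the sample list, for example `count_signal_exits(open("/var/log/messages"))`.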

                                   

                                  So, I'm now trying to fiddle around with compiling GCC, MESA directly and without the ports tree...

                                   

                                   

                                  Thanks again for testing...

                                  • 734. Re: gcc segmentation faults on Ryzen / Linux
                                    shmerl

                                    bradc wrote:

                                     

                                    Thanks for providing this. I have no experience with *BSD, so I downloaded this and I'm running it at the moment.

                                    ryzen_provoke_freeze just reboots the machine when it gets to 0x...40, same as everyone else's.

                                     

                                    Is that freeze different from the MCE freezes caused by waking up from C-state sleep, or is it the same thing? And if it is the same, can Linux kernel developers work around it similarly?
