440 Replies, latest reply on Jul 28, 2017 2:59 AM by vitkor
      • 15. Re: gcc segmentation faults on Ryzen / Linux
        alfonsor

        I finally found a way to make parallel compilation work (here): enabling LLC (Load Line Calibration) in the BIOS. But it makes me very worried...

        • 16. Re: gcc segmentation faults on Ryzen / Linux
          amdmatt

          Enabling low levels of LLC (Levels 1-2) is not dangerous; it just ensures less voltage droop when the processor is under heavy load.

          • 17. Re: gcc segmentation faults on Ryzen / Linux
            maxrussell

            Same problem here on CentOS (I'll try the LLC option tomorrow).

            No problem on Win10.

            • 18. Re: gcc segmentation faults on Ryzen / Linux
              amdmatt

              Hi Folks,

               

              I appreciate your patience and I have some suggestions you can try.

               

              In the Asus BIOS there is an option called OPCache Control. Disabling this may resolve this issue.

               

              Another suggestion is to try disabling SMT. Look for an option in the BIOS called 'Disable SMT'.

               

              Please try one or both of the suggestions above, depending on which motherboard you have, and let me know how you get on.

              • 19. Re: gcc segmentation faults on Ryzen / Linux
                alfonsor

                Not all BIOSes have the OPCache option; for example, my Gigabyte K7 doesn't. Disabling SMT alleviates the problem, but it doesn't solve it. The same goes for the LLC setting I mentioned, which greatly reduces the segfaults, but they are still present.

                 

                The problem is not the time needed to solve the issue ("patience"), but whether there will ever be a solution. I am developing a sense of "this is how it is".

                • 20. Re: gcc segmentation faults on Ryzen / Linux
                  dryatu

                  This is almost word for word what I'm seeing.

                   

                  Disabling SMT didn't remove the problem and there's no option to disable OPCache on MSI Mortar Arctic. Setting the LLC to 5/8 didn't help either.

                  • 21. Re: gcc segmentation faults on Ryzen / Linux
                    sat

                    > In the Asus BIOS there is an option called called OPCache Control. Disabling this may resolve this issue.

                    > Another suggestion is to try disabling SMT. Look for an option in the Bios called 'Disable SMT'.

                     

                    Thank you for sharing this information. However, as I said on May 23, 2017 at 6:14 PM in this thread, the above-mentioned workarounds didn't work for me. In addition, although my motherboard is an ASUS PRIME X370 Pro, there is no OPCache option in its BIOS.

                     

                     

                    Is AMD working on a new AGESA that fixes this problem, rather than only finding workarounds that may work for some people?

                     

                    • 22. Re: gcc segmentation faults on Ryzen / Linux
                      mcl00

                      I am having exactly the same issues as the original poster (and the numerous others who have posted on the Gentoo forum linked in the original message). I do not have an OP Cache setting in my motherboard's BIOS (MSI X370 Gaming Pro Carbon), so I am not able to disable it. Turning off SMT does not fix the problem.

                       

                      I have tried various combinations of the following with little to no effect:

                      Disabling SMT

                      Disabling Cool'n'Quiet

                      Varying clock speeds and timings of my RAM (Corsair 3200MHz) as low as 1866MHz CL 16.

                      Using the "performance" CPU governor.

                      Various LLC settings from auto (off?) through 4.

                      Setting the NB voltage up to 1.15V
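Of the settings listed above, the CPU governor is the one that can be checked from inside Linux rather than in the BIOS. As a minimal sketch, assuming the kernel exposes the usual cpufreq sysfs files under `/sys/devices/system/cpu` (the function name `read_governors` is made up for illustration), something like this reports the active governor per logical CPU:

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU that exposes cpufreq in sysfs.

    Assumption: the standard layout cpuN/cpufreq/scaling_governor is present.
    CPUs without a cpufreq directory are simply skipped.
    """
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(cpu_root).glob(pattern)):
        cpu_name = gov_file.parent.parent.name  # e.g. "cpu0"
        governors[cpu_name] = gov_file.read_text().strip()
    return governors


if __name__ == "__main__":
    for cpu, governor in read_governors().items():
        print(cpu, governor)
```

If every entry reads `performance`, the governor setting is applied on all threads; an empty result would suggest that no cpufreq driver is loaded.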

                       

                      While the segmentation faults are 'random' in that they do not occur at any fixed memory address or in any particular part of a compile, the system will very consistently crash or segfault in any highly multi-threaded process that uses a lot of RAM. To reproduce the issue, I simply loop through compiling mesa 17.0 with -j16 and the build directory mounted on tmpfs (i.e. a ramdisk location for the build files). If I make it past 10 minutes without a segfault it's a lucky run.
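The reproduction loop described above could be automated roughly as follows. This is only a sketch under assumptions: the helper name `run_until_failure` is hypothetical, the build command is whatever you normally run, and the check for the text "Segmentation fault" in the output is a heuristic rather than a reliable classifier of why a build failed.

```python
import signal
import subprocess
import time


def run_until_failure(cmd, max_runs=100, cwd=None):
    """Run cmd repeatedly and stop at the first non-zero exit status.

    Returns (completed_runs, elapsed_seconds, saw_segfault), where
    completed_runs counts the successful runs before the failure.
    """
    start = time.monotonic()
    for completed in range(max_runs):
        proc = subprocess.run(
            cmd,
            cwd=cwd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so one scan covers both
            text=True,
            errors="replace",  # build logs are not always valid UTF-8
        )
        if proc.returncode != 0:
            # Either the command itself died from SIGSEGV (negative return
            # code on POSIX), or a child such as the compiler reported one.
            saw_segfault = (
                proc.returncode == -signal.SIGSEGV
                or "segmentation fault" in proc.stdout.lower()
            )
            return completed, time.monotonic() - start, saw_segfault
    return max_runs, time.monotonic() - start, False
```

A call might look like `run_until_failure(["sh", "-c", "make clean && make -j16"], cwd="/path/to/build")`, where both the command and the directory are placeholders for your own setup. Recording the number of clean runs and the elapsed time makes it easier to compare BIOS settings than relying on a "lucky run" impression.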

                       

                      I can't monitor CPU temperatures within Linux yet, but this does not appear to be heat related - cool ambient temperatures with the case open and a room fan blowing directly into the case did not increase the stability to any noticeable degree (and the CPU temperatures in Windows running prime95 with 16 workers stay reasonable).

                       

                      Note that this problem is not limited to compilation tasks in Linux - prime95 will throw errors as well. It's just much less frequent (e.g. where compiling mesa in a ramdisk will segfault in minutes, prime95 can go for a few hours before complaining.)

                       

                      I would really appreciate a response as "This question is Assumed Answered." is not true. The problem exists, and even if disabling SMT "fixed" it (which I'll repeat - it doesn't) that isn't an answer.

                       

                      EDIT: I forgot to mention I also tried each of my memory sticks (2x8GB) independently without any improvement. If one stick was bad, you would expect to see segfaults with that stick but not the other. Both sticks together and each independently all display the same behaviour. (Note that I haven't tried every combination of settings with every permutation of memory installed, just the default settings with the single DIMMs.)

                      Additionally, memtest86 will run through at least two cycles without error even with the RAM set at 3200MHz.

                      • 23. Re: gcc segmentation faults on Ryzen / Linux
                        bridgman

                        How many of you seeing problems are running Gentoo ? So far my impression is "most" at least...

                        • 24. Re: gcc segmentation faults on Ryzen / Linux
                          mcl00

                          Since Gentoo is a source-based distribution, a significant amount of time setting up and/or updating the system involves compiling software packages, which increases the impact of this bug significantly for that community and is likely why you see the most comments from Gentoo users. I actually use Ubuntu as my primary OS, but I am able to reproduce the problem most consistently under Gentoo so that is what I have used to try out various BIOS tweaks. That said, to rule out OS-specific issues I did a test compile of gcc under Ubuntu 16.04 and was able to reproduce the problem (again, using make -j16). Under Ubuntu I didn't set the build directory up in a tmpfs mounted file system, but even running from my SSD and not from RAM I still can't consistently get through the full compile. I did once get it to compile twice in a row without a segfault, but that's the exception (and still not acceptable...)

                           

                          To rule out a "Linux-specific" issue, I ran prime95 with 16 threads under Windows 10 to see if that was stable. As noted earlier it was not (although earlier I failed to mention I was running prime95 in Windows). Prime95 does run successfully for significantly longer than a multi-threaded compile, however, so I have not been using that to test BIOS settings.

                           

                          To be fair to amdmatt's suggestions, disabling SMT is the one thing that makes the biggest difference for my system stability while compiling. My test compile of mesa-17.0 was able to successfully complete nearly 14 times in a row before crashing with SMT disabled and make reduced to -j8. That said, I still don't feel that this is an acceptable solution - I didn't buy a 16-thread processor to run it with half the threads disabled (and even then not be 100% sure that it's not going to crash, or corrupt my data).

                          • 25. Re: gcc segmentation faults on Ryzen / Linux
                            sat

                            > How many of you seeing problems are running Gentoo ? So far my impression is "most" at least...

                            As far as I know, most of them are indeed Gentoo users. I guess it's because a heavy compilation workload, which causes this problem, is the daily work of Gentoo. I reproduced this problem on Ubuntu, and maxrussell (2017/06/01 17:33) is a CentOS user.

                             

                            This problem is not distro-specific.

                            • 26. Re: gcc segmentation faults on Ryzen / Linux
                              bridgman

                              sat wrote:

                              I guess it's because a heavy compilation workload, which causes this problem, is the daily work of Gentoo.

                              Hmm, good point. I had been thinking about Gentoo from the point of view of the compiler binaries having possibly been compiled with problematic compiler options or the kernel picking up some specific combination of patches but missed the "you do a lot of compiling" aspect.

                              • 27. Re: gcc segmentation faults on Ryzen / Linux
                                meihong

                                bridgman wrote:

                                I had been thinking about Gentoo from the point of view of the compiler binaries having possibly been compiled with problematic compiler options or the kernel picking up some specific combination of patches but missed the "you do a lot of compiling" aspect.

                                I don't think this is an issue specific to compiler options.

                                I have compiled everything on kernel 4.11.3 with gcc 6.3, using no `-march` option and `-mtune=generic`, i.e. targeting generic amd64 CPUs rather than Ryzen specifically, but I still have this issue.

                                 

                                I also think that a heavy workload, especially a compilation workload, is NOT the cause of the problem; it only makes it show up much faster.

                                • 28. Re: gcc segmentation faults on Ryzen / Linux
                                  atomsymbol

                                  bridgman wrote:

                                   

                                  How many of you seeing problems are running Gentoo ? So far my impression is "most" at least...

                                  My Ryzen 1600 seems to be fine while having -j12 in MAKEOPTS, but it is just a few days old so it may be too early to tell. The CPU isn't overclocked.

                                   

                                  I was experiencing Linux boot issues after I installed Ryzen, but those seem to have been resolved. Windows 10 is booting and working fine.

                                  • 29. Re: gcc segmentation faults on Ryzen / Linux
                                    foppe

                                    Also running into this intermittently on fc25 while compiling kernels, even though the system is rock stable otherwise (per prime95 torture test and memtest).

                                    Using an ab350pro4 with the most recent BIOS (AGESA 1004 based), so no access to LLC or OPCache settings. Ryzen 1600 at stock voltage and speeds initially, currently at a tested-stable, modest OC of 3.7&3.25v. CMK16GX4M2B3000C15 @ 2933 MHz XMP profile.
