1,895 Replies Latest reply on Jun 14, 2018 1:36 PM by constantinx
      • 60. Re: gcc segmentation faults on Ryzen / Linux
        sat

        > Can you post raw output (not using the script) on Git to see POF please?

         

        Let me clarify what you want. My understanding is that you want the log.txt from a failing build, produced by the following commands under WSL. Is that correct?

         

        $ cd src/linux

        $ make defconfig

        $ make -j16 &>log.txt

         

        If so, how many of these logs do you want? Is just one OK?
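A minimal sketch of how one log per build could be kept, so that the failing run is easy to pick out. `build_until_failure` is a hypothetical helper name, and the log directory and run count are assumptions; only the `make` commands come from the post above.

```shell
# build_until_failure MAX_RUNS LOG_DIR COMMAND [ARGS...]
# Hypothetical helper: repeats COMMAND up to MAX_RUNS times, writes the
# output of each run to LOG_DIR/log-<run>.txt, and stops at the first failure.
build_until_failure() {
    max_runs=$1
    log_dir=$2
    shift 2
    mkdir -p "$log_dir"
    run=1
    while [ "$run" -le "$max_runs" ]; do
        log="$log_dir/log-$run.txt"
        # ">file 2>&1" captures stdout and stderr together, like "&>file" in bash.
        if ! "$@" >"$log" 2>&1; then
            echo "run $run failed, see $log"
            return 1
        fi
        run=$((run + 1))
    done
    echo "all $max_runs runs passed"
}
```

In the kernel tree this might be called as `build_until_failure 20 "$HOME/build-logs" sh -c 'make clean && make -j16'` after `make defconfig`; the directory name and the count of 20 are placeholders.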

        • 61. Re: gcc segmentation faults on Ryzen / Linux
          yiyihu

          Hi, my configuration is

          ASRock B350 Pro4 with the latest BIOS (AGESA 1.0.0.6)

          Ryzen 7 1700X

          G.Skill Ripjaws V 32 GB (2 x 16 GB) running at 2133

           

          I'm trying to build Gentoo and hit this bug too. With OpCache enabled, sporadic segfaults always show up sooner or later.

          After I disabled OpCache, I compiled the whole of Gentoo 3 times with no segfaults at all.

          So I believe this is caused by a bug in OpCache.

          I'm just curious whether this hardware bug can be fixed via a BIOS update. I mean a real fix, not a workaround.

          And if OpCache is disabled entirely, how much of a performance impact will there be?

           

          Thanks!

          • 62. Re: gcc segmentation faults on Ryzen / Linux
            whiskey-foxtrot

            Correct - I'm trying to isolate this and run a comparison as I can't replicate it - it's driving me nuts.

            • 63. Re: gcc segmentation faults on Ryzen / Linux
              whiskey-foxtrot

              and to clarify - I'm using gcc 7 - haven't tried anything older which most distros still use.

              • 64. Re: gcc segmentation faults on Ryzen / Linux
                whiskey-foxtrot

                which GCC are you building it against?

                 

                I'm not saying there aren't any CPU issues, as every release (both Intel and AMD) has them, every time. It generally takes some time for compilers to build work-arounds as they catch up. Even the most subtle bugs in compilers can trigger errors not experienced with previous CPUs.

                 

                I'm still trying to replicate this and so far building kernel 4.11.4 on a loop overnight hasn't generated anything sadly. Next step is to downgrade my build system to use anything older than gcc-7; I might as well since I'm not keeping this installation.

                • 65. Re: gcc segmentation faults on Ryzen / Linux
                  alfonsor

                  Here I have never had problems with the kernel; I can let the kernel compile in a loop all day long. An easy way to trigger the problem is to run a mesa compilation with -j16 in a loop alongside a parallel gcc compilation (whatever version), also with -j16. Sooner or later mesa fails with a segfault in bash (sometimes gcc itself segfaults).

                   

                  This happens with the whole system compiled with gcc 5, 6 or 7, with or without CFLAGS optimizations.
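The parallel mesa/gcc loop described above could be sketched roughly as follows. `run_parallel_until_failure` is a hypothetical helper, and the two command strings passed to it are placeholders for whatever build commands are in use; nothing here is taken from the poster's actual setup.

```shell
# run_parallel_until_failure CMD_A CMD_B
# Hypothetical helper: starts both commands (each a single shell string)
# at the same time, waits for both, and repeats until either one fails.
run_parallel_until_failure() {
    cmd_a=$1
    cmd_b=$2
    round=0
    while :; do
        round=$((round + 1))
        sh -c "$cmd_a" &
        pid_a=$!
        sh -c "$cmd_b" &
        pid_b=$!
        status_a=0
        status_b=0
        # "wait PID" returns the exit status of that background job.
        wait "$pid_a" || status_a=$?
        wait "$pid_b" || status_b=$?
        if [ "$status_a" -ne 0 ] || [ "$status_b" -ne 0 ]; then
            echo "round $round: exit codes $status_a and $status_b"
            return 1
        fi
    done
}
```

A call might look like `run_parallel_until_failure 'make -C mesa-src clean all -j16' 'make -C gcc-build -j16'`, where both directory names are made up for illustration. A shell reports a command killed by a signal as 128 plus the signal number, so an exit code of 139 would point at SIGSEGV in the build command itself; a segfault in some child process (such as bash or cc1) usually surfaces as an ordinary make error instead.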

                  • 66. Re: gcc segmentation faults on Ryzen / Linux
                    yiyihu

                    The kernel version I use is

                    Linux localhost 4.11.4-gentoo #2 SMP Thu Jun 8 20:59:54 -00 2017 x86_64 AMD Ryzen 7 1700X Eight-Core Processor AuthenticAMD GNU/Linux

                     

                    The gcc versions I tried are gcc 5.4.0 and gcc 6.3.0. I'm not building the kernel; I just run something like

                    while :; do if ! emerge media-libs/mesa;  then break; fi; done

                    After that, I leave the test PC running overnight; with OpCache enabled, the error shows up after a while.

                    This is how I test whether the system is OK.
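After an overnight run like this, one way to see which programs crashed is to scan the kernel log. The sketch below assumes the kernel prints its usual `name[pid]: segfault at ...` lines (the default on x86 Linux as far as I know); `summarize_segfaults` is a hypothetical helper, and program names that contain spaces would be cut down to their last word.

```shell
# summarize_segfaults: reads kernel log lines on stdin and prints
# "<count> <program>" for every program that logged a segfault,
# most frequent first.
summarize_segfaults() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i == "segfault" && $(i + 1) == "at") {
                name = $(i - 1)
                # Strip the trailing "[pid]:" from e.g. "bash[4321]:".
                sub(/\[[0-9]+\]:$/, "", name)
                count[name]++
                break
            }
        }
    }
    END { for (name in count) print count[name], name }' | sort -rn
}
```

Typical use would be `dmesg | summarize_segfaults` once the loop has stopped.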

                    Once the mesa test loop had run long enough with OpCache disabled, I felt it might be stable,

                    so I tried 'emerge -e system && emerge -e world'. The command has finished successfully both times I tried it.

                     

                    With OpCache enabled, I have never seen it pass both 'emerge -e system' and 'emerge -e world', though sometimes 'emerge -e system' finishes.

                     

                    And I have MAKEOPTS='-j 16' in my /etc/portage/make.conf

                    If you need the make.conf, I'll paste it somewhere.

                     

                    Thanks!

                    • 67. Re: gcc segmentation faults on Ryzen / Linux
                      raydude

                      I want to take a different approach.

                      I'm running Gentoo on a Gigabyte mATX mobo, with a Ryzen 5 1600 and Galax DDR4-3600 DRAM. I'm running 1.4 V core, stock cooler with a 750 watt EVGA PSU. I'm running at 3.8 GHz and RAM is running with BIOS default DDR timing at 2933 MHz. I'm running gcc 4.9.4 with no march option.

                       

                      cat /proc/sys/kernel/randomize_va_space
                      2
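For anyone comparing this value across machines, a small sketch that turns it into words. The three meanings follow the kernel's sysctl documentation as I understand it; `describe_aslr` is a hypothetical helper name.

```shell
# describe_aslr VALUE: explain a /proc/sys/kernel/randomize_va_space value.
describe_aslr() {
    case $1 in
        0) echo "ASLR disabled" ;;
        1) echo "ASLR enabled: stack, mmap base and vDSO randomized" ;;
        2) echo "ASLR enabled: stack, mmap base, vDSO and heap randomized" ;;
        *) echo "unexpected value: $1"; return 1 ;;
    esac
}
```

On Linux this could be run as `describe_aslr "$(cat /proc/sys/kernel/randomize_va_space)"`.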

                       

                      I emerged gcc-6.3 this week and world last week without issue at -j12. I haven't had any problems.

                      Can someone (with a Gigabyte mobo, preferably) give me a BIOS / CPU / DRAM / Voltage configuration that is known to have problems, and an emerge that will cause the failure? I want to see if I can reproduce it with my rig.

                       

                      Thanks.

                      • 68. Re: gcc segmentation faults on Ryzen / Linux
                        whiskey-foxtrot

                        I used Mesa v11.2.0 (the default source on Ubuntu Xenial) since 17.1.2 required way too many dependencies I didn't feel like hunting down.

                         

                        I've run 14+ compiles (make clean ; make -j16) using gcc-7 without any issues - except for me getting pissed at sloppy errors shown in Mesa itself. I stopped counting, but somewhere around the 18th time I did end up with a segfault. This was after I had also started running "stress -c 16"! Without running "stress", nothing happens on this system.

                        Screenshot from 2017-06-10 14-19-26.png

                        with strace:

                        Screenshot from 2017-06-10 14-55-50.png

                        I'm not worried about the temps, as the fans barely spun up, which only happens around the 50 °C range. I'll have to find some other way to test this, as compiling Mesa with all its errors isn't quite reliable as a test.
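The "build while stress is running" setup above could be scripted along these lines. `run_under_load` and the `LOAD_CMD` variable are hypothetical names; the default load command is the `stress -c 16` invocation quoted in the post. I am not certain that every load generator stops its worker processes when it receives SIGTERM, so treat the cleanup step as an assumption.

```shell
# run_under_load COMMAND [ARGS...]
# Hypothetical helper: starts a CPU load generator in the background, runs
# COMMAND in the foreground, then stops the load and returns COMMAND's
# exit status. Set LOAD_CMD to use a different load generator.
run_under_load() {
    # Unquoted on purpose so that "stress -c 16" is split into words.
    ${LOAD_CMD:-stress -c 16} >/dev/null 2>&1 &
    load_pid=$!
    build_status=0
    "$@" || build_status=$?
    # Stop the load generator; ignore errors if it has already exited.
    kill "$load_pid" 2>/dev/null || true
    wait "$load_pid" 2>/dev/null || true
    return "$build_status"
}
```

For example: `run_under_load sh -c 'make clean && make -j16'`. Giving the load command its own time limit, such as `LOAD_CMD='stress -c 16 --timeout 3600'`, is a reasonable backstop in case the cleanup misses something.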

                        • 69. Re: gcc segmentation faults on Ryzen / Linux
                          alfonsor

                          So you are probably among the lucky ones without the bug. There are many users with the bug and many users without it. How the two groups compare in size, I don't know.

                           

                          And that is the real problem: not everybody has the bug. How is that possible? Are only some CPUs affected?

                           

                          Arg.

                          • 70. Re: gcc segmentation faults on Ryzen / Linux
                            foppe

                            What would interest me more is if there are people with the motherboards identified thus far who aren't running into issues, because I keep thinking this is more a mobo/bios issue (perhaps related to SoC voltage? Voltage regulation ability of the mosfets? etc.) than a 'CPU' issue as such.

                            • 71. Re: gcc segmentation faults on Ryzen / Linux
                              raydude

                              I have a Gigabyte AB350M-D3H-CF. I'm using BIOS F1 (2/20/2017); I believe it's a 1.0.0.4a BIOS.

                               

                               

                              Does anyone with this motherboard have the issue?

                              • 72. Re: gcc segmentation faults on Ryzen / Linux
                                whiskey-foxtrot

                                I don't know if the issue is just isolated to a few or if it's indeed a general issue with the CPU. Like I said, I got mine to crash, but that's only by running the "stress" program at full blast as well. I would like to know as well, but there's so little centralized information available - and what I would like to see is a reporting form on AMD's site with the variables (cpu, mobo, OS, crashes - broken down per error, etc etc) so we can watch for patterns. Right now we're just trying to piece it all together from spread out sources without a baseline/test or avenue for reporting.

                                 

                                All my other Ryzen systems are pretty much the same, except I also have some 1700X chips floating around; motherboards are all Asus Crosshair VI Hero, all with G.Skill RAM and EVGA PSUs.

                                 

                                To AMD: Please provide a standardized form just for the new CPUs to help narrow these problems down; limit text input and provide as many options as possible that pertain to Ryzen specifically.
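Until something like that exists, reports could at least share a common shape. Below is a minimal sketch that prints a few of the variables mentioned above as key=value lines. The field names and both function names are made up for illustration, and it assumes a Linux system that exposes `/proc/cpuinfo` and the DMI board name under `/sys`.

```shell
# cpu_model_from FILE: print the first "model name" entry from a file in
# /proc/cpuinfo format, or "unknown" if the file cannot be read.
cpu_model_from() {
    if [ -r "$1" ]; then
        sed -n 's/^model name[[:space:]]*: //p' "$1" | head -n 1
    else
        echo unknown
    fi
}

# print_report: hypothetical report format, one key=value pair per line.
print_report() {
    echo "cpu=$(cpu_model_from /proc/cpuinfo)"
    echo "kernel=$(uname -r)"
    if [ -r /sys/class/dmi/id/board_name ]; then
        echo "board=$(cat /sys/class/dmi/id/board_name)"
    else
        echo "board=unknown"
    fi
    if command -v gcc >/dev/null 2>&1; then
        echo "gcc=$(gcc --version | head -n 1)"
    else
        echo "gcc=not installed"
    fi
}
```

Running `print_report` and pasting the output next to a description of the failure would make posts in a thread like this easier to compare.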

                                • 73. Re: gcc segmentation faults on Ryzen / Linux
                                  sat

                                  Unfortunately, and surprisingly, this problem disappeared on WSL when I ran make -j16 &>log.txt (usually I redirect the output to /dev/null).

                                  On the other hand, on Ubuntu, it happens as usual even with make -j16 &>log.txt.

                                   

                                  This problem is really sensitive to any change: motherboard, hardware/software settings, and so on.

                                  • 74. Re: gcc segmentation faults on Ryzen / Linux
                                    alfonsor

                                    I don't think it is sensitive to changes; it is just very random. I mean, I can replicate it easily with the usual mesa/gcc parallel compilation, but sometimes everything works fine for hours and then things suddenly start to go bad. And I can't find any pattern to explain why things worked two seconds earlier and now do not. No reboot, no changes, no ambient temperature increase, nothing.
