

ptheb
Adept I

Xubuntu & Ryzen 7 2700X: Weird application crash/io-errors after resume: Anyone else experience this too?

I'm a long-time GNU/Linux user and recently upgraded to an AMD Ryzen 2700X setup (new mainboard, RAM, new everything). The system is Xubuntu 18.04.3 (64bit) and runs fine, but after resuming from suspend I noticed odd instability across different applications and drives.

Such as:

  * thunderbird / firefox crash right after start, every time, until I reboot the machine.

  * Virtualbox VMs with saved state fail to start with "VERR_SSM_INTEGRITY_DECOMPRESSION" error

I don't find anything suspicious in the logs, but I've never experienced thunderbird/firefox crashes of that kind in over 15 years... - and these instabilities only happen after I've resumed from suspend. Never in the first session after a clean reboot. I've run memtest86 several times: it says the memory is fine.

The data for the affected applications is stored on different physical drives - so I don't think it's a storage media I/O error. But it feels like some data is "dirty" after resume...

Does anyone else experience such a behavior?

I'm grateful for any input or suggestions!

1 Solution
ptheb
Adept I

SOLVED!

One of the 2 RAM modules was faulty. I had it replaced now and things seem to be running fine.


6 Replies

I run Ubuntu and other distributions in virtual machines. Not many at all use them for desktop use.

Go get Windows 10 and use that, which is known to work fine on Ryzen. From what I have seen, motherboard makers are not supporting Linux on them.

ptheb
Adept I

Haven't updated the BIOS yet (still need to wait for project to finish), but I've found out something more:

It seems that freed diskspace is not correctly reported as "free" after suspend/resume.

I've noticed that "~/.xsession-errors" was completely filling up the partition where /home is located - when I deleted it, "df -h" still showed that the disk claimed to be full - but it wasn't. Until I rebooted: Then the disk usage was displayed correctly again.

This might explain the inconsistent "odd" behavior, and the common denominator may be that this SSD (driver?) doesn't resume well. If that's the case, then this ain't the right forum, but I thought I'd post the update here in case someone has a similar issue - and it's not resolved yet, so who knows.

ptheb
Adept I

No good news

I've updated the BIOS to 7B89V27 (2020-04-27*): Identical behavior.

I've replaced the SSD (Kingston V300S37A with PNY CS1311): Identical behavior.

Found something reproducible:

  • After fresh boot (no resume), everything runs fine.
  • run "debsums -s": No errors
  • suspend & resume
  • run "debsums -s": Random package files are reported as corrupted. Then run md5sum on these files: The hashes stay identical (=corrupt), no matter how often they're recalculated.
  • shutdown & reboot
  • run "debsums -s": Everything back to normal.
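
For files that aren't covered by package checksums, the same before/after comparison can be sketched in Python. This is only a rough illustration of the procedure above, not a replacement for debsums: the function names (`sha256_of`, `snapshot`, `changed_files`) are made up for this post, and using SHA-256 instead of MD5 is my own choice.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map every regular file under root (as a relative path) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before, after):
    """Return sorted paths whose digest differs or that exist in only one snapshot."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

The idea would be to take one snapshot of a directory that nothing writes to right after a fresh boot, store it (e.g. as JSON), suspend & resume, take a second snapshot and pass both to `changed_files`. Any path it reports changed without anyone touching it.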

xsession-errors filling up is just a symptom:

  • ~/.xsession-errors was growing so fast due to random application errors after resume (matches debsums fault-report)
  • The disk space not being freed after deleting xsession-errors was correct behavior: the file was still open.
  • So it's not Xorg/xsession-errors.
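
The "deleted but still open" effect is easy to reproduce on its own. Below is a minimal sketch, assuming a Linux/POSIX system where an open file can be unlinked; the helper name `unlink_while_open` is just made up for the demonstration.

```python
import os
import tempfile


def unlink_while_open(size=1 << 20):
    """Delete a file's name while it is still open and report what remains.

    Returns (path_exists, link_count, bytes_still_held).
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(b"\0" * size)
        handle.flush()
        os.unlink(path)  # the directory entry is gone now...
        info = os.fstat(handle.fileno())  # ...but the open handle still sees the data
        return os.path.exists(path), info.st_nlink, info.st_size
    # Leaving the "with" block closes the handle; only then can the
    # filesystem actually reclaim the blocks.
```

As long as some process keeps the file open, the path no longer exists but the data still occupies space, which is why "df -h" kept reporting a full disk until the session ended.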

Here's why I still can't rule out hardware (CPU, mainboard, RAM) issues:

I ran memtest86 again (the proprietary PassMark version 8.3):

[Screenshot of the memtest86 result: pastedImage_2.jpg]

Great! I thought now I've finally found it: It's the RAM.

But then I re-ran the same tests again (4 passes): Flawless.

Then I re-ran the same tests again, and got 1 completely different error:

[Photo of the memtest86 result: 20200621_195537.jpg]

It seems like "something" is not running smoothly on this hardware, but since I don't have identical spare parts lying around to swap and test, I'm running out of ideas how to narrow it down.

I'll try removing 1 RAM module at a time and see if that changes anything. But if that doesn't solve it, the only 2 options left are CPU or mainboard.

Any ideas?

(*) There is a more recent BIOS, but it's just a few days old (2020-06-12) and marked as BETA (7B89V284) and its changelog doesn't mention any fix.


BIOS updates are bad at disclosing details.


I've now tried using only 1 RAM module at a time: Great improvements!

I'm able to suspend/resume and work on this PC now for 2 days without data corruption issues on RAM module #1.

BUT: I then swapped the modules to use only RAM #2, expecting the errors to reappear immediately. Instead, suspend/resume worked several times without issues - and failed once, after resuming from an overnight suspend.

If the errors don't appear with module #1 for a few days, I'll swap again and re-run memtest on the suspected-faulty module #2. After hunting down the source of this erratic behavior for a month, I'd be very happy if it's simply the RAM module.

Keeping fingers crossed!
