Linux Reboots During Idle Time and MCE Errors (Ubuntu 16.04.3)

Is Linux stable on Ryzen when the machine is idle?  By idle, I mean not actively running any applications.  As tests go, this one is easy to run; all it takes is time.  To conduct it, I started a VNC session on the machine so that I could connect from time to time without having to be physically at the computer.  The test ran for three days, and the third day ended in a strange, almost-hung state.  The system would respond to pings and would let me ssh in and type a username and password, but it never returned a command prompt.  I could not interact with the system through the attached keyboard or mouse either.  After rebooting, I looked in /var/log/syslog and found the following:

Oct 26 07:35:03 ryzen7 anacron[7524]: Job `cron.daily' terminated
Oct 26 07:35:03 ryzen7 anacron[7524]: Normal exit (1 job run)
Oct 26 07:44:06 ryzen7 kernel: [221161.641751] nouveau 0000:28:00.0: DRM: DDC responded, but no EDID for DVI-I-1
Oct 26 07:44:06 ryzen7 kernel: [221161.762758] nouveau 0000:28:00.0: DRM: DDC responded, but no EDID for DVI-I-1
Oct 26 08:06:42 ryzen7 kernel: [222517.707128] nouveau 0000:28:00.0: DRM: DDC responded, but no EDID for DVI-I-1
Oct 26 08:06:42 ryzen7 kernel: [222517.827954] nouveau 0000:28:00.0: DRM: DDC responded, but no EDID for DVI-I-1
Oct 26 08:14:08 ryzen7 kernel: [222963.730150] nouveau 0000:28:00.0: DRM: DDC responded, but no EDID for DVI-I-1
Oct 26 08:14:08 ryzen7 kernel: [222963.845966] nouveau 0000:28:00.0: DRM: DDC responded, but no EDID for DVI-I-1
Oct 26 08:17:01 ryzen7 CRON[16270]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Oct 26 08:23:04 ryzen7 kernel: [223499.756719] nouveau 0000:28:00.0: DRM: DDC responded, but no EDID for DVI-I-1
Oct 26 08:23:04 ryzen7 kernel: [223499.877604] nouveau 0000:28:00.0: DRM: DDC responded, but no EDID for DVI-I-1
Oct 26 08:28:07 ryzen7 kernel: [223802.776797] nouveau 0000:28:00.0: DRM: DDC responded, but no EDID for DVI-I-1
Oct 26 08:31:13 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:31:13 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:31:13 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:31:23 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:31:23 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:31:23 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:31:33 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:31:33 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:31:33 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:31:43 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:31:43 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:31:43 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:31:53 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:31:53 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:31:53 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:32:03 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:32:03 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:32:03 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:32:06 ryzen7 kernel: [224041.662913] INFO: rcu_sched detected stalls on CPUs/tasks:
Oct 26 08:32:06 ryzen7 kernel: [224041.662924]     0-...: (0 ticks this GP) idle=b9c/0/0 softirq=3521708/3521708 fqs=0 
Oct 26 08:32:06 ryzen7 kernel: [224041.662928]     1-...: (31 GPs behind) idle=3e8/0/0 softirq=2327060/2327060 fqs=0 
Oct 26 08:32:06 ryzen7 kernel: [224041.662932]     6-...: (0 ticks this GP) idle=0c4/0/0 softirq=3437103/3437103 fqs=0 
Oct 26 08:32:06 ryzen7 kernel: [224041.662935]     7-...: (19 GPs behind) idle=498/0/0 softirq=2336767/2336767 fqs=0 
Oct 26 08:32:06 ryzen7 kernel: [224041.662938]     12-...: (3 GPs behind) idle=ca4/0/0 softirq=2646584/2646584 fqs=0 
Oct 26 08:32:06 ryzen7 kernel: [224041.662941]     13-...: (73 GPs behind) idle=b82/0/0 softirq=2110429/2110430 fqs=0 
Oct 26 08:32:06 ryzen7 kernel: [224041.662943]     15-...: (77 GPs behind) idle=cb2/0/0 softirq=2279584/2279584 fqs=0 
Oct 26 08:32:06 ryzen7 kernel: [224041.662944]     (detected by 2, t=15202 jiffies, g=3043662, c=3043661, q=1125)
Oct 26 08:32:06 ryzen7 kernel: [224041.662947] Task dump for CPU 0:
Oct 26 08:32:06 ryzen7 kernel: [224041.662948] swapper/0       R  running task        0     0      0 0x00000008
Oct 26 08:32:06 ryzen7 kernel: [224041.662950] Call Trace:
Oct 26 08:32:06 ryzen7 kernel: [224041.662957]  ? cpuidle_enter_state+0xfa/0x2d0
Oct 26 08:32:06 ryzen7 kernel: [224041.662959]  ? cpuidle_enter+0x17/0x20
Oct 26 08:32:06 ryzen7 kernel: [224041.662962]  ? call_cpuidle+0x23/0x40
Oct 26 08:32:06 ryzen7 kernel: [224041.662964]  ? do_idle+0x17f/0x1f0
Oct 26 08:32:06 ryzen7 kernel: [224041.662965]  ? cpu_startup_entry+0x71/0x80
Oct 26 08:32:06 ryzen7 kernel: [224041.662968]  ? rest_init+0x77/0x80
Oct 26 08:32:06 ryzen7 kernel: [224041.662971]  ? start_kernel+0x482/0x4a3
Oct 26 08:32:06 ryzen7 kernel: [224041.662973]  ? early_idt_handler_array+0x120/0x120
Oct 26 08:32:06 ryzen7 kernel: [224041.662975]  ? x86_64_start_reservations+0x24/0x26
Oct 26 08:32:06 ryzen7 kernel: [224041.662977]  ? x86_64_start_kernel+0x143/0x166
Oct 26 08:32:06 ryzen7 kernel: [224041.662979]  ? start_cpu+0x14/0x14
Oct 26 08:32:06 ryzen7 kernel: [224041.662980] Task dump for CPU 1:
Oct 26 08:32:06 ryzen7 kernel: [224041.662981] swapper/1       R  running task        0     0      1 0x00000008
Oct 26 08:32:06 ryzen7 kernel: [224041.662983] Call Trace:
Oct 26 08:32:06 ryzen7 kernel: [224041.662985]  ? cpuidle_enter_state+0xfa/0x2d0
Oct 26 08:32:06 ryzen7 kernel: [224041.662986]  ? cpuidle_enter+0x17/0x20
Oct 26 08:32:06 ryzen7 kernel: [224041.662988]  ? call_cpuidle+0x23/0x40
Oct 26 08:32:06 ryzen7 kernel: [224041.662989]  ? do_idle+0x17f/0x1f0
Oct 26 08:32:06 ryzen7 kernel: [224041.662991]  ? cpu_startup_entry+0x71/0x80
Oct 26 08:32:06 ryzen7 kernel: [224041.662994]  ? start_secondary+0x154/0x190
Oct 26 08:32:06 ryzen7 kernel: [224041.662995]  ? start_cpu+0x14/0x14
Oct 26 08:32:06 ryzen7 kernel: [224041.662996] Task dump for CPU 6:
Oct 26 08:32:06 ryzen7 kernel: [224041.662997] swapper/6       R  running task        0     0      1 0x00000008
Oct 26 08:32:06 ryzen7 kernel: [224041.662998] Call Trace:
Oct 26 08:32:06 ryzen7 kernel: [224041.663000]  ? cpuidle_enter_state+0xfa/0x2d0
Oct 26 08:32:06 ryzen7 kernel: [224041.663002]  ? cpuidle_enter+0x17/0x20
Oct 26 08:32:06 ryzen7 kernel: [224041.663003]  ? call_cpuidle+0x23/0x40
Oct 26 08:32:06 ryzen7 kernel: [224041.663005]  ? do_idle+0x17f/0x1f0
Oct 26 08:32:06 ryzen7 kernel: [224041.663006]  ? cpu_startup_entry+0x71/0x80
Oct 26 08:32:06 ryzen7 kernel: [224041.663008]  ? start_secondary+0x154/0x190
Oct 26 08:32:06 ryzen7 kernel: [224041.663009]  ? start_cpu+0x14/0x14
Oct 26 08:32:06 ryzen7 kernel: [224041.663010] Task dump for CPU 7:
Oct 26 08:32:06 ryzen7 kernel: [224041.663011] swapper/7       R  running task        0     0      1 0x00000000
Oct 26 08:32:06 ryzen7 kernel: [224041.663012] Call Trace:
Oct 26 08:32:06 ryzen7 kernel: [224041.663014]  ? cpuidle_enter_state+0xfa/0x2d0
Oct 26 08:32:06 ryzen7 kernel: [224041.663016]  ? cpuidle_enter+0x17/0x20
Oct 26 08:32:06 ryzen7 kernel: [224041.663017]  ? call_cpuidle+0x23/0x40
Oct 26 08:32:06 ryzen7 kernel: [224041.663019]  ? do_idle+0x17f/0x1f0
Oct 26 08:32:06 ryzen7 kernel: [224041.663020]  ? cpu_startup_entry+0x71/0x80
Oct 26 08:32:06 ryzen7 kernel: [224041.663022]  ? start_secondary+0x154/0x190
Oct 26 08:32:06 ryzen7 kernel: [224041.663023]  ? start_cpu+0x14/0x14
Oct 26 08:32:06 ryzen7 kernel: [224041.663024] Task dump for CPU 12:
Oct 26 08:32:06 ryzen7 kernel: [224041.663024] swapper/12      R  running task        0     0      1 0x00000008
Oct 26 08:32:06 ryzen7 kernel: [224041.663026] Call Trace:
Oct 26 08:32:06 ryzen7 kernel: [224041.663028]  ? cpuidle_enter_state+0xfa/0x2d0
Oct 26 08:32:06 ryzen7 kernel: [224041.663029]  ? cpuidle_enter+0x17/0x20
Oct 26 08:32:06 ryzen7 kernel: [224041.663031]  ? call_cpuidle+0x23/0x40
Oct 26 08:32:06 ryzen7 kernel: [224041.663032]  ? do_idle+0x17f/0x1f0
Oct 26 08:32:06 ryzen7 kernel: [224041.663034]  ? cpu_startup_entry+0x71/0x80
Oct 26 08:32:06 ryzen7 kernel: [224041.663036]  ? start_secondary+0x154/0x190
Oct 26 08:32:06 ryzen7 kernel: [224041.663037]  ? start_cpu+0x14/0x14
Oct 26 08:32:06 ryzen7 kernel: [224041.663038] Task dump for CPU 13:
Oct 26 08:32:06 ryzen7 kernel: [224041.663038] swapper/13      R  running task        0     0      1 0x00000000
Oct 26 08:32:06 ryzen7 kernel: [224041.663039] Call Trace:
Oct 26 08:32:06 ryzen7 kernel: [224041.663041]  ? cpuidle_enter_state+0xfa/0x2d0
Oct 26 08:32:06 ryzen7 kernel: [224041.663043]  ? cpuidle_enter+0x17/0x20
Oct 26 08:32:06 ryzen7 kernel: [224041.663044]  ? call_cpuidle+0x23/0x40
Oct 26 08:32:06 ryzen7 kernel: [224041.663046]  ? do_idle+0x17f/0x1f0
Oct 26 08:32:06 ryzen7 kernel: [224041.663047]  ? cpu_startup_entry+0x71/0x80
Oct 26 08:32:06 ryzen7 kernel: [224041.663049]  ? start_secondary+0x154/0x190
Oct 26 08:32:06 ryzen7 kernel: [224041.663050]  ? start_cpu+0x14/0x14
Oct 26 08:32:06 ryzen7 kernel: [224041.663051] Task dump for CPU 15:
Oct 26 08:32:06 ryzen7 kernel: [224041.663052] swapper/15      R  running task        0     0      1 0x00000000
Oct 26 08:32:06 ryzen7 kernel: [224041.663053] Call Trace:
Oct 26 08:32:06 ryzen7 kernel: [224041.663055]  ? cpuidle_enter_state+0xfa/0x2d0
Oct 26 08:32:06 ryzen7 kernel: [224041.663056]  ? cpuidle_enter+0x17/0x20
Oct 26 08:32:06 ryzen7 kernel: [224041.663058]  ? call_cpuidle+0x23/0x40
Oct 26 08:32:06 ryzen7 kernel: [224041.663059]  ? do_idle+0x17f/0x1f0
Oct 26 08:32:06 ryzen7 kernel: [224041.663061]  ? cpu_startup_entry+0x71/0x80
Oct 26 08:32:06 ryzen7 kernel: [224041.663062]  ? start_secondary+0x154/0x190
Oct 26 08:32:06 ryzen7 kernel: [224041.663063]  ? start_cpu+0x14/0x14
Oct 26 08:32:06 ryzen7 kernel: [224041.663066] rcu_sched kthread starved for 15202 jiffies! g3043662 c3043661 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
Oct 26 08:32:06 ryzen7 kernel: [224041.663067] rcu_sched       S    0     7      2 0x00000000
Oct 26 08:32:06 ryzen7 kernel: [224041.663068] Call Trace:
Oct 26 08:32:06 ryzen7 kernel: [224041.663070]  __schedule+0x232/0x700
Oct 26 08:32:06 ryzen7 kernel: [224041.663073]  ? dequeue_task_fair+0x4ee/0xb20
Oct 26 08:32:06 ryzen7 kernel: [224041.663074]  schedule+0x36/0x80
Oct 26 08:32:06 ryzen7 kernel: [224041.663076]  schedule_timeout+0x1ea/0x3f0
Oct 26 08:32:06 ryzen7 kernel: [224041.663078]  ? del_timer_sync+0x50/0x50
Oct 26 08:32:06 ryzen7 kernel: [224041.663080]  rcu_gp_kthread+0x551/0x910
Oct 26 08:32:06 ryzen7 kernel: [224041.663083]  kthread+0x109/0x140
Oct 26 08:32:06 ryzen7 kernel: [224041.663084]  ? rcu_note_context_switch+0x100/0x100
Oct 26 08:32:06 ryzen7 kernel: [224041.663086]  ? kthread_create_on_node+0x60/0x60
Oct 26 08:32:06 ryzen7 kernel: [224041.663088]  ret_from_fork+0x2c/0x40
Oct 26 08:32:13 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:32:13 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:32:13 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:32:23 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:32:23 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:32:23 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:32:33 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:32:33 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:32:33 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:32:43 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:32:43 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:32:43 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:32:53 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:32:53 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:32:53 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:33:03 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:33:03 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:33:03 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:33:13 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:33:13 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:33:13 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:33:23 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:33:23 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:33:23 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:33:33 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:33:33 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:33:33 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:33:43 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:33:43 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:33:43 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:33:53 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:33:53 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:33:53 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:34:03 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:34:03 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:34:03 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:34:13 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:34:13 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:34:13 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:34:23 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:34:23 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:34:23 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:34:33 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:34:33 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:34:33 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:34:43 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:34:43 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:34:43 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:34:53 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:34:53 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 08:34:53 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 08:35:03 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 08:35:03 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 09:32:43 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.

...

Oct 26 09:32:43 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 09:32:43 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 09:32:53 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 09:32:53 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 09:32:53 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 09:33:03 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 09:33:03 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 09:33:03 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.
Oct 26 09:33:13 ryzen7 rtkit-daemon[3684]: The canary thread is apparently starving. Taking action.
Oct 26 09:33:13 ryzen7 rtkit-daemon[3684]: Demoting known real-time threads.
Oct 26 09:33:13 ryzen7 rtkit-daemon[3684]: Demoted 0 threads.

<then garbage>
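A log like the one above can be summarized mechanically instead of read by eye. The Python sketch below is one possible way to do that: it counts the rtkit-daemon canary warnings, collects the CPUs named in the rcu_sched stall report, and converts the reported jiffies into seconds. The function name `summarize`, the regular expressions, and the `hz=250` tick rate are assumptions made for illustration; the tick rate in particular is not stated in the log and should be checked against the running kernel's configuration.

```python
import re

# Per-CPU lines of an rcu_sched stall report, e.g.
# "kernel: [224041.662928]     1-...: (31 GPs behind) idle=..."
STALLED_CPU = re.compile(r"kernel: \[\s*\d+\.\d+\]\s+(\d+)-\.\.\.: \((.*?)\)")
# Summary line of the report: "(detected by 2, t=15202 jiffies, ...)"
DETECTED = re.compile(r"\(detected by (\d+), t=(\d+) jiffies")
# Text logged by rtkit-daemon each time its canary thread is starved
CANARY = "The canary thread is apparently starving"


def summarize(lines, hz=250):
    """Summarize stall-related syslog lines.

    hz is the kernel tick rate (jiffies per second). The default of 250
    is an assumption; it is not taken from the log itself.
    """
    stalled = {}          # CPU number -> status text from the stall report
    canary = 0            # number of canary starvation warnings seen
    stall_seconds = None  # duration from the "detected by" line, if any
    for line in lines:
        if CANARY in line:
            canary += 1
            continue
        m = STALLED_CPU.search(line)
        if m:
            stalled[int(m.group(1))] = m.group(2)
            continue
        m = DETECTED.search(line)
        if m:
            stall_seconds = int(m.group(2)) / hz
    return {
        "stalled_cpus": stalled,
        "canary_warnings": canary,
        "stall_seconds": stall_seconds,
    }


if __name__ == "__main__":
    # errors="replace" keeps the scan going over the garbage bytes that a
    # hard hang can leave at the end of the file.
    with open("/var/log/syslog", errors="replace") as log:
        print(summarize(log))
```

Under the assumed tick rate of 250, the t=15202 jiffies in the report works out to roughly 61 seconds during which the listed CPUs made no progress.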


This issue appears to be known and is discussed at the following sites:

Ubuntu-16-04-compile-custom-kernel-for-ryzen
Ubuntu Launchpad bug 1690085

During the three-day test, I did not observe any MCE errors or reboots.  The issue I described, though, is in the same category and is severe enough to discourage many Linux users.
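Checking a multi-day log for reboots and MCE reports by hand is tedious, so here is a minimal Python sketch of how that check might be automated. It relies on the bracketed kernel timestamp, which counts seconds since boot: if it drops between two consecutive kernel lines, the machine restarted in between. The function name `scan` and the MCE search pattern ("machine check" or the word "mce", case-insensitive) are assumptions for illustration; the exact wording of MCE messages depends on the kernel version, so the pattern may need adjusting.

```python
import re

# Seconds-since-boot timestamp that prefixes every kernel log line
KERNEL_TS = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")
# Assumed hint for machine-check reports; adjust to match your kernel
MCE_HINT = re.compile(r"machine check|\bmce\b", re.IGNORECASE)


def scan(lines):
    """Return (reboot_count, mce_lines) for an iterable of syslog lines."""
    reboots = 0
    mce_lines = []
    last_ts = None
    for line in lines:
        m = KERNEL_TS.search(line)
        if not m:
            continue  # not a kernel line, nothing to check
        ts = float(m.group(1))
        # A lower timestamp than the previous kernel line means the
        # uptime counter restarted, so count it as a reboot.
        if last_ts is not None and ts < last_ts:
            reboots += 1
        last_ts = ts
        if MCE_HINT.search(line):
            mce_lines.append(line.rstrip("\n"))
    return reboots, mce_lines


if __name__ == "__main__":
    with open("/var/log/syslog", errors="replace") as log:
        reboots, mce_lines = scan(log)
    print(f"reboots seen: {reboots}, MCE-like lines: {len(mce_lines)}")
```

The reboot count only covers restarts that fall inside the file being scanned, so rotated logs would have to be fed in as well to cover a longer test.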


Next up, The RMA.