rcu_sched detected stalls after local live migration #376

Description

@gjcolombo

Propolis commit: c455784

Host OS:

$ cat /etc/versions/build
heads/master-0-g717646f711

Guest OS: Debian 11 nocloud, Linux debian 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21) x86_64 GNU/Linux

Repro steps:

  • install stress-ng in the guest
  • stress-ng --vm 1 --vm-bytes 2G --verify -v &
  • live migrate the VM a few times, back and forth between the same two Propolis servers on a single host
  • fg
  • kill the stress-ng job
  • stress-ng --timer 32 --timer-freq 1000000 &
  • migrate some more
  • fg, kill the stress-ng job

Expected: guest is generally happy
Observed: guest gets dyspepsia after running the timer stress test:

root@debian:~# [  501.250937] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  501.254890] rcu:     1-...!: (0 ticks this GP) idle=7f0/0/0x0 softirq=2364/2364 fqs=1  (false positive?)
[  501.254890]  (detected by 3, t=21009 jiffies, g=1749, q=1077)
[  501.254890] Sending NMI from CPU 3 to CPUs 1:
[  501.266638] NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0xe/0x20
[  501.254890] rcu: rcu_sched kthread starved for 15759 jiffies! g1749 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[  501.254890] rcu:     Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[  501.254890] rcu: RCU grace-period kthread stack dump:
[  501.254890] task:rcu_sched       state:I stack:    0 pid:   12 ppid:     2 flags:0x00004000
[  501.254890] Call Trace:
[  501.254890]  __schedule+0x282/0x870
[  501.254890]  schedule+0x46/0xb0
[  501.254890]  schedule_timeout+0x8b/0x150
[  501.254890]  ? __next_timer_interrupt+0x110/0x110
[  501.254890]  rcu_gp_kthread+0x51b/0xbc0
[  501.254890]  ? rcu_cpu_kthread+0x190/0x190
[  501.254890]  kthread+0x11b/0x140
[  501.254890]  ? __kthread_bind_mask+0x60/0x60
[  501.254890]  ret_from_fork+0x22/0x30
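
The jiffies counts in the report are easier to reason about as wall-clock time. The following is a minimal sketch, not part of the original report, that pulls the two jiffies fields out of the lines above and converts them to seconds. It assumes the guest kernel ticks at 250 Hz (`CONFIG_HZ=250`), which has not been checked against this guest's kernel config, and it assumes the `t=` field counts jiffies since the stalled grace period began. The names `ASSUMED_HZ` and `summarize` are hypothetical.

```python
import re

# Assumption: the guest kernel ticks at 250 Hz (CONFIG_HZ=250). Confirm against
# the guest's /boot/config-$(uname -r) before trusting the conversions below.
ASSUMED_HZ = 250

# Two lines copied verbatim from the stall report in this issue.
REPORT = """\
[  501.254890]  (detected by 3, t=21009 jiffies, g=1749, q=1077)
[  501.254890] rcu: rcu_sched kthread starved for 15759 jiffies! g1749 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
"""

DETECTED = re.compile(r"\(detected by (\d+), t=(\d+) jiffies, g=(\d+), q=(\d+)\)")
STARVED = re.compile(r"kthread starved for (\d+) jiffies")


def summarize(report, hz=ASSUMED_HZ):
    """Extract the jiffies fields from a stall report and convert them to seconds."""
    detected = DETECTED.search(report)
    starved = STARVED.search(report)
    if detected is None or starved is None:
        raise ValueError("report does not contain the expected stall lines")
    return {
        "detecting_cpu": int(detected.group(1)),
        "grace_period": int(detected.group(3)),
        "stall_seconds": int(detected.group(2)) / hz,
        "kthread_starved_seconds": int(starved.group(1)) / hz,
    }


for field, value in summarize(REPORT).items():
    print(f"{field}: {value}")
```

Under the 250 Hz assumption, this puts the stall at roughly 84 seconds and the rcu_sched kthread starvation at roughly 63 seconds. Both figures scale inversely with the real tick rate.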

Other observations:

  • There are a few of these messages in the serial logs (at guest uptimes of 354.166, 417.186, 438.230, and 501.254 seconds).
  • Apart from the VM/migration work, the host machine was not especially heavily loaded during this time.
  • The messages are not correlated with migrations; the last couple of them occurred in the same Propolis server, a couple of minutes after the guest had been migrated into it.
  • This guest complained about TSC inaccuracy at about 65 seconds of uptime. This is likely because the prior runs of stress-ng --vm tanked migration performance by dirtying a bunch of pages (another case to investigate in #324, "Investigate/profile live migration performance").
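
To see how the stall messages are spaced, the uptimes listed in the first observation can be sorted and differenced. This is a rough sketch using only those four timestamps. The helper name `stall_intervals` is hypothetical, and the spacing on its own does not establish a cause.

```python
# Guest uptimes (in seconds) at which the stall messages appeared,
# as listed in the observations above.
UPTIMES = [501.254, 438.230, 417.186, 354.166]


def stall_intervals(uptimes):
    """Return the gaps, in seconds, between consecutive stall messages."""
    ordered = sorted(uptimes)
    return [round(later - earlier, 3) for earlier, later in zip(ordered, ordered[1:])]


print(stall_intervals(UPTIMES))
```

The gaps come out at about 63, 21, and 63 seconds. That spacing may reflect the kernel's stall-warning cadence (the usual default for `CONFIG_RCU_CPU_STALL_TIMEOUT` is 21 seconds) rather than independent events, but that has not been verified against this guest's configuration.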

@jmpesp saw a similar issue in local testing earlier this week, but that was without the bits needed to enable the interrupt state transfer implemented in #367. Unless I've missed something, that should have been enabled here (both the Propolis bits and the necessary bhyve bits were present).

This VM no longer seems to be producing any RCU complaints, but I'll hold it in its current state for now.

Labels: guest-os (Related to compatibility and/or functionality observed by guest software.)
