arm64 KVM problem when SCS is enabled #1096

@nathanchance

Thanks to the hard work of upstream developers, the Raspberry Pi 4 can be easily booted on mainline, which is rather neat since I now have an actual piece of hardware on which to run mainline kernels :)

One of the things I wanted to try was spawning a KVM guest with a clang-built kernel, as we have received a report of it not working when BTI is enabled: https://lore.kernel.org/linux-arm-kernel/20200615105524.GA2694@willie-the-truck/

It works fine when just building defconfig (which is how I verified ClangBuiltLinux/boot-utils#23):

$ src/boot-utils/boot-qemu.sh -a arm64 -k src/linux/out/arm64
...
+ timeout --foreground 3m unbuffer qemu-system-aarch64 -enable-kvm -cpu host -machine virt -append 'console=ttyAMA0 ' -display none -initrd /home/pi/src/boot-utils/images/arm64/rootfs.cpio -kernel /home/pi/src/linux/out/arm64/arch/arm64/boot/Image.gz -m 512m -nodefaults -serial mon:stdio
[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
[    0.000000] Linux version 5.8.0-rc5-00048-gf8456690ba8e (pi@raspberrypi) (clang version 12.0.0 (https://github.com/llvm/llvm-project 30c382a7c6607a7d898730f8d288768110cdf1d2), LLD 12.0.0 (https://github.com/llvm/llvm-project 30c382a7c6607a7d898730f8d288768110cdf1d2)) #1 SMP PREEMPT Wed Jul 15 22:15:55 MST 2020
[    0.000000] Machine model: linux,dummy-virt
[    0.000000] efi: UEFI not found.
[    0.000000] cma: Reserved 32 MiB at 0x000000005e000000
[    0.000000] NUMA: No NUMA configuration found
[    0.000000] NUMA: Faking a node at [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x5def7100-0x5def8fff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000] psci: probing for conduit method from DT.
[    0.000000] psci: PSCIv1.0 detected in firmware.
[    0.000000] psci: Using standard PSCI v0.2 function IDs
[    0.000000] psci: Trusted OS migration not required
[    0.000000] psci: SMC Calling Convention v1.1
[    0.000000] percpu: Embedded 23 pages/cpu s53912 r8192 d32104 u94208
[    0.000000] Detected PIPT I-cache on CPU0
[    0.000000] CPU features: detected: EL2 vector hardening
[    0.000000] ARM_SMCCC_ARCH_WORKAROUND_1 missing from firmware
[    0.000000] CPU features: detected: ARM errata 1165522, 1319367, or 1530923
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 129024
[    0.000000] Policy zone: DMA
[    0.000000] Kernel command line: console=ttyAMA0 
[    0.000000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
[    0.000000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 443264K/524288K available (13756K kernel code, 2188K rwdata, 7308K rodata, 1600K init, 484K bss, 48256K reserved, 32768K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000] rcu: 	RCU event tracing is enabled.
[    0.000000] rcu: 	RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=1.
[    0.000000] 	Trampoline variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] GICv2m: range[mem 0x08020000-0x08020fff], SPI[80:143]
[    0.000000] random: get_random_bytes called from start_kernel+0x1c8/0x384 with crng_init=0
[    0.000000] arch_timer: cp15 timer(s) running at 54.00MHz (virt).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
[    0.000003] sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns
[    0.000085] Console: colour dummy device 80x25
[    0.000139] Calibrating delay loop (skipped), value calculated using timer frequency.. 108.00 BogoMIPS (lpj=216000)
[    0.000145] pid_max: default: 32768 minimum: 301
[    0.000184] LSM: Security Framework initializing
[    0.000221] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
[    0.000232] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
[    0.001250] rcu: Hierarchical SRCU implementation.
[    0.001638] EFI services will not be available.
[    0.001697] smp: Bringing up secondary CPUs ...
[    0.001702] smp: Brought up 1 node, 1 CPU
[    0.001705] SMP: Total of 1 processors activated.
[    0.001713] CPU features: detected: 32-bit EL0 Support
[    0.001718] CPU features: detected: CRC32 instructions
[    0.001724] CPU features: detected: 32-bit EL1 Support
[    0.013238] CPU: All CPU(s) started at EL1
[    0.013267] alternatives: patching kernel code
[    0.014198] devtmpfs: initialized
[    0.015106] KASLR disabled due to lack of seed
[    0.015329] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.015339] futex hash table entries: 256 (order: 2, 16384 bytes, linear)
[    0.015961] pinctrl core: initialized pinctrl subsystem
[    0.016611] thermal_sys: Registered thermal governor 'step_wise'
[    0.016613] thermal_sys: Registered thermal governor 'power_allocator'
[    0.016682] DMI not present or invalid.
[    0.017002] NET: Registered protocol family 16
[    0.019912] DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
[    0.019996] DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[    0.020050] DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[    0.020088] audit: initializing netlink subsys (disabled)
[    0.020711] cpuidle: using governor menu
[    0.020795] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    0.020829] ASID allocator initialised with 65536 entries
[    0.021369] Serial: AMBA PL011 UART driver
[    0.024205] audit: type=2000 audit(0.020:1): state=initialized audit_enabled=0 res=1
[    0.027281] 9000000.pl011: ttyAMA0 at MMIO 0x9000000 (irq = 39, base_baud = 0) is a PL011 rev1
[    0.150693] printk: console [ttyAMA0] enabled
[    0.152523] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[    0.154174] HugeTLB registered 32.0 MiB page size, pre-allocated 0 pages
[    0.155763] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.157514] HugeTLB registered 64.0 KiB page size, pre-allocated 0 pages
[    0.159836] cryptd: max_cpu_qlen set to 1000
[    0.164319] ACPI: Interpreter disabled.
[    0.166166] iommu: Default domain type: Translated 
[    0.167526] vgaarb: loaded
[    0.168386] SCSI subsystem initialized
[    0.170162] usbcore: registered new interface driver usbfs
[    0.171563] usbcore: registered new interface driver hub
[    0.172885] usbcore: registered new device driver usb
[    0.174542] pps_core: LinuxPPS API ver. 1 registered
[    0.175747] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    0.177912] PTP clock support registered
[    0.179051] EDAC MC: Ver: 3.0.0
[    0.180512] FPGA manager framework
[    0.181427] Advanced Linux Sound Architecture Driver Initialized.
[    0.183407] clocksource: Switched to clocksource arch_sys_counter
[    0.185031] VFS: Disk quotas dquot_6.6.0
[    0.186041] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.187924] pnp: PnP ACPI: disabled
[    0.191669] NET: Registered protocol family 2
[    0.193179] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096 bytes, linear)
[    0.195275] TCP established hash table entries: 4096 (order: 3, 32768 bytes, linear)
[    0.197396] TCP bind hash table entries: 4096 (order: 4, 65536 bytes, linear)
[    0.199166] TCP: Hash tables configured (established 4096 bind 4096)
[    0.200930] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.202516] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.204339] NET: Registered protocol family 1
[    0.205685] RPC: Registered named UNIX socket transport module.
[    0.207127] RPC: Registered udp transport module.
[    0.208363] RPC: Registered tcp transport module.
[    0.209486] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.211030] PCI: CLS 0 bytes, default 64
[    0.212136] Unpacking initramfs...
[    0.231281] Freeing initrd memory: 3448K
[    0.246266] hw perfevents: enabled with armv8_pmuv3 PMU driver, 7 counters available
[    0.248349] kvm [1]: HYP mode not available
[    0.250401] Initialise system trusted keyrings
[    0.251703] workingset: timestamp_bits=44 max_order=17 bucket_order=0
[    0.256595] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.260710] NFS: Registering the id_resolver key type
[    0.262053] Key type id_resolver registered
[    0.263078] Key type id_legacy registered
[    0.264167] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
[    0.265901] 9p: Installing v9fs 9p2000 file system support
[    0.291619] Key type asymmetric registered
[    0.292640] Asymmetric key parser 'x509' registered
[    0.293852] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 245)
[    0.295680] io scheduler mq-deadline registered
[    0.296778] io scheduler kyber registered
[    0.301747] pl061_gpio 9030000.pl061: PL061 GPIO chip registered
[    0.304029] pci-host-generic 4010000000.pcie: host bridge /pcie@10000000 ranges:
[    0.305838] pci-host-generic 4010000000.pcie:       IO 0x003eff0000..0x003effffff -> 0x0000000000
[    0.308075] pci-host-generic 4010000000.pcie:      MEM 0x0010000000..0x003efeffff -> 0x0010000000
[    0.310214] pci-host-generic 4010000000.pcie:      MEM 0x8000000000..0xffffffffff -> 0x8000000000
[    0.312469] pci-host-generic 4010000000.pcie: ECAM at [mem 0x4010000000-0x401fffffff] for [bus 00-ff]
[    0.314760] pci-host-generic 4010000000.pcie: PCI host bridge to bus 0000:00
[    0.316568] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.317889] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
[    0.319385] pci_bus 0000:00: root bus resource [mem 0x10000000-0x3efeffff]
[    0.321194] pci_bus 0000:00: root bus resource [mem 0x8000000000-0xffffffffff]
[    0.322968] pci 0000:00:00.0: [1b36:0008] type 00 class 0x060000
[    0.326279] EINJ: ACPI disabled.
[    0.334756] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    0.337860] SuperH (H)SCI(F) driver initialized
[    0.339345] msm_serial: driver initialized
[    0.341133] cacheinfo: Unable to detect cache hierarchy for CPU 0
[    0.349370] loop: module loaded
[    0.350875] megasas: 07.714.04.00-rc1
[    0.352903] physmap-flash 0.flash: physmap platform flash device: [mem 0x00000000-0x03ffffff]
[    0.367656] 0.flash: Found 2 x16 devices at 0x0 in 32-bit bank. Manufacturer ID 0x000000 Chip ID 0x000000
[    0.370007] Intel/Sharp Extended Query Table at 0x0031
[    0.375502] Using buffer write method
[    0.376553] physmap-flash 0.flash: physmap platform flash device: [mem 0x04000000-0x07ffffff]
[    0.386353] 0.flash: Found 2 x16 devices at 0x0 in 32-bit bank. Manufacturer ID 0x000000 Chip ID 0x000000
[    0.388812] Intel/Sharp Extended Query Table at 0x0031
[    0.395426] Using buffer write method
[    0.396900] Concatenating MTD devices:
[    0.397828] (0): "0.flash"
[    0.398526] (1): "0.flash"
[    0.399190] into device "0.flash"
[    0.406102] libphy: Fixed MDIO Bus: probed
[    0.407769] tun: Universal TUN/TAP device driver, 1.6
[    0.409569] thunder_xcv, ver 1.0
[    0.410459] thunder_bgx, ver 1.0
[    0.411272] nicpf, ver 1.0
[    0.412661] hclge is initializing
[    0.413537] hns3: Hisilicon Ethernet Network Driver for Hip08 Family - version
[    0.415290] hns3: Copyright (c) 2017 Huawei Corporation.
[    0.416760] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
[    0.418468] e1000: Copyright (c) 1999-2006 Intel Corporation.
[    0.419939] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[    0.421353] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[    0.422829] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.6.0-k
[    0.424636] igb: Copyright (c) 2007-2014 Intel Corporation.
[    0.426000] igbvf: Intel(R) Gigabit Virtual Function Network Driver - version 2.4.0-k
[    0.427940] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
[    0.429590] sky2: driver version 1.30
[    0.431044] VFIO - User Level meta-driver version: 0.3
[    0.433371] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    0.435025] ehci-pci: EHCI PCI platform driver
[    0.436273] ehci-platform: EHCI generic platform driver
[    0.437653] ehci-orion: EHCI orion driver
[    0.438719] ehci-exynos: EHCI Exynos driver
[    0.439880] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    0.441403] ohci-pci: OHCI PCI platform driver
[    0.442540] ohci-platform: OHCI generic platform driver
[    0.443948] ohci-exynos: OHCI Exynos driver
[    0.445229] usbcore: registered new interface driver usb-storage
[    0.448317] rtc-pl031 9010000.pl031: registered as rtc0
[    0.449655] rtc-pl031 9010000.pl031: setting system clock to 2020-07-17T04:54:20 UTC (1594961660)
[    0.452277] i2c /dev entries driver
[    0.455814] sdhci: Secure Digital Host Controller Interface driver
[    0.457342] sdhci: Copyright(c) Pierre Ossman
[    0.458658] Synopsys Designware Multimedia Card Interface Driver
[    0.460685] sdhci-pltfm: SDHCI platform and OF driver helper
[    0.462812] ledtrig-cpu: registered to indicate activity on CPUs
[    0.465151] usbcore: registered new interface driver usbhid
[    0.466553] usbhid: USB HID core driver
[    0.469978] NET: Registered protocol family 17
[    0.471346] 9pnet: Installing 9P2000 support
[    0.472602] Key type dns_resolver registered
[    0.473856] registered taskstats version 1
[    0.474896] Loading compiled-in X.509 certificates
[    0.476812] input: gpio-keys as /devices/platform/gpio-keys/input/input0
[    0.480346] ALSA device list:
[    0.481195]   No soundcards found.
[    0.482276] uart-pl011 9000000.pl011: no DMA platform data
[    0.485104] Freeing unused kernel memory: 1600K
[    0.487532] Run /init as init process
Starting syslogd: OK
Starting klogd: OK
Running sysctl: OK
Saving random seed: [    0.541978] random: dd: uninitialized urandom read (512 bytes read)
OK
Starting network: OK
Linux version 5.8.0-rc5-00048-gf8456690ba8e (pi@raspberrypi) (clang version 12.0.0 (https://github.com/llvm/llvm-project 30c382a7c6607a7d898730f8d288768110cdf1d2), LLD 12.0.0 (https://github.com/llvm/llvm-project 30c382a7c6607a7d898730f8d288768110cdf1d2)) #1 SMP PREEMPT Wed Jul 15 22:15:55 MST 2020
Linux version 5.8.0-rc5-00048-gf8456690ba8e (pi@raspberrypi) (clang version 12.0.0 (https://github.com/llvm/llvm-project 30c382a7c6607a7d898730f8d288768110cdf1d2), LLD 12.0.0 (https://github.com/llvm/llvm-project 30c382a7c6607a7d898730f8d288768110cdf1d2)) #1 SMP PREEMPT Wed Jul 15 22:15:55 MST 2020
Stopping network: OK
Saving random seed: [    0.606922] random: dd: uninitialized urandom read (512 bytes read)
OK
Stopping klogd: OK
Stopping syslogd: OK
umount: devtmpfs busy - remounted read-only
umount: can't unmount /: Invalid argument
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system poweroff
[    2.635581] Flash device refused suspend due to active operation (state 20)
[    2.637358] Flash device refused suspend due to active operation (state 20)
[    2.639185] reboot: Power down
+ RET=0
+ set +x

However, as soon as I enable CONFIG_SHADOW_CALL_STACK, attempting to spawn a KVM guest kills the machine: I see the qemu-system-aarch64 command but no other output, then my mosh session disconnects and the green light on the Pi stops flashing. I am unsure how to get the kernel log from a previous boot on "regular" Linux (I know that Android has pstore), so I am not sure how to debug this further.
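
For reference, a minimal sketch of one way to turn on CONFIG_SHADOW_CALL_STACK on top of the defconfig build and confirm that it stuck. The source path and output directory mirror the boot-qemu.sh invocation above; the LINUX_SRC variable, the config_enabled helper, and the LLVM=1 build are assumptions for illustration, not necessarily what was run here.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel config option is built in (=y).
# Usage: config_enabled <path-to-.config> <OPTION_WITHOUT_CONFIG_PREFIX>
config_enabled() {
    grep -qx "CONFIG_$2=y" "$1"
}

# Assumed locations, modelled on the paths in the boot log above.
linux_src=${LINUX_SRC:-$HOME/src/linux}
out_dir=$linux_src/out/arm64
config=$out_dir/.config

# Only touch the tree if it actually exists and has been configured.
if [ -x "$linux_src/scripts/config" ] && [ -f "$config" ]; then
    # scripts/config ships with the kernel tree; -e enables an option.
    "$linux_src/scripts/config" --file "$config" -e SHADOW_CALL_STACK
    # Let Kconfig resolve dependencies (assumes a clang/LLVM=1 build).
    make -C "$linux_src" ARCH=arm64 LLVM=1 O="$out_dir" olddefconfig
    if config_enabled "$config" SHADOW_CALL_STACK; then
        echo "SCS enabled in $config"
    else
        echo "SCS not enabled (unmet dependency?)"
    fi
fi
```

Checking the resulting .config after olddefconfig matters because Kconfig silently drops options whose dependencies are not met, for example when the compiler is not clang.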

I am going to do some research to see whether this is a clang issue or something rooted in the kernel. Attempting to bisect probably won't prove fruitful for two reasons: SCS was only merged in 5.8-rc1, and Raspberry Pi 4 support has only been good for the past couple of kernel versions.

cc @samitolvanen

Labels

[ARCH] arm64: This bug impacts ARCH=arm64
[BUG] linux: A bug that should be fixed in the mainline kernel.
[FIXED][LINUX] 5.8: This bug was fixed in Linux 5.8
