Arch boot recovery


I upgraded two identical laptops. I took one with me, and it worked without any problems. Today I was puzzled to find the second machine unable to boot.

...
Waiting 10 seconds for device [...]
ERROR: mounting '/dev/mapper/luksdev' on real root

The initramfs dropped the computer to an emergency shell, but it wasn’t helpful: the keyboard didn’t work, not even Ctrl-Alt-Del. Normally that means data recovery from a live boot (faster than downloading the backup), a half-scripted reinstall, and restoring the files (~2-3h total).

I was not asked for the disk encryption password, so the partition couldn’t have been unlocked. An attempt to mount from /dev/mapper/ without unlocking anything first smelled like a LUKS problem.

With a dysfunctional emergency shell, I tried booting from a different entry, without luck. I grabbed my keyring of USB drives. The one that was supposed to have Fedora on it booted into Arch instead.

Dreading the time a reinstall would sink, I wondered whether, this time, I could recover the system instead. Could the boot environment be fixed without any serious GRUB magician skills?

Having already searched the web for the errors, I gave a few commands a try from the live system. The plan: just chroot in, and go:

# lsblk
...
nvme0n1     259:0    0  1.8T  0 disk
├─nvme0n1p1 259:1    0  511M  0 part
└─nvme0n1p2 259:2    0  1.8T  0 part

Judging by the size, the first one is boot, and the second root.
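
Size is only a hint. A minimal sketch of a less guessy check, assuming the usual signatures in the FSTYPE column of lsblk (vfat for an EFI-style boot partition, crypto_LUKS for the encrypted container). guess_role is a made-up helper name, and the role labels are my own:

```shell
#!/bin/sh
# Map a filesystem type string (as printed in the FSTYPE column of
# `lsblk -f`) to the role it most likely plays on a layout like this one.
# guess_role is a hypothetical helper, not part of any tool.
guess_role() {
    case "$1" in
        vfat)           echo "boot/ESP" ;;
        crypto_LUKS)    echo "encrypted container" ;;
        btrfs|ext4|xfs) echo "plain filesystem" ;;
        "")             echo "unknown (no signature found)" ;;
        *)              echo "other ($1)" ;;
    esac
}

# Usage on the live system (needs root; device names as in the lsblk output):
#   guess_role "$(lsblk -dno FSTYPE /dev/nvme0n1p1)"
#   guess_role "$(lsblk -dno FSTYPE /dev/nvme0n1p2)"
```

For the LUKS side alone, `cryptsetup isLuks /dev/nvme0n1p2` answers the same question through its exit status.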

# cryptsetup luksOpen /dev/nvme0n1p2 main
# mount /dev/mapper/main /mnt
# mount /dev/nvme0n1p1 /mnt/boot
mount: boot: mount point does not exist.
# mkdir /mnt/boot
# mount /dev/nvme0n1p1 /mnt/boot
# ls /mnt
boot @ @home @log @pkg @.snapshots

I forgot to specify the Btrfs subvolume…

# umount /mnt/boot
# rm -d /mnt/boot
# umount /mnt
^R main
# mount /dev/mapper/main -o subvol=@ /mnt
# ls /mnt
bin boot dev etc f home lib lib64 mnt opt proc r root run sbin srv sys tmp usr var
^R mnt/boot
# mount /dev/nvme0n1p1 /mnt/boot
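
The subvolume slip is easy to catch before mounting anything on top. A rough sketch, assuming that a real root tree always has etc and usr directories while the top level of the Btrfs filesystem only holds entries like @ and @home. looks_like_root is a hypothetical helper, and the check is a heuristic, not a guarantee:

```shell
#!/bin/sh
# Succeed if the directory looks like a Linux root tree rather than the
# top level of a Btrfs filesystem that only contains subvolumes.
# looks_like_root is a hypothetical helper; the etc/usr test is a heuristic.
looks_like_root() {
    [ -d "$1/etc" ] && [ -d "$1/usr" ]
}

# Usage on the live system, right after mounting the root device:
#   looks_like_root /mnt || echo "no root tree at /mnt; wrong subvolume?" >&2
```

Run right after the first mount, it would have flagged the missing subvolume option before the stray /mnt/boot directory was created.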

Maybe there was an easier fix. I recalled not reading the upgrade output on the failing laptop. Maybe something had gone wrong.

# iwctl station wlan0 scan
# iwctl station wlan0 connect '...'
# arch-chroot /mnt
# pacman -Syu
* mkinitcpio stuff, the slowest part in upgrades *
#
^D
# reboot

It had been a few days since I last updated, so simply updating again should rebuild the boot environment anyway. The logs agreed with me.
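
For reference, a sketch of how the relevant log lines could be pulled out, assuming pacman's default log location (/var/log/pacman.log inside the installed system) and the stock linux kernel package. kernel_log_lines is a hypothetical helper and the pattern is only a rough filter:

```shell
#!/bin/sh
# Print pacman log lines that mention a kernel upgrade or mkinitcpio.
# kernel_log_lines is a hypothetical helper; the pattern assumes the stock
# `linux` package (the trailing space skips linux-firmware and friends).
kernel_log_lines() {
    grep -E 'upgraded linux |mkinitcpio' "$1"
}

# Usage from the live system, with the root subvolume mounted on /mnt:
#   kernel_log_lines /mnt/var/log/pacman.log | tail -n 20
```

If the last kernel upgrade is not followed by any mkinitcpio lines, the image was probably never rebuilt.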

The naive solution worked: the laptop booted normally after the upgrade. Had there been no pending upgrade, I could’ve tried reinstalling the kernel package instead.
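
A small sketch of that fallback, assuming the stock linux package and its default image names in /boot (vmlinuz-linux and initramfs-linux.img). check_boot_images is a hypothetical helper, not something pacman or mkinitcpio ships:

```shell
#!/bin/sh
# Report missing or empty kernel and initramfs images in a boot directory.
# The file names assume the stock `linux` package; check_boot_images is a
# hypothetical helper.
check_boot_images() {
    status=0
    for f in vmlinuz-linux initramfs-linux.img; do
        if [ ! -s "$1/$f" ]; then
            echo "missing or empty: $1/$f"
            status=1
        fi
    done
    return "$status"
}

# Usage inside the chroot; either command below should rebuild the images:
#   check_boot_images /boot || mkinitcpio -P   # regenerate all presets
#   pacman -S linux                            # or reinstall the kernel package
```

The check only looks for files that exist and are non-empty. An image that is present but built without the right hooks would still pass, so regenerating is the safer move when in doubt.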