aramprice
left a comment
Seems like a reasonable change.
I'm curious why folks haven't seen this previously. Was there some other process or ordering that ensured the mount point was already created?
Thank you for looking into this! I hit this while testing the CPI changes in cloudfoundry/bosh-google-cpi-release#382. I was changing machine_type (N2 -> N4), disk_type (pd-ssd -> hyperdisk-balanced) and disk_size (1GB -> 4GB) at the same time. The disk type change goes through the CPI's new snapshot-and-recreate path, which creates a bigger disk but preserves the old partition table from the snapshot. From my understanding, this never triggered on the agent side because normal BOSH disk operations don't produce a disk where the partition doesn't fill the whole disk. The new snapshot-and-recreate path is what creates that condition. I also updated the description to make it clearer.
Encountered this when migrating from N2 to N4 machine types on GCP while also changing disk type from pd-ssd to hyperdisk-balanced and increasing disk size (1GB/2GB -> 4GB).
AdjustPersistentDiskPartitioning has a path that temporarily mounts the persistent disk to grow the filesystem after resizing the partition. Unlike MountPersistentDisk, it doesn't create the mount point directory first. On a freshly recreated VM, /var/vcap/store doesn't exist yet, so the mount fails.

This was never triggered because it requires a very specific condition: the disk must already have a partition that doesn't fill the whole disk. That doesn't happen in normal operations, but when the CPI changes disk types via snapshot-and-recreate, it copies the old (smaller) partition table onto a larger disk, which is exactly what triggers it.
Fix: add MkdirAll before the mount, same as MountPersistentDisk already does.