Overview
In some Nextgen Gateway environments, you may encounter volume mount failures with `mkfs.ext4` errors, typically during persistent volume setup with Longhorn or another CSI driver.
The pod fails to start because the volume cannot be mounted: filesystem creation fails, often due to corrupted metadata or I/O errors on the underlying block device.
Symptoms
- Pod remains in `ContainerCreating` or `Init` state
- Volumes fail to attach or format correctly
- `kubectl describe pod` shows repeated `FailedMount` warnings
How to Identify the Issue
Run the following command to inspect the pod:
```shell
kubectl describe pod <pod-name> -n <namespace>
```

Sample output (truncated for clarity):

```
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 5m43s (x953 over 8h) kubelet (combined from similar events): MountVolume.MountDevice failed for volume "pvc-aa342bf3-ac19-4d06-80c6-c307ba47f190" : rpc error: code = Internal desc = format of disk "/dev/longhorn/pvc-aa342bf3-ac19-4d06-80c6-c307ba47f190" failed: type:("ext4") target:("/var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/ab9e1a1eda52e31ff54a34d2515152637055932de1b4b0f4c68f8f411b370efe/globalmount") options:("defaults") errcode:(exit status 1) output:(mke2fs 1.47.0 (5-Feb-2023)
Warning: could not erase sector 2: Input/output error
Creating filesystem with 76800 4k blocks and 76800 inodes
Filesystem UUID: 4ae0356b-c34f-4c0c-bd74-45601ce70ee5
Superblock backups stored on blocks:
32768
Allocating group tables: done
Warning: could not read block 0: Input/output error
Warning: could not erase sector 0: Input/output error
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: mkfs.ext4: Input/output error while writing out and closing file system)
```

Root Cause
This error typically indicates a low-level disk I/O issue when formatting the Longhorn volume with `mkfs.ext4`. Common causes include:
- Disk corruption or bad sectors on the underlying storage
- Node hardware issues (e.g., failing SSD/HDD)
- Longhorn replica corruption or degraded volume
- Improper node shutdown or power loss
As a result, `mkfs.ext4` fails during volume formatting and the pod cannot mount the persistent volume.
Resolution Steps
Follow these steps to recover the volume:
Step 1: Wipe the Beginning of the Disk
Clear any existing filesystem signatures or corrupted metadata:
```shell
sudo dd if=/dev/zero of=/dev/sdX bs=1M count=100
```

Warning: this erases data at the beginning of the disk. Replace /dev/sdX with the correct device and make sure the volume is no longer in use.

This command is often used to:
- Remove corrupted partition tables or filesystem metadata
- Prepare a disk for reformatting
- Resolve certain I/O errors during volume mount or format
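The wipe-then-check sequence above can be rehearsed safely on a file-backed image before touching a real device. In this sketch, /tmp/demo-disk.img is a hypothetical stand-in for /dev/sdX; `blkid -p` does a low-level probe (bypassing the blkid cache) for a filesystem signature before and after the wipe:

```shell
#!/bin/sh
# Safe rehearsal of Step 1 on a file-backed image -- nothing here touches a real disk.
# /tmp/demo-disk.img is a hypothetical stand-in for /dev/sdX.
set -e
IMG=/tmp/demo-disk.img

dd if=/dev/zero of="$IMG" bs=1M count=64 status=none   # 64 MiB blank image
mkfs.ext4 -F -q "$IMG"                                 # -F: allow a regular file as target
TYPE_BEFORE=$(blkid -p -o value -s TYPE "$IMG")        # low-level probe, no cache
echo "before wipe: $TYPE_BEFORE"

# Same idea as the dd command above: zero the start of the "disk",
# destroying the superblock and any filesystem signature.
dd if=/dev/zero of="$IMG" bs=1M count=10 conv=notrunc status=none
TYPE_AFTER=$(blkid -p -o value -s TYPE "$IMG" || true)
echo "after wipe: ${TYPE_AFTER:-no filesystem signature}"

rm -f "$IMG"
```

On a real device, `wipefs -a /dev/sdX` is a more targeted way to remove known filesystem signatures; the dd approach also clears corrupted metadata that wipefs does not recognize.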
Step 2: Format the Device
Attempt to format the volume again using `mkfs.ext4`:

```shell
sudo mkfs.ext4 /dev/sdX
```

If formatting succeeds, the volume should now be mountable by the pod.
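To confirm that a format produced a clean filesystem, `fsck.ext4 -n` runs a read-only check. This sketch again uses a hypothetical file-backed image (/tmp/demo-disk.img) rather than a real device:

```shell
#!/bin/sh
# Rehearsal of Step 2 on a file-backed image; /tmp/demo-disk.img is a
# hypothetical stand-in for /dev/sdX (no real disk is modified).
set -e
IMG=/tmp/demo-disk.img

dd if=/dev/zero of="$IMG" bs=1M count=64 status=none
mkfs.ext4 -F -q "$IMG"          # format; -F allows a regular file as the target
fsck.ext4 -n "$IMG"             # read-only check; exit 0 and "clean" mean success
FSCK_STATUS=$?

rm -f "$IMG"
```

If `mkfs.ext4` still reports Input/output errors after the wipe, the underlying device is likely unhealthy; proceed to Step 3.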
Step 3: Restart the VM (if issue persists)
If the above steps don’t resolve the issue:
- Restart the affected VM or node.
- After reboot, retry the pod deployment.
Summary
Volume mount errors due to `mkfs.ext4` failures are typically caused by residual or corrupted data on the disk. Wiping the beginning of the block device and reformatting usually resolves the problem; if not, a VM reboot may help reset the volume state.