Operating System Corruption due to snapshot consolidation

In some cases, VMware snapshot consolidation can corrupt a VM's virtual disks. I ran into this issue in an ESXi 6.5 environment. After researching it, I found that it is a known issue in some VMware versions. The symptoms are as follows:

Symptoms:

  • Applications such as databases may report block-level data inconsistency.
  • Guest operating systems may report file system metadata inconsistencies.
  • The VM fails to boot when it is running from an SEsparse snapshot. (SEsparse is a snapshot format introduced in vSphere 5.5 for large disks, and is the preferred format for all snapshots in vSphere 6.5 and above with VMFS-6.)

Affected Environments:

  • VMFS-5 or NFS datastores: VMs with virtual disks >2 TB and snapshots. On VMFS-5 and NFS, the SEsparse format is used for virtual disks that are 2 TB or larger.
  • VMFS-6 datastores: VMs with snapshots. SEsparse is the default format for all snapshots on VMFS-6 datastores.

Impacted vSphere releases:

  • vSphere 6.5 and above with VMFS-6: any VM with snapshots.
  • vSphere 5.5 and above: VMs with virtual disks >2 TB that have snapshots.

So basically, a VM with snapshots is at risk of data corruption if it sits on a VMFS-6 datastore (regardless of disk size), or if it sits on a VMFS-5 or NFS datastore and has a virtual disk larger than 2 TB.
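The exposure rules above can be sketched as a small check. This is only an illustration of the conditions listed in this post, not a VMware tool: the `VirtualDisk` class, its field names, and the `uses_sesparse_snapshot` function are hypothetical, the 2 TB threshold is treated as 2 TiB and as inclusive ("2 TB or larger"), and the check does not look at the ESXi patch level.

```python
from dataclasses import dataclass

# Assumption: "2 TB" is interpreted as 2 TiB, and the boundary is inclusive.
TWO_TB_BYTES = 2 * 1024**4


@dataclass
class VirtualDisk:
    """Hypothetical description of one virtual disk (not a VMware API type)."""
    datastore_type: str  # "VMFS-5", "VMFS-6", or "NFS"
    size_bytes: int
    has_snapshot: bool


def uses_sesparse_snapshot(disk: VirtualDisk) -> bool:
    """Return True if the disk runs on an SEsparse snapshot per the rules above."""
    if not disk.has_snapshot:
        # Without a snapshot there is no SEsparse delta disk involved.
        return False
    if disk.datastore_type == "VMFS-6":
        # SEsparse is the default format for all snapshots on VMFS-6.
        return True
    if disk.datastore_type in ("VMFS-5", "NFS"):
        # On VMFS-5 and NFS, SEsparse is only used for large disks.
        return disk.size_bytes >= TWO_TB_BYTES
    # Unknown datastore type: not covered by the rules in this post.
    return False
```

For example, `uses_sesparse_snapshot(VirtualDisk("VMFS-6", 40 * 1024**3, True))` flags a small 40 GiB disk, because on VMFS-6 the disk size does not matter. A disk that is flagged is only potentially affected; whether the host already carries the fix still has to be checked separately.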

The issue has been resolved starting with the following releases:

  • vSphere 6.7 Update 1 
  • vSphere 6.5 Patch
  • vSphere 6.0 Patch

As a workaround, “IO coalescing” for SEsparse can be disabled at the host level.

For more information, please refer to the KB: https://kb.vmware.com/s/article/59216

[source: VMware Knowledge Base]