Permanent Device Loss (PDL) and All-Paths-Down (APD)

VM Component Protection (VMCP) is a storage-related vSphere HA feature, introduced in vSphere 6.0, that protects virtual machines from storage accessibility issues.

VMCP can handle two types of storage failure:

PDL: Occurs when the storage array issues a SCSI sense code indicating that the device is permanently unavailable (for example, a failed LUN).

APD: Usually related to an underlying storage or networking issue. It differs from a PDL because the host doesn’t have enough information to determine whether the device loss is temporary or permanent.
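The distinction between the two conditions can be sketched in a few lines of Python. This is only an illustration, not how ESXi itself decides: the function name `classify_device_loss`, the `all_paths_down` flag, and the assumed log format ("Valid sense data: 0x5 0x25 0x0") are all hypothetical choices for this example. The sense triplet 0x5/0x25/0x0 is the standard SCSI "LOGICAL UNIT NOT SUPPORTED" code; the complete list of codes that count as PDL should be taken from the VMware and array vendor documentation.

```python
import re

# Assumption: sense triplets (sense key, ASC, ASCQ) treated as PDL in this sketch.
# 0x5/0x25/0x0 is the standard SCSI "LOGICAL UNIT NOT SUPPORTED" code; this set
# is deliberately incomplete.
PDL_SENSE_CODES = {(0x5, 0x25, 0x0)}

# Assumption: the log line carries the sense data in this textual form.
SENSE_RE = re.compile(
    r"Valid sense data:\s*(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)"
)


def classify_device_loss(log_line, all_paths_down):
    """Return 'PDL', 'APD', or 'OK' for one log line plus a path-state flag."""
    match = SENSE_RE.search(log_line)
    if match:
        sense = tuple(int(group, 16) for group in match.groups())
        if sense in PDL_SENSE_CODES:
            # The array explicitly reported the device as gone.
            return "PDL"
    if all_paths_down:
        # No explicit sense code: the host cannot tell temporary from permanent.
        return "APD"
    return "OK"
```

For example, `classify_device_loss("H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x25 0x0.", True)` returns `"PDL"`, while the same call with a line that carries no sense data returns `"APD"`.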

The following VMware KB article provides the information needed to drill down into the issue, along with the resolution steps:

https://kb.vmware.com/s/article/2004684

[source: VMware]

Analyzing ESXi DUMP files

We are all aware of the Windows BSOD (Blue Screen Of Death), but what about the PSOD (Purple Screen Of Death)?

A PSOD is a fatal crash of a VMware ESX/ESXi host that kills all virtual machines running on it. The host displays a diagnostic screen with white type on a purple background.

The PSOD also generates a DUMP file, so that administrators can drill down into the issue and carry out a proper root cause analysis (RCA).

Before jumping into the DUMP file analysis, it is always recommended to analyze the ESXi log files.

If the issue is related to the host system, you can analyze the following files:

  • VMkernel summary – /var/log/vmksummary.log
  • ESXi host agent log – /var/log/hostd.log

With the help of these log files, we can easily identify whether a DUMP file has been generated or not.

If a DUMP file has been generated, carry out the following steps:
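As a rough illustration of the log check, the Python sketch below scans log lines for text that suggests a core dump was written. The marker string "Core dump found" is an assumption about what vmksummary.log reports after a PSOD, and the helper names `find_dump_evidence` and `scan_log_file` are hypothetical. Adjust the markers to whatever your own logs actually contain.

```python
# Assumption: marker strings that hint at a generated dump; not an exhaustive list.
DUMP_MARKERS = ("Core dump found",)


def find_dump_evidence(lines, markers=DUMP_MARKERS):
    """Return the log lines that contain any marker (case-insensitive)."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(marker.lower() in lowered for marker in markers):
            hits.append(line.rstrip("\n"))
    return hits


def scan_log_file(path, markers=DUMP_MARKERS):
    """Scan one log file; undecodable bytes are replaced instead of raising."""
    with open(path, "r", errors="replace") as handle:
        return find_dump_evidence(handle, markers)
```

Running `scan_log_file` against a copy of /var/log/vmksummary.log returns the matching lines, or an empty list if none of the markers appear.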
