Occasionally a virtual machine on a VMware ESXi host freezes, and it cannot be powered off or restarted from the vSphere console by any means. Rebooting the entire ESXi host because of a single virtual machine is rarely advisable (especially if you have only one ESXi host, or the remaining servers in the DRS cluster cannot absorb the extra load of the virtual machines from the host being restarted). In this post I’ll cover how to force kill an unresponsive (hung) virtual machine on a VMware ESXi host.
If the virtual machine process on the ESXi server freezes, it stops responding to the vCenter Reset/Power Off commands and returns one of the following errors to any action:
- Another task is already in progress;
- The virtual machine might be performing concurrent operations. Actions: Complete the concurrent operation and retry the power-off operation;
- The virtual machine is in an invalid state;
- The attempted operation cannot be performed in the current state.
In such cases, you can manually kill the virtual machine process on the ESXi host from the ESXi Shell or PowerCLI command prompt.
First you need to determine on which ESXi host the hung virtual machine is running. To do this, find the VM in the vSphere Client interface. The name of the ESXi host on which the VM is running is shown on the Summary tab in the Related Objects -> Host section.
Next, SSH access must be enabled on that ESXi host. You can enable it from the vSphere Client: click the ESXi host name and go to Configure -> Services -> SSH -> Start.
Now you can connect to this host via SSH (for example, with the PuTTY client) and list the VMs running on the ESXi host:
esxcli vm process list
Copy the “World ID” of the problem virtual machine.
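If you prefer not to copy the World ID by hand, it can be extracted from the listing. The following is a minimal sketch, not a tested production script. It assumes that each VM block in the output of esxcli vm process list contains a "World ID:" line followed later by a "Display Name:" line, and the helper name world_id_for is hypothetical.

```shell
# Hypothetical helper: print the World ID of the VM whose Display Name is $1.
# Reads the output of `esxcli vm process list` on stdin.
# Assumption: within each VM block, "World ID:" appears before "Display Name:".
world_id_for() {
    awk -v name="$1" '
        # Remember the most recent World ID seen.
        /^[[:space:]]*World ID:/ { wid = $NF }
        # When the Display Name matches, print the remembered World ID.
        /^[[:space:]]*Display Name:/ {
            sub(/^[[:space:]]*Display Name:[[:space:]]*/, "")
            if ($0 == name) print wid
        }
    '
}
```

On the host it could be used as: esxcli vm process list | world_id_for web1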
To terminate the process of a hung virtual machine on an ESXi host, use the following command:
esxcli vm process kill --type=[soft,hard,force] --world-id=WorldNumber
There are three kill types for the VM process:
- Soft – the safest way to kill the VMX process (similar to kill -SIGTERM);
- Hard – immediate termination of the VM process (kill -9);
- Force – the harshest way to stop the VM process. Use it only as a last resort, when nothing else helps.
Make sure that there are no active snapshots, backups, or similar tasks for the VM, and that the VM is not in the “Virtual Machine disks consolidation is needed” state. Otherwise, you can break your VM and will have to restore it from backup.
Let’s try to softly stop the VM with the specified ID:
esxcli vm process kill --type=soft --world-id=20598249
The VM should be powered off.
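If the soft kill does not work, the same command can be repeated with the harsher types. The following is a minimal sketch of that escalation, assuming a fixed wait between attempts is acceptable. The function name kill_vm_escalating and the KILL_WAIT variable are hypothetical, and the check simply looks for the World ID in the process list.

```shell
# Hypothetical helper: try soft, then hard, then force for World ID $1.
# KILL_WAIT (seconds, default 30) is an assumed pause before re-checking.
kill_vm_escalating() {
    wid="$1"
    for type in soft hard force; do
        esxcli vm process kill --type="$type" --world-id="$wid"
        sleep "${KILL_WAIT:-30}"
        # If the World ID is no longer listed, the VM process is gone.
        if ! esxcli vm process list | grep -q "World ID: $wid\$"; then
            echo "stopped with type=$type"
            return 0
        fi
    done
    echo "still running after force" >&2
    return 1
}
```

For example: kill_vm_escalating 20598249. The precautions about snapshots, backups, and disk consolidation apply before running it.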
You can also stop the frozen virtual machine using PowerCLI (this is convenient because, when connected to vCenter, you don’t need to find the host on which the VM is running or enable SSH on it). Check that the VM is running:
Get-VM "web1" | Select-Object Name,PowerState
Force stop the VM process with the command:
Stop-VM -VM "web1" -Kill -Confirm:$false
You can also stop an unresponsive VMware virtual machine using the esxtop utility.
Open an SSH session, run esxtop, press “c” to display CPU resources and then SHIFT+V to display only virtual machine processes.
Then press “f” (to select the fields to be displayed), “c” (to display the LWID, the Leader World ID) and then press ENTER.
In the Name column, find the virtual machine to be stopped and note its LWID number in the corresponding column.
Now you have to press “k” (kill) and enter the LWID number of the virtual machine that you want to force shut down.
The last way to “hard” power off a VM is to use the kill tool. This method stops not only the VM, but also all of its child processes.
Get the parent process ID of the VM:
ps | grep "web2"
Kill the VM process:
kill -9 24288474
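The lookup of the parent process ID can be narrowed down to a single value. The following is a minimal sketch under one assumption that you should verify on your ESXi version: that the second column of the ps output holds the parent (cartel) ID shared by all threads of the VM. The helper name vmx_parent_ids is hypothetical.

```shell
# Hypothetical helper: print the unique parent IDs of ps lines matching $1.
# Reads `ps` output on stdin.
# Assumption: column 2 of ESXi `ps` output is the parent (cartel) ID.
vmx_parent_ids() {
    grep -- "$1" | awk '{ print $2 }' | sort -u
}
```

It could be used as: ps | vmx_parent_ids web2. If more than one ID is printed, the name matched more than one VM, so check the list before passing anything to kill.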
After such a “hard reset”, the guest OS will boot in recovery mode. A Windows guest, for example, will display its recovery screen at startup.