How a Virtual Machine Can Bring Down the Whole ESX Server

A coworker and I ran into a very interesting situation at a virtualization consulting customer's site the other day. We got an after-hours call from the customer, who said he had been working on the console of a new Windows 2008 virtual machine. He was trying to set the IP address on the NIC and accidentally chose the “bridge network adapters” setting. Afterwards, he was unable to reach anything on the internal network from this server, and several other VMs could not communicate with the internal network either.

My coworker connected via VPN just fine, but was unable to ping the vmhost2. He could ping the SBS server, one terminal server, and the ISA server. We discussed over the phone that the particular ESX server hosting the affected VMs must have somehow gotten isolated from the network. Sure enough, when my coworker checked the NIC status on vmhost1, it showed that all NICs connected to the LAN were disconnected. We decided to go onsite and check out what was going on.

On the way out, I realized what had happened. When the two NICs got bridged on that VM, the bridge created a loop, which must have looped a BPDU back to the switch and err-disabled the port. Once onsite, we confirmed that the port was down and that portfast was NOT enabled on that port.

So, the warning here is twofold. First, yes, a VM can take down the whole ESX server. Second, it's best to turn on portfast for ports connected to ESX servers. They don’t understand STP anyway.
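
For reference, here is a rough sketch of what checking and fixing this might look like on the switch. It assumes a Cisco IOS switch, which the err-disabled and portfast terminology suggests but which is not stated above. The interface name GigabitEthernet0/1 and the description text are made up for illustration, so substitute the port that actually connects to your ESX host.

```
! List any ports the switch has err-disabled, along with the reason
show interfaces status err-disabled

configure terminal
 ! Hypothetical port facing the ESX host
 interface GigabitEthernet0/1
  description Uplink to ESX host
  ! Use this form on an access port
  spanning-tree portfast
  ! On a trunk port carrying multiple VLANs, use this form instead:
  ! spanning-tree portfast trunk
  ! Bounce the port to bring it back from err-disabled
  shutdown
  no shutdown
 end
```

Which portfast form applies depends on whether the ESX uplink is an access port or an 802.1Q trunk, so check the existing port configuration before changing anything.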
