Note: I’ve recently rebuilt my lab and found that this post needed significant changes, so I’m rewriting it and splitting it into two parts.
- For reference: My physical Lab setup
- Continue on with the following parts:
This guide documents how I set up my working nested ESX/NSX lab after running into numerous problems getting the nested ESX 6.5 appliance to work with NSX using existing documentation. One issue is that most nested ESX/NSX documentation covers ESX 5.5; much has changed since then, and it’s not immediately clear which parts to skip and which to use.
This guide covers items specific to nested ESXi and NSX. Since this is a lab, only a single NSX controller is used to save resources, but if you have the capacity, all three can be installed.
Thanks to William Lam’s ESX appliance and the blog entries listed below, I was able to get started with nested ESXi, but I continued to have random networking issues: there are numerous moving parts, and most of the physical networking is not described. My lab tries to stay close to the production recommendation of keeping networking, storage, and vMotion on separate VLANs, and it maintains that separation as much as possible.
I’m moving away from William Lam’s ‘vGhetto ESXi’ appliance, as most of the features it added are now handled directly by ESXi 6.5 and later. For example:
- The vGhetto ESXi appliance has the ESX MAC learning dvFilter installed, which conflicts with the newer method of using ESX-Learnswitch on the physical host.
- The ESXi 6.0+ installer now has vmtools pre-installed, so that feature of the appliance isn’t needed anymore. When ESXi detects that it is running as a VM, vmtools is activated automatically.
- The ESXi cloning method referenced in link #3 caused numerous issues with dropped packets on vMotion and NFS, so I’m not recommending cloning for ESXi 6.5. Creating new ESXi hosts with a fresh ISO install had no issues. Cloning is untested for ESXi 6.7.
- ESXi 6.7 and later have the ESX-Learnswitch VIB pre-installed, so skip that install on the physical host.
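The version-dependent points above can be summarized as a small decision helper. This is only a sketch of the rules as described in this post: the function name `lab_plan`, its parameters, and the dictionary keys are made up for illustration and are not part of any VMware tooling.

```python
def parse_version(version):
    """Turn a version string such as '6.5' or '6.7.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def lab_plan(physical_version, nested_version):
    """Return which manual steps this post suggests for the given ESXi versions.

    Hypothetical helper: it only encodes the rules listed in this post.
    """
    physical = parse_version(physical_version)
    nested = parse_version(nested_version)
    return {
        # ESXi 6.7 and later ship with the ESX-Learnswitch VIB pre-installed.
        "install_learnswitch_on_physical": physical < (6, 7),
        # The ESXi 6.0+ installer already includes vmtools.
        "install_vmtools_in_nested": nested < (6, 0),
        # Cloning caused dropped packets on 6.5 and is untested on 6.7,
        # so a fresh ISO install is always suggested here.
        "build_nested_from_iso": True,
    }


if __name__ == "__main__":
    for physical, nested in [("6.5", "6.5"), ("6.7", "6.5")]:
        print(physical, nested, lab_plan(physical, nested))
```

With a 6.5 physical host the helper reports that ESX-Learnswitch still has to be installed; with 6.7 it does not.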
I ran into a few other issues along the way. Hopefully these notes will save you time if you hit them too.
- There is an issue (link #4) with nested ESXi on a host that is also running NSX, which causes loss of connectivity over VXLAN tunnels. The easiest solution, and the one I chose, is to remove NSX from the physical host. NSX Manager can still run on the physical host. The other advantage is that vCenter and other VMs don’t need to be excluded from the NSX firewall (just run all management on the physical host).
- Some configurations of dvs portgroups (with ESX-Learnswitch) caused the nested ESX NIC to fail to load on boot. Creating new dvs portgroups fixed this issue.
- The cloning method (link #3) for ESXi VMs appears to create new MAC addresses only for the vmk0 (Mgmt) vmkernel interface; duplicate MACs appear on the other vmkernel interfaces, causing intermittent network issues. I have since removed the other vmkernel NICs from my ESX 6.5 template.
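To illustrate the duplicate-MAC problem from the last item, here is a minimal Python sketch that flags MAC addresses shared by more than one vmkernel interface. The host names, the MAC values, and the `find_duplicate_macs` helper are all invented for the example. In a real lab the inventory would have to be collected from each nested host first, for instance by copying the MAC column from `esxcli network ip interface list`.

```python
from collections import defaultdict


def find_duplicate_macs(inventory):
    """Return {mac: [(host, vmk), ...]} for MACs used by more than one vmkernel NIC.

    `inventory` maps a host name to a {vmk_name: mac_address} dictionary.
    """
    owners_by_mac = defaultdict(list)
    for host, vmks in inventory.items():
        for vmk, mac in vmks.items():
            # Normalize case so 'AA:BB' and 'aa:bb' count as the same address.
            owners_by_mac[mac.lower()].append((host, vmk))
    return {mac: owners for mac, owners in owners_by_mac.items() if len(owners) > 1}


# Hypothetical inventory: two cloned hosts whose vmk1 kept the template's MAC.
inventory = {
    "esx-nested-01": {"vmk0": "00:50:56:AA:00:01", "vmk1": "00:50:56:66:00:10"},
    "esx-nested-02": {"vmk0": "00:50:56:AA:00:02", "vmk1": "00:50:56:66:00:10"},
}

duplicates = find_duplicate_macs(inventory)
for mac, owners in duplicates.items():
    print(f"{mac} is used by: {owners}")
```

Any MAC reported here points at a vmkernel interface that was copied from the template rather than regenerated, which matches the symptom described above.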
1. Nested ESXi 6.5 virtual appliance (vGhetto)
2. Nested ESXi LearnSwitch (vGhetto)
3. How to clone a nested ESXi VM (vGhetto)
4. NSX issue on dvs in nested ESXi (telecomOccasionally)
5. Nested Virtualization (Limitless) installing ESX, trunked portgroup
- Original Aug 2018
- Changed VXLAN to VLAN 100, added the VLAN 100 MTU change, and added vmkping tests for VTEP connectivity. Dec 18, 2018
- I have had continued issues with intermittent dropped packets on the vMotion and NFS interfaces, causing slow vMotion and high latency on NFS. This seems to occur only when multiple nested ESXi VMs are running, so I suspect something is wrong with the ESX cloning process in link #3 as of ESX 6.5. I’ve re-created the entire environment with nested ESXi hosts installed from ISO, and all the issues have disappeared. Jan 8, 2019