October 1, 2014 Leave a comment
In FVP version 1.5, PernixData introduced a nice little feature that allows a user to specify the network to use for all FVP peering/replica traffic. This added quite a bit of flexibility in adapting FVP to a wider variety of environments. It can also come in handy when testing performance characteristics of different network speeds, similar to what I did when testing FVP over Infiniband. While the "network configuration" setting is self-explanatory, and ultra-simple, it is ESXi that makes it a little more adventurous.
VMkernels and rules to abide by. …Sort of.
"In theory there is no difference between theory and practice. In practice, there is." — Yogi Berra
Under the simplest of arrangements, FVP will use the vMotion network for its replica traffic. If your vMotion works, then FVP works. FVP will also work in a multi-NIC vMotion arrangement. While it can’t use more than one VMkernel, vMotion certainly can. Properly configured, vMotion will use whatever links are available, leaving more opportunity and bandwidth for FVP’s replica traffic. This can be especially helpful in 1GbE environments. Okay, so far, so good. The problem can become when an ESXi host has multiple VMkernels in the same subnet.
The issues around having multiple VMkernels on a single host in one IP subnet is nothing new. The accepted practice has been to generally stay away from multiple VMkernels in a single subnet, but the lines blur a bit when factoring the VMkernel’s intended purpose.
- In VMware Support Insider Post, it states to only use one VMkernel per IP Subnet. Well, except for iSCSI storage, and vMotion.
- In VMware KB 2007467, it states: "Ensure that both VMkernel interfaces participating in the vMotion have the IP address from the same IP subnet."
The motives for recommending isolation of VMkernels is pretty simple. The VMkernel network stack uses a single routing table to route traffic. Two hosts talking to each other on one subnet with multiple VMkernels may not know what interface to use. The result can be unexpected behavior, and depending on what service is sitting in the same network, even a loss of host connectivity. This behavior can also vary depending on the version of ESXi being used. ESXi 5.0 may act differently than 5.1, and 5.5 changes the game even more with the ability to create Custom TCP/IP stacks per VMkernel adapter.which could give each VMkernel its own routing table.
So what about FVP?
How does any of this relate to FVP? For me, this initial investigation stemmed from some abnormally high latencies I was seeing on my VMs. This is quite the opposite effect I’m used to having with FVP. As it turns out, when FVP was pinned to my vMotion-2 network, it was correctly sending out of the correct interface on my multi-NIC vMotion setup, but the receiving ESXi host was using the wrong target interface (vMotion-1 VMkernel on target host), which caused the latency. Just like other VMkernel behavior, it naturally wanted to always choose the lower vmk number. Configuring FVP to use vMotion-1 network resolved the issue instantly, as vMotion-1 in my case it was using vmk1 instead of vmk5. Many thanks to the support team for noticing the goofy communication path it was taking.
Testing similar behavior with vMotion
While the symptoms showed up in FVP, the cause is an ESXi matter. While not an exact comparison, one can simulate a similar behavior that I was seeing with FVP by doing a little experimenting with vMotion. The experiment simply involves taking an arrangement originally configured for Multi-NIC vMotion, disabling vMotion on the network with the lowest vmk number on both hosts, kicking off a vMotion, and observing the traffic via esxtop. (Warning. Keep this experiment to your lab only).
For the test, two ESXi 5.5 hosts were used, and mult-NIC vMotion was set up in accordance to KB 2007467. One vSwitch. Two VMkernel ports (vMotion-0 & vMotion-1 respectively) in an active/standby arrangement. The uplinks are flopped on the other VMkernel. Below is an example of ESX01:
And both displayed what I’d expect in the routing table.
The tests below will show what the traffic looks like using just one of the vMotion networks, but only where the "vMotion" service is enabled on one of the VMkernel ports.
Test 1: Verify what correct vMotion traffic looks like
First, let’s establish what correct vMotion traffic will look like. This is on a dual NIC vMotion arrangement in which only the network with the lowest numbered vmk is ticked with the "vMotion" service.
The screenshot below is how the traffic looks from the source on ESX01. The green bubble indicates the anticipated/correct VMkernel to be used. Success!
The screenshot below is how traffic looks from the target on ESX02. The green bubble indicates the anticipated/correct VMkernel to be used. Success!
As you can see, the traffic is exactly as expected, with no other traffic occurring on the other VMkernel, vmk2.
Test 2: Verify what incorrect vMotion traffic looks like
Now let’s look at what happens on those same hosts when trying to only the higher numbered vMotion network. The “vMotion” service was changed on both hosts to the other VMkernel, and both hosts were restarted. What is shown below is how the traffic looks on a dual NIC vMotion arrangement in which the network with the lowest numbered vmk has the "vMotion" service unticked, and the higher numbered vMotion network has the service enabled.
The screenshot below is how the traffic looks from the source on ESX01. The green bubble indicates the anticipated/correct VMkernel to be used. The red bubble indicates the VMkernel it is actually using. Uh oh. Notice how there is no traffic coming from vmk2, where it should be coming from? It’s coming from vmk1, exactly like the first test.
The screenshot below is how traffic looks from the target on ESX02. The green bubble indicates the anticipated/correct VMkernel to be used.
As you can see, under the described test arrangement, ESXi can and may use the incorrect VMkernel on the source, when vMotion is disabled on the vMotion network with the lowest VMkernel number, and active on the other vMotion network. It was repeatable with both ESXi 5.0 and ESXi 5.5. The results were consistent in tests with host uplinks connected to the same switch versus two stacked switches. The tests were also consistent using both standard vSwitches and Distributed vSwitches.
The experiment above is just a simple test to better understand how the path from the source to the target can get confused. From my interpretation, it is not unlike that of which is described in Frank Denneman’s post on why a vMotion network may accidently use a Management Network. (His other post on Designing your vMotion Network is also a great read, and applicable to the topic here.) Since FVP can only use one specific VMkernel on each host, I believe I was able to simulate the basics of why ESXi was making it difficult for FVP when pinning the FVP replica traffic on my higher numbered vMotion network in my production network. Knowing this lends itself to the first recommendation below.
A few different ways to configure FVP
After looking at the behavior of all of this, here are a few recommendations on using FVP with your VMkernel ports. Let me be clear that these are my recommendations only.
- Ideally, create an isolated, non-routable network using a dedicated VLAN with a single VMkernel on each host, and assign only FVP to that network. It can live in whatever vSwitch is most suitable for your environment. (The faster the uplinks, the better). This will isolate, and insure the peer traffic is flowing as designed, and will let a multi-NIC vMotion arrangement work by itself. Here is an example of what that might look like:
- If for some reason you can’t do the recommendation above, (maybe you need to wait on getting a new VLAN provisioned by your network team) use a vMotion network, but if it is a multi-NIC vMotion, set FVP to run on the vMotion network with lowest numbered VMkernel. According to, yes, another great post from Frank, this was the default approach for FVP prior to exposing the ability to assign FVP traffic to a specific network.
Remember that if there is ever a need to modify anything related to the VMkernel ports (untick the "vMotion" configuration box, adding or removing VMkernels), be aware that the routing interface (as seen via esxcfg-route -l ) may not change until there is a host restart. You may also find that using esxcfg-route -n to view the host’s arp table handy.
The ability for you to deliver your FVP traffic to it’s East-West peers in the fastest, most reliable way will allow you to make the most of FVP offers. Treat the FVP like a first class citizen in your network, and it will pay off with better performing VMs.
And a special thank you to Erik Bussink and Frank Denneman for confirming my vMotion test results, and my sanity.
Thanks for reading.