My vSphere Home Lab. 2016 edition

Here we go again. I had no intention of writing a follow-up to my "Home Lab 2015 edition" post last year, as I didn’t foresee any changes to the lab in the coming year that would be interesting enough to write about. 

So much for predicting the future.

Sometimes Home Lab environments tend to border on vanity projects. I would like to think the recent changes in my lab were done out of need, but rationalizing wants into needs is common enough to be considered a national pastime. Nevertheless, my profession now has me testing workloads and new technologies on a daily basis, and this was a driving force behind these upgrades. Honest.

Demand often drives change. This is where the evolution of my Home Lab continues to mimic a production environment – just at a smaller scale. Budget, performance, capacity, space, and heat are all elements of a Home Lab design that are almost laughably similar to a production environment. Workloads evolve, and needs grow – quickly rendering previously used design inputs inadequate. That is exactly what happened to me, and I knew I had to invest in a few upgrades.

Compute – Performance/Testing Cluster
It was finally time to replace a few of the oldest components of the lab. My primary hosts, built on Intel Sandy Bridge processors, used motherboards limited to just 32GB of RAM and PCIe 2.0. I didn’t have any 10Gb connectivity without my old InfiniBand gear, and I was consistently pushing the CPUs to their limit.

I decided to go with a pair of SuperMicro 5018D-FN4T rack mounted units. They have an incredibly small 1U form factor featuring built-in dual 10GbE and dual 1GbE interfaces, a dedicated IPMI port, a PCIe 3.0 slot, and 4 drive bays, and they can pack in up to 128GB of DDR4 memory. The motherboard uses the soldered-on 8 core Xeon D-1540 chip, and the power supply is built into the chassis. Both items reduce flexibility, but improve the no-brainer simplicity of the unit. What is most surprising when you get your hands on them is that they are incredibly small, yet still half empty when the case is cracked open. A third host will probably be in the works at some point, but it’s not necessary at this time.


It probably will come as no surprise that multiple PernixData FVP based acceleration tiers are an integral component of my infrastructure, so a few changes occurred in that realm.

1. Adding NVMe cards to use as a Flash based acceleration tier for FVP. For this lab arrangement, I used the Intel 750 NVMe based PCIe 3.0 card. While they are not officially on the VMware HCL, they are fine for the Home Lab, as they borrow heavily from the Intel DC P3xxx line of NVMe cards that are on the VMware HCL. Intel NVMe cards are outstanding performers, and they enjoy the benefit of completely bypassing legacy elements of the traditional storage stack on a host, such as storage controllers and SCSI commands. An NVMe based Flash device is still limited by the physics of NAND Flash, but it is an incredible performer that can make any SSD based Flash drive look quite feeble in comparison. Just make sure to use Intel’s driver for vSphere. (A quick sketch for verifying how ESXi sees these devices follows this list.)

2. More RAM to use as a DFTM acceleration tier in FVP. I installed 64GB of Micron memory, which allows me to allocate a nice chunk of RAM for FVP acceleration. The beauty of using memory as an acceleration tier is that it avoids all of the characteristics of NAND Flash, and that it can leverage compression techniques. This typically increases the effective tier size by between 30% and 70%, depending on the workload. The larger the tier size, the more content that can live in the tier, and the less eviction that occurs against the working set of data. (A quick back-of-the-napkin example follows this list.)
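
To put rough numbers on that compression benefit, here is a back-of-the-napkin sketch. The 16GB allocation and the exact gains are illustrative only; 30% and 70% are simply the ends of the range quoted above.

```python
# Back-of-the-napkin sketch: effective DFTM tier size with compression.
# The 16 GB allocation is a made-up example; 30% and 70% are the gains quoted above.
tier_gb = 16
for gain in (0.30, 0.70):
    print(f"{tier_gb} GB tier at {gain:.0%} compression gain ≈ {tier_gb * (1 + gain):.0f} GB effective")
# 16 GB tier at 30% compression gain ≈ 21 GB effective
# 16 GB tier at 70% compression gain ≈ 27 GB effective
```

As for the NVMe cards, a quick way to verify that ESXi sees a device (and flags it as flash) before handing it to FVP is to walk each host’s storage device list with pyVmomi. This is only a sketch; the vCenter address and credentials are placeholders, and it assumes pyVmomi is installed.

```python
# Sketch: list each host's storage devices and whether ESXi reports them as SSD/flash.
# The vCenter address and credentials below are placeholders, not the real environment.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()              # lab only: skip certificate validation
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        print(host.name)
        for lun in host.config.storageDevice.scsiLun:
            # the 'ssd' flag only exists on disk objects, hence the getattr guard
            print(f"  {lun.displayName}  ssd={getattr(lun, 'ssd', False)}")
finally:
    Disconnect(si)
```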

Compute – Management Cluster
A management cluster in a Home Lab is great. It has allowed me to really experiment with testing workloads and new technologies without any impact to the components that run the infrastructure. My Management Cluster now comprises three Intel NUCs. I would have been perfectly happy with just a couple of NUCs as a Management Cluster, but unfortunately the 16GB RAM limitation makes that a bit tough. Eventually, the NUCs will outlive their usefulness in the lab, but the great part about them is that they can easily be repurposed as a desktop workstation or media server. For now, they will continue to serve their purpose as a Management Cluster.

Switching
Upgrading my network meant adding 10GbE connectivity. For this, I chose a Netgear XS708E, 8 port, 10GbE switch. This would serve as a fast interconnect for east-west traffic between hosts. My adventures with InfiniBand were always interesting and educational. It’s an amazing technology, but there was just too much administrative overhead to the gear I was using. Unfortunately, there are not too many small, affordable 10GbE switches out there. The Dell 12 port X4012 10GbE switch looked really appealing based on the specs, but the ports are SFP+, so that would have meant rethinking a number of things. As for the Netgear, what do I think of it?  After configuring the product, I’m convinced the folks at Netgear wanted to punish anyone who buys the unit. All of the configuration items that should be so basic in a CLI or web based UI are obfuscated in a proprietary interface that seems to be missing half of the options you’d expect. Dear Netgear, please let me configure LAGs, trunks, MTU size, and VLANs with something remotely resembling common sense. It does work, but if I could do it over, I’d choose something else.

My network core still consists of a Cisco SG300-20 Layer 3 switch. Moving from hosts that had six 1GbE ports down to hosts with just two 1GbE ports and two 10GbE ports meant that I was able to free up space on this switch. It still carries a bit of a premium price for a 20 port L3 switch, but it has been a rock solid component of my lab for over 4 years now.

Ancillary Components
One thing I was tired of dealing with was my wireless gateway. I’ve grown sour on the consumer grade WiFi/router solutions available. Most aren’t stable, and lack features unless you crack them open with a DD-WRT build. Memory leaks and other reboot inducing behaviors are not what you want to deal with when attempting to access the lab remotely, so it was time to take a new approach. I went with the following for my gateway and wireless needs.

Motorola SB6121 DOCSIS 3.0 Cable Modem. This was purchased to replace the oversized cable modem provided by the service provider. It’s small, affordable, and prevents the cable company from changing settings on me, as they often would with their own unit.

Ubiquiti EdgeRouter PoE. This 5 port unit serves as my gateway, where one leg feeds downstream to my core switch, and another leg is used as a DMZ for my WiFi. This is a great unit that offers everything I was looking for: trunking, static routes, NAT, and firewalling. The multiple PoE ports make it easy to add new wireless access points.

Ubiquiti UniFi AP Wireless Access Point. These access points pair nicely with the PoE based router above.

It’s been a rock solid, winning combination. Always on, with no random need to reboot. Total control over configuration, and no silliness from the cable provider. Mission accomplished.

Storage
This was one of the few components that didn’t change. Storage is served up by two 5-bay Synology units with a mix of SSDs and spinning disk. I had plenty of capacity, with enough options to test various media if needed.

Mounting
Until this latest refresh, a $25 utility rack had housed the assortment of oddly shaped lab gear pretty well. With the changeover to small 1U rackmount servers and additional switchgear, it was time for an official enclosure. I went with a Tripp Lite 9U Wall Mount Cabinet. It will eventually be wall mounted, but for the time being, sits perfectly on a $12 moving dolly from Harbor Freight. The cabinet has some nice mounting ports for supplementary exhaust fans should the need arise.

Relocation
Within the first few minutes of powering up the new hosts, I realized the arrangement was going to need a new home. Server room loud? No. But moving from 38dB to 50+dB is loud enough that you wouldn’t want to be working next to it all day. There is no way 1U fans spinning at 8,000 RPM will ever be soothing. I had been quite proud of how quiet my lab gear had been up until this point. I stayed away from 1U anything, and went with quiet fans wherever I could. I tried desperately to suppress the noise, replacing all of the fans with ultra-quiet Noctua fans. Unfortunately, ultra-quiet can also mean they don’t move much air; it’s not wise to disregard the delta in CFM between fans. The heat alarms made it very clear this wasn’t going to work, and I didn’t want to burn up perfectly good gear. I placed all of the factory fans back in the 1U servers and the 10GbE switch, and used the Noctua fans as supplementary fans in each device. They do help the primary fans spin at a lower rate, so the effort wasn’t a total waste. The 9U cabinet will eventually be relocated to a more permanent location, but for the time being, it’s making a coat closet nice and warm.
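
For a sense of why that 12dB jump feels so dramatic, remember that decibels are logarithmic. A quick bit of arithmetic, using the common rule of thumb that every 10dB roughly doubles perceived loudness:

```python
# Why going from 38 dB to 50 dB is a big deal: the scale is logarithmic.
before_db, after_db = 38, 50
power_ratio = 10 ** ((after_db - before_db) / 10)    # ~16x the sound power
loudness_ratio = 2 ** ((after_db - before_db) / 10)  # ~2.3x perceived loudness (rule of thumb)
print(f"~{power_ratio:.0f}x the sound power, roughly {loudness_ratio:.1f}x as loud to the ear")
```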

What it looks like
The entire lab, including the UPS, is now self-contained, which should make its final relocation straightforward. The entire arrangement (5 hosts, 2 switches, 2 Synology NAS units, etc.) draws between 250 and 300 watts depending upon the load. Considering the old, much less capable arrangement ran at about 200 watts, I was pretty happy with the result.


In the spirit of full disclosure, the cabinet door does cover up some rather careless cable management practices. Regardless, I am thrilled with the end result and how it performs. A space efficient arrangement that is extremely powerful.

No matter how little, or how much you decide to invest in a Home Lab, I’ve learned that the satisfaction seems to be directly proportional to how much value it brings to you. Whether it be a hobby, used for professional growth, or a part of your day-to-day job duties, any sense of buyer’s remorse only seems to creep in when it’s not used. For my circumstances, that doesn’t seem to be a problem.

Your Intel NUC Home Lab questions answered

With my recent post on what’s currently running in my vSphere Home Lab, I received a number of questions about one particular part of the lab: my Management Cluster built with Intel NUCs. So here is a quick compilation of those questions (and answers) on the topic.

Why did you go with a NUC?  There are cheaper options.
My approach for a Home Lab Management Cluster was a bit different than my regular Lab Cluster. I wanted to take a minimalist approach, and provide just enough resources to get my primary VMs off of my other two hosts that I do a majority of my testing against. In other words, less is more. There is a bit of a price premium with a NUC, but there also is a distinct payoff with them that often gets overlooked. If they do not keep up with your needs in the Home Lab, they can be easily repurposed as a workstation or a media PC. The same can’t be said for most Home Lab gear.

Is there anything special you have to do to run ESXi on a NUC?
Nothing terribly difficult. Getting ESXi onto the NUC is relatively straightforward, and there is a growing number of posts that walk through the process nicely. The primary steps are:

  1. Build a customized ISO by packing in an Intel NIC driver and a SATA controller driver, and place it on a bootable USB drive.
  2. Temporarily disable AHCI in the BIOS for the installation process only.
  3. Install ESXi.
  4. Re-enable AHCI in the BIOS after the installation of ESXi is complete.

How many cores?
The 3rd generation NUC is built around the Intel Core i5-4250U (Haswell) CPU. It has two physical cores, and will present 4 CPUs with Hyper-Threading. After managing and watching real workloads for years, my position on Hyper-Threading is a bit more conservative than that of many others. It is certainly better than nothing, but many times the effective performance gain is limited, and varies with workload characteristics. Its primary benefit with the NUC is that you can run a 4vCPU VM if you need to. Utilization of the CPU from a cluster perspective is often hovering below 10%.

Is working with 16GB of RAM painful?
Having just 16GB of RAM might be a more visible pain if it were serving something other than Management VMs. The biggest issue with a "skinny" two node Management Cluster usually comes up when you have to throw one host into maintenance mode. But much like having a single switch in a Home Lab, you just deal with it. Below is what the memory usage on these NUCs looks like.

(Image: memory usage on the NUC hosts)
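
If you would rather pull these numbers programmatically than eyeball the vCenter charts, the host quick stats are easy to grab with pyVmomi. A minimal sketch, assuming a hypothetical vCenter address and credentials:

```python
# Sketch: report per-host memory and CPU utilization from vCenter quick stats.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        hw, qs = host.summary.hardware, host.summary.quickStats
        mem_pct = 100.0 * qs.overallMemoryUsage * 1024**2 / hw.memorySize    # overallMemoryUsage is in MB
        cpu_pct = 100.0 * qs.overallCpuUsage / (hw.cpuMhz * hw.numCpuCores)  # overallCpuUsage is in MHz
        print(f"{host.name}: memory {mem_pct:.0f}%  cpu {cpu_pct:.0f}%")
finally:
    Disconnect(si)
```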

There are a few options to improve this situation.

1. Trimming up some of your VMs might be a good start. Virtual appliances like the VCSA are built with a healthy chunk of RAM configured by default (supporting all of that Java goodness). Here is a way to trim up memory resources on the VCSA, although I have not done this yet because I haven’t needed to. Just don’t use the Active Memory metric as your sole data point when trimming a VM’s memory configuration. See Observations with the Active Memory metric in vSphere on how easily that metric can be misinterpreted.

2. Look into a new option for increasing RAM density on the NUC. Yeah, this might blow your budget, but if you really want 32GB of RAM in a NUC, you can do it. At this time, the cost delta for making a modification like this is pretty steep, and it may make more sense to purchase a 3rd NUC for a Management Cluster.

3. Adjust expectations and accept the fact that you might have a little memory ballooning or swapping going on. This is by far the easiest, and most affordable way to go.

How is ESXi on a single NIC?
Well, there is no getting around the fact that the NUC comes with a single 1GbE NIC. This means no redundancy, and limited bandwidth. The good news is that with just one NIC, you can monitor this quite easily in vCenter!   Since you are running all services and data across a single uplink, it may be in your best interest to run a Virtual Distributed Switch (VDS) to properly control ingress and egress traffic, and make sure something like a vMotion isn’t going to wreak havoc on your environment. However, transitioning a vCenter VM to a VDS with a single uplink can sometimes be a little adventurous, so you might want to plan ahead.

If you must have a 2nd NIC to the host, take a look here. Nicholas Farmer showed quite a bit of ingenuity in coming up with a second uplink. Also, don’t forget to look at his great mini-rack he made for the NUCs out of Legos. Great stuff.

How do they perform?
Exactly the way I want them to perform. Out of sight, and out of mind.  Again, my primary lab work is performed on my Micro-ATX style hosts, so as long as the NUCs can keep the various Management and infrastructure VMs running, then that is good with me. Some VMs are easy to trim up to provide minimal resources (Linux based syslog servers, DNS, etc.) while others are more difficult or not worth the hassle.

Why did you put two SSDs in them?
This was for flexibility. I wanted one (mSATA) drive for the possibility of local storage if I decided to place any VMs locally, as well as another device (2.5" SSD) for other uses. In this case, I decided to apply a little PernixData FVP magic and use one of them in each host to accelerate the VMs. The image below shows the latency of the VCSA, which has about 99% writes. Note how the latency dropped to a consistently low level after transitioning the VM from Write-Through (read caching) to Write-Back (read and write caching). Not bad considering all traffic is riding across a single link, and the flash device is an old SSD.

(Image: VCSA write latency before and after switching to Write-Back)

Would you recommend them?
I think the Intel NUCs serve as a great little host for a Management Cluster in a Home Lab. They won’t be replacing my Micro-ATX style boxes any time soon, nor should they ever be part of a real environment, but they allow me the freedom to experiment and test on the primary part of the Lab, which is what a Home Lab is all about.

Thanks for reading.

– Pete

My vSphere Home Lab. 2015 edition

The vSphere Home Lab. For some, it is a tool for learning. For others it is a hobby. And for the rest of us, it is a weird addiction rationalized as one of the first two reasons. Home Labs come in all shapes and sizes, and there really is no right or wrong way to create one. Apparently interest in vSphere Home Labs hasn’t waned, as there are now countless resources available online illustrating various designs. At one of our recent Seattle VMUG meetings, we gave a presentation on Home Lab arrangements and ideas. There was great interaction from the audience, and we received several comments afterward on how much they enjoyed the discussion and learning about what others were doing. If you are a VMUG leader and are looking for ideas for presentations, I’d recommend this topic at one of your own local meetings.

Much like a real Data Center, Home Labs are a continual work in progress. Shiny new gear often sits by the warts. Replacing the old equipment with new gear usually correlates to how much time and money you wish to dedicate to the effort. I marvel at some setups by others in the industry. A few of the more recent ones to keep an eye on are the work Erik Bussink does with his high speed networking, and the cool setup Jason Langer has with his half height cabinet and rack mounted hosts, all on 10GbE. Pretty funny considering how many companies are still running 3 hosts with 1GbE networking.

In my conversations with others in the community, I realized I didn’t have a post I could direct someone to when they would ask what I used in my own environment. Well, let me lay it out for you, as of February, 2015.

Compute
Primary vSphere cluster
2 hosts currently make up this cluster, and consist of the following:

  • Lian Li PC-V351B chassis paired with a Scythe SY 1225SL 12L 120mm case fan.
  • SuperMicro MBD-X9SCM-F-O LGA motherboard with IPMI (a must!)
  • Intel E3-1230 Sandy Bridge 3.2GHz CPU (single socket, 4 physical cores)
  • 32GB RAM
  • Seasonic X series SS-400FL power supply
  • Qty 3, Intel E1G42ETBLK dual port NICs
  • Mellanox MT25418 DDR 2 port InfiniBand HCA (10Gb per connection)
  • 8GB USB drive (boot)
  • 2TB SATA disk for local storage (testing)
  • Qty 2, SATA based SSDs (varies with testing)

Management Cluster
At this time, a single host makes up this cluster, but I intend to add a second unit.

  • Intel NUC BOXD54250WYKH1 Intel Core i5-4250U
  • Intel 530 240GB mSATA SSD
  • Crucial 16GB Kit (2x8GB)
  • Extra drive bay (for additional 2.5" SSD if needed)

The ATX style hosts have served quite well over the last 2 1/2 years. They are starting to show their age, but are quiet, and power efficient (read: low heat). Unfortunately they max out at just 32GB of RAM, which gets eaten up pretty quickly these days. The chassis started out very empty at first, but as I started to add SSDs and spinning disks for additional testing, InfiniBand cards, along with the occasional PCIe flash card or storage controller, I don’t have much room to spare anymore.

The Intel NUC is an interesting solution. In a vSphere Home Lab, the biggest constraints are that they are limited to 16GB of RAM and a single 1GbE NIC. Since these units will serve as my Management Cluster, that should be fine, and it allows me to be more destructive on the primary two host cluster. They also fit into the small server rack quite nicely. I prefer the D54250WYKH over the traditional Intel D54250WYK model. It’s slightly thicker, but allows for an additional internal 2.5" drive. This offers a lot of flexibility if you want to keep some VMs on local storage, or possibly do some limited testing with host based caching. If they ever become too underpowered, they will always find use as a media server or workstation.

Network
Most of my networking needs flow through a Cisco SG300-20. This is a feature rich, Layer 3 switch that I’ve written about in the past (Using the Cisco SG300-20 Layer 3 switch in a home lab). I’ve used up all 20 ports, and really need another one. However, with the advent of other good Layer 3 switches out there, and with the possibility of eventually moving my lab to 10GbE, I’ve been making do with what I have.

As noted in my post Testing InfiniBand in the home lab with PernixData FVP, I introduced InfiniBand as a relatively affordable way to test high speed interconnects between hosts. I avoid the need for an InfiniBand switch because, with only two hosts, I can simply connect them directly. They are only passing vMotion and PernixData FVP traffic, so there is no need to worry about routing. Adding a 3rd or 4th host gets complex, as I’d have to take the plunge and invest in an IB switch (loud, and not cheap).

Storage
Persistent storage comes from a Synology DS1512+ and a Synology DS1514+ NAS unit. Both are 5 bay units, and have a mix of spinning disk and SSDs. The primary difference is that the DS1514+ has four 1GbE ports, versus two on the older DS1512+. One unit is used for housing the majority of my lab VMs and non-lab based file storage, while the other is used for experimentation and performance testing. Realistically I only need one Synology unit, but I was able to pick up the newer model for a great price, and I couldn’t refuse. My plan is to split out lab duties and general storage needs to the separate units.

Synology seems to have won the battle for storage in home labs. Those who own them know that while they are a little pricey, they are well worth it, and offer so many other benefits beyond just serving up block or file storage for a vSphere cluster.

Battery Backup
My luck with UPS units in the home has not been anything to brag about. It’s usually a case of looking like they work until you really need them. So far the best luck I’ve had is with the unit I’m currently using. It is a CyberPower 1500AVR. With the entire lab drawing around 200 watts, this means that there is only about a 25% load on the UPS.
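
As a quick sanity check on that load figure: a 1500VA consumer unit is typically rated somewhere around 900 watts, so the math works out roughly like this (the 900W rating is my assumption, not the published spec for this exact model).

```python
# Rough UPS load estimate. The 900 W rating is an assumed figure for a typical 1500 VA unit.
lab_draw_watts = 200
ups_rating_watts = 900
print(f"Load ≈ {100 * lab_draw_watts / ups_rating_watts:.0f}%")   # ≈ 22%, in line with "about 25%"
```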

Server rack
A two shelf wire utility rack from Lowe’s fit the bill quite nicely. It is small, affordable ($25), and houses the goofy form factor of the Lian Li ATX chassis reasonably well. The only problem is that if I add another ATX style host, I may have to come up with a better rack solution.

Workstation
While I had a good lab environment to test with, up until a few months ago the workstation sitting next to the lab was old, tired, and no longer functional. I found myself not even using it. So I replaced it with an Intel NUC as well. There is a bit of a price premium when buying the NUC, but the form factor, performance, simplicity, and power consumption all make it a no-brainer in my book. The limitations it has as a vSphere host (single NIC, and 16GB of RAM max) are not an issue when used as a workstation. It performs great, and powers a dual monitor setup really well.

What it looks like
Standing at just 35” high, you can see that it is pretty self contained.

(Images: front and back views of the rack)

The Home Lab Road Map / Wish List
When you have a Home Lab, you have plenty of time to think about what you want next. The "what you have" is never quite the same as the "what you want." So here is the path I’ll probably be taking:

  • A second Intel NUC to serve as a 2 node Management Cluster. (Done. See here.)
  • 10GbE switch. My primary hesitation on this is cost, and noise.
  • New hosts. I’m tempted to go the route of a 2U rack mounted chassis so that I can grow to three or four hosts more efficiently. With SuperMicro offering some motherboards with a built-in 10GbE port, that is pretty enticing.
  • New gateway. As the lab grows more sophisticated, the network topology looks more and more like a small production environment. That is why a proper router/firewall is on this wish list.
  • New wireless AP. Not technically part of the Home Lab, but it plays an important role for obvious reasons. I need a wireless AP that is not prone to memory leaks and manual reboots every three or four days.
  • Affordable PCIe based flash. It is really making inroads in the enterprise, but it’s still not affordable enough for the home. I hope this changes, as PCIe avoids so many headaches compared to flash that runs through a traditional storage controller.

Lessons learned over the years
A few takeaways have come from spending many hours working with my Home Lab.  These reflect personal preferences more than anything, but they might save you some effort along the way as well.

1. The best Home Lab is the one that you use. For quite some time, I used a nested lab on a burly laptop, in addition to the physical setup. Ultimately the physical Home Lab won out because it fit more of what I wanted to test and work on. If my interests were more focused on scripting or workflow automation, perhaps a nested lab would be fine. But I’m a bit too much of a gear-head, and my job now focuses on performance on top of real hardware. I also didn’t care to power up and power down the entire nested lab each time I wanted to work on the laptop.

2.  While "lab" implies all things experimental, it is common to have a desire for some services to be running all the time.  Perhaps your lab has some responsibilities as a media server.  Or in my case, it also runs my Horizon View environment that I use for remote access.  This makes the idea of tearing down a lab on a whim a bit more complex.  It’s where a Management cluster can come in handy. Having it physically segregated helps to keep things operational when you want to do a complete rebuild, or experiment with a beta version of vSphere.

3.  I stay away from the cheap SSDs.  They have no place in a real Data Center, and aren’t much better in the home.  When it comes to flash, you get what you pay for.  And sometimes, even when you pay, you still don’t get good performing SSDs.  Spend your money wisely.  Buying something multiple times over doesn’t save much money in the end.  And remember, controllers matter too.

4. Initially I wanted to configure an arrangement that consumed as little power as possible. Keeping the power down means keeping the heat generated down, and thus the noise. Since my entire lab sits just an arm’s length from where I work, this was important in the beginning, and is important now. The entire setup draws about 200 watts of power and makes 38dB of noise 3 feet away. I’ve refused to add anything loud or hot, and if I’m forced to, the lab will have to be relocated to a new area.

5.  There is always a way to do things a little cheaper.  But consider what your time is worth, and remember the reason why you have a Home Lab in the first place.  That has driven several of my purchasing decisions, and helps remove some of the petty obstacles that can sidetrack the best of us from working on what we intended to.

6. While some technologies and practices trickle down from production environments to the Home Lab, sometimes the opposite happens. Two good examples of this might be the use of the VCSA (vSphere 5.5 or later), and letting ESXi run on a USB or MicroSD card. And that is the beauty of a lab. It invites experimentation, and filters out what looks good on paper versus what actually works. Keep an open mind, and use it for what it is good for: making mistakes, and learning.

Thanks for reading

– Pete

 

Testing InfiniBand in the home lab with PernixData FVP

One of the reasons I find the latest trends in datacenter architectures so interesting is the innovative approaches used to address deficiencies associated with more traditional arrangements. These innovations have been able to drive more of what almost everyone needs; better storage performance and better scalability.

The caveat to some of these newer arrangements is that it can put heavy stress on the plumbing that connects these servers. Distributed storage technologies like VMware VSAN, or clustered write buffering techniques used by PernixData FVP and Atlantis Computing’s USX leverage these interconnects to accelerate storage traffic. Turn-key Hyperconverged solutions do too, but they enjoy the luxury of having full control over the hardware used. Some of these software based solutions might need some retrofitting of an environment to run optimally or meet their requirements (read: 10GbE or better). The desire for the fastest interconnect possible between hosts doesn’t always align with budget or technical constraints, so it makes most sense to first see what impact there really is.

I wanted to test the impact better bandwidth would have between servers a bit more, but due to constraints in my production environment, I needed to rely on my home lab. As much as I wanted to throw 10GbE NICs in my home lab, the price points were too high. I had to do it another way. Enter InfiniBand. I’m certainly not the only one to try InfiniBand in a home lab, but I wanted to focus on two elements that are critical to the effectiveness of replica traffic: the overall bandwidth of the pipe, and equally important, the latency. While I couldn’t simulate an exact workload that I see in my production environment, I could certainly take smaller snippets of I/O patterns that I see, and model them the best I can.

InfiniBand is really interesting. As Joeb Jackson put it in a NetworkWorld.com article, "InfiniBand is architecturally sacrilegious" as it combines many layers of the OSI model. The results can be stunning. Transport latencies in the 2 microsecond neighborhood, and a healthy roadmap to 200Gbps and beyond. It’s sort of like the ’66 AC Shelby Cobra of data transports. Simple, and perhaps a little rough around the edges, but brutally fast. While it doesn’t have the ubiquity of RJ/Ethernet, it also doesn’t have the latencies that are still a part of those faster forms of Ethernet.

At the time of this writing, the InfiniBand drivers for ESXi 5.5 weren’t quite ready for VSAN testing yet, so the focus of this testing is to see how InfiniBand behaves when used in a PernixData FVP deployment. I hope to publish a VSAN edition in the future. I simply wanted to better understand if (and how much) a faster connection would improve the transmission of replica traffic when using FVP in WB+1 mode (local flash, and 1 peer). My production environment is very write intensive, and uses 1GbE for the interconnects. Any insight gained here will help in my design and purchasing roadmap for my production environment.
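
Before getting into the results, it helps to have a crude mental model of what a faster interconnect can and cannot buy you. For the replica leg of a WB+1 write, think of the cost as transfer time (I/O size divided by usable link bandwidth) plus the link’s base latency. The bandwidth and latency figures in the sketch below are rough assumptions for 1GbE and DDR InfiniBand, not measurements from this lab:

```python
# Crude model of the replica hop for a single WB+1 write: size/bandwidth + link latency.
# Bandwidth and latency figures are rough assumptions for illustration, not measured values.
LINKS = {
    "1GbE":    {"MBps": 110.0, "lat_us": 60.0},  # usable 1GbE throughput, typical small-packet latency
    "10Gb IB": {"MBps": 900.0, "lat_us": 5.0},   # IPoIB on DDR InfiniBand, optimistic
}

for size_kb in (4, 32, 256):
    parts = [f"{size_kb:>3} KB write:"]
    for name, link in LINKS.items():
        xfer_us = (size_kb / 1024.0) / link["MBps"] * 1_000_000 + link["lat_us"]
        parts.append(f"{name} ~{xfer_us:,.0f} us")
    print("  ".join(parts))
```

Under this model a 256KB replica write spends over 2ms on a 1GbE wire, while a 4KB write is cheap on either link, which lines up nicely with the observations further down.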

Testing:
Testing occurred on a two host cluster backed by a Synology DS1512+. Local flash leveraged SATA III based EMLC SSD drives using an onboard controller. 1GbE interconnects traversed a Cisco SG300-20 using a 1500 byte MTU size. For InfiniBand, each host used a Mellanox MT25418 DDR 2 port HCA that offered 10Gb per connection. They were directly connected to each other, and used a 2044 byte MTU size. InfiniBand can be set to 4092 bytes but for compatibility reasons under ESXi 5.5, 2044 is the desired size.

I tend to prefer testing that relies on observational patterns versus one final, empirical number. These tests were no different, and while they attempt to simulate a very brief snippet of a workload in my production environment, I find that I still gain a much better understanding from a time based performance graph than an insulated final number.

The test case was a simple one, but would be enough to illustrate the differences I was hoping to see. The test consisted of a 2vCPU VM using 2 workers on a 100% write, 100% random workload lasting for 1 minute. The test was run three times: first with WB+0 (no peer/replica traffic), then WB+1 (one peer) using a 1GbE connection, and finally WB+1 over a single 10Gb InfiniBand connection. Each screen capture I provide will show them in that order. That test case was repeated 3 times: first with 256KB I/O sizes, followed by 32KB, then 4KB. I ran the tests several times in different order to ensure I wasn’t introducing inflated or deflated performance due to previous tests or caching. All were repeated several times to flush out any anomalies.
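
For reference, the shape of that workload is simple enough to approximate with a few lines of Python inside a test VM. This is not the tool or configuration used for the runs below; it is a single-threaded stand-in, and the target path, working set size, and fsync-per-write behavior are my own assumptions:

```python
# Rough stand-in for the test workload: 100% random, 100% write, fixed I/O size, 60 seconds.
import os, random, time

PATH = "io_test.bin"        # hypothetical target file on the datastore under test
FILE_SIZE = 4 * 1024**3     # 4 GiB working set (assumed)
IO_SIZE = 256 * 1024        # 256 KB writes; swap in 32 KB or 4 KB for the other runs
DURATION = 60               # seconds

buf = os.urandom(IO_SIZE)
blocks = FILE_SIZE // IO_SIZE
fd = os.open(PATH, os.O_WRONLY | os.O_CREAT)
os.ftruncate(fd, FILE_SIZE)

writes, start = 0, time.time()
while time.time() - start < DURATION:
    os.lseek(fd, random.randrange(blocks) * IO_SIZE, os.SEEK_SET)
    os.write(fd, buf)
    os.fsync(fd)            # force each write to storage so the page cache doesn't absorb it
    writes += 1
os.close(fd)

elapsed = time.time() - start
print(f"~{writes / elapsed:.0f} IOPS, ~{writes * IO_SIZE / elapsed / 1024**2:.1f} MB/s")
```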


256KB I/O size test
Testing results using this I/O size are rarely published anywhere, because they never bode well in comparison to a smaller I/O size like 4KB. But my production workloads (compiling) often deal with these I/O sizes, so it is important for me to understand their behavior.

(Images: IOPS, latency, and throughput with 256KB I/O)

Observations from 256KB I/O test
Note that the IOPS and effective throughput on the WB+1 run using InfiniBand were nearly identical to those of the WB+0 (local flash only) scenario. You can also see how much the 1GbE interface throttled down the performance, driving just half of the IOPS and throughput compared to InfiniBand. But also take a look at the terrible native latency (70ms) of large I/O sizes, even when using WB+0 (no peer traffic, just local flash). Also note that when peer traffic performance improves, a larger backlog of data builds up in the destager.

32KB I/O size test
Just 1/8th the size of a 256KB I/O, this is still larger than most storage vendors like to advertise in their testing. My production workload often oscillates between 32KB and 256KB I/Os.

(Images: IOPS, latency, and throughput with 32KB I/O)

Observations from 32KB I/O test
Once again, the IOPS and effective throughput on the WB+1 run using InfiniBand were nearly identical to those of the WB+0 (local flash only) scenario. You can also see how much the 1GbE interface throttled down the throughput. Latency saw only a minor improvement moving away from 1GbE, as the latency of the flash was about 6ms.

4KB I/O size test
This is the most common I/O size you might see, although it is more common on reads than writes. At 1/64th the size of a 256KB I/O, it is tiny compared to the others, but important to test in order to learn if, and how much, a fatter, lower latency pipe helps across various I/O sizes.

(Images: IOPS, latency, and throughput with 4KB I/O)

Observations from 4KB I/O test
IOPS and effective throughput on the WB+1 run using InfiniBand were nearly identical to those of the WB+0 (local flash only) scenario. But as the I/O sizes shrink, so does the effective total/concurrent payload size, so the differences between InfiniBand and 1GbE were smaller than on tests with larger I/O. Latencies at this I/O size were around 2ms.

Other observations that stood out
One of the first things that stood out is illustrated below, with two 5 minute test runs. Look at where the two arrows point. The arrow on the left points to the number of packets sent while using 1GbE. The arrow on the right shows the number of packets sent while using 10Gb InfiniBand. Quite a difference. Also notice that the effective throughput started out higher, but had to throttle back.

(Image: packets transmitted, 1GbE vs. 10Gb InfiniBand)

Findings:
The key takeaways from these tests:

  • A high bandwidth, low latency interconnect like InfiniBand can virtually eliminate any write redundancy penalty incurred in WB+1 mode.
  • From a single workload, I/O sizes of 32KB and 256KB saw between 65% and 90% improvement in IOPS and throughput. I/O sizes of 4KB saw essentially no improvement (many concurrent 4KB workloads likely would see a benefit, however).
  • Writes using larger I/O sizes were the clear beneficiary of a fatter pipe between servers. However, the native latencies of the flash devices under larger I/O sizes could not take advantage of the low latencies of InfiniBand. In other words, with large I/O sizes, the flash devices themselves, or the bus they were using, were by far the major impediment to lower latency and faster I/O delivery.
  • The smaller pipe of 1GbE throttled back the flash device’s ability to ingest the data as fast as InfiniBand. There was always a smaller amount of outstanding writes once the test was complete, but it came at the cost of poorer performance for 1GbE.

A few other matters can come up when attempting to accurately interpret latencies. As VMware KB 2036863 points out, reporting latencies accurately can sometimes be a challenge. Just something to be aware of.

Conclusion
InfiniBand was my affordable way to test how a faster interconnect would improve the abilities of FVP to accelerate replica storage I/O.  It lived up to the promise of high bandwidth with low latency. However, effective latencies were ultimately crippled by the SSDs, the controller, or the bus it was using. I did not have the opportunity to test other flash technologies such as PCIe based solutions from Fusion-IO or Virident, or the memory channel based solution from Diablo Technologies. But based on the above, it seems to be clear that how the flash is able to ingest the data is crucial to the overall performance of whatever solution that is using it.

Helpful Links
Erik Bussink’s great post on using InfiniBand with vSphere 5.5
http://www.bussink.ch/?p=1306 

Vladan Seget’s post on incorporating InfiniBand into his backing storage
http://www.vladan.fr/homelab-storage-network-speedup/

Mellanox, OFED and OpenSM bundles
https://my.vmware.com/web/vmware/details/dt_esxi50_mellanox_connectx/dHRAYnRqdEBiZHAlZA==
http://www.mellanox.com/downloads/Drivers/MLNX-OFED-ESX-1.8.1.0.zip
http://files.hypervisor.fr/zip/ib-opensm-3.3.16-64.x86_64.vib

Using the Cisco SG300-20 Layer 3 switch in a home lab

One of the goals when building up my home lab a few years ago was to emulate a simple production environment that would give me a good platform to learn and experiment with. I’m a big fan of nested labs, and use one on my laptop often. But there are times when you need real hardware to interact with. This has come up even more than I expected, as recent trends with leveraging flash on the host have resulted in me stuffing more equipment back in the hosts for testing and product evaluations.

Networking is the other area that can be helpful to have equipment that at least tries to mimic what you’d see in a production environment. Yet the options for networking in a home lab have typically been limited for a variety of reasons.

  • The real equipment is far too expensive, or too loud for most home lab needs.
  • Searching on eBay or Craigslist for a retired production unit can be risky. Some might opt for this strategy, but it can result in a power sucking 1U noise maker that has some dead ports on it, or worse, arrives bricked.
  • Consumer switches can be disappointing. Rig up a consumer switch that is lacking in features and port count, and you’ll be left wishing you hadn’t gone this route.

I wanted a fanless, full Layer 3 managed switch with a feature set similar to what you might find on an enterprise grade switch, but not at an enterprise grade price. I chose to go with a Cisco SG300-20. This is a 20 port, 1GbE, Layer 3 switch. With no fans, the unit draws as little as 10 watts.
