Fixing host connection issues on Dell servers in vSphere 5.x

I had a conversation recently with a few colleagues at the Dell Enterprise Forum, and as they described the symptoms they were having with some Dell servers in their vSphere cluster, it sounded vaguely similar to what I had experienced with my new M620 hosts running vSphere 5.0 Update 2.  While I’m uncertain if their issues were related in any way to mine, it occurred to me that I might not be the only one out there who ran into this problem.  So I thought I’d provide a post to help anyone else experiencing the behavior I encountered.

Symptoms
The new cluster of Dell M620 blades running vSphere 5.0 U2, serving as our Development Team’s code-compiling cluster, was randomly dropping connections.  Yep, not good.  This wasn’t normal behavior of course, and the effects ranged anywhere from a host still being up (but acting oddly) to complete isolation of the host, with no success at a soft recovery.  The hosts themselves had the latest firmware applied, and I used the custom Dell ESXi ISO when building them.  Each service (Mgmt, LAN, vMotion, storage) was meshed so that no one service depended on a single, multiport NIC adapter, but the hosts still went down.  What was creating the problem?  I won’t leave you hanging.  It was the Broadcom network drivers for ESXi.

Before I figured out what the problem was, here is what I knew:

  • The behavior occurred only on a cluster of 4 Dell M620 hosts.  The other cluster, containing M610s, never experienced this issue.
  • The drops had occurred on each host at least once, typically when there was a higher likelihood of heavy traffic.
  • Various services had been impacted.  One time it was storage; another time it was the LAN side.

Blade configuration background
To understand the symptoms, and the correction, a bit better, it is worth getting an overview of what the Dell M620 blade looks like in terms of network connectivity.  What I show below reflects my 1GbE environment; it would look different with 10GbE, or with switch modules instead of passthrough modules.

The M620 blades come with a built-in Broadcom NetXtreme II BCM57810 10Gbps Ethernet adapter.  This provides two 10Gbps ports on fabric A of the blade enclosure.  These will negotiate down to 1GbE if you have passthroughs on the back of the enclosure, as I do.

There are two slots in each blade that will accept additional mezzanine adapters, for fabric B and fabric C respectively.  In my case, since I also have 1GbE passthroughs on these fabrics, I chose to use the Broadcom NetXtreme BCM5719 1GbE adapter.  Each provides four 1GbE ports, though with passthroughs, only two of the four on each adapter are reachable.  The end result is six 1GbE ports available for use on each blade: two for storage, two for Production LAN traffic, and two for vSphere Mgmt and vMotion.  All services needed (iSCSI, Mgmt, etc.) are assigned so that in the event of a single adapter failure, you’re still good to go.
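
To confirm how those six ports map to vSwitches and services on a given host, the standard listing commands work well.  A minimal check from the ESXi shell (output will vary with your vSwitch names and teaming choices):

esxcfg-vswitch -l

esxcfg-vmknic -l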

And yes, I’d love to go to 10GbE as much as anyone, but that is a larger matter especially when dealing with blades and the enclosure that they reside in.  Feel free to send me a check, and I’ll return the favor with a nice post.

How to diagnose and correct
In one of the cases, this event caused an All Paths Down condition from the host to my storage.  I looked in /scratch/log on the host, with the intent of looking into the vmkernel and vobd.log files to see what was up.  The following command returned several entries that looked like the ones below.

less /scratch/log/vobd.log

2013-04-03T16:17:33.849Z: [iscsiCorrelator] 6384105406222us: [esx.problem.storage.iscsi.target.connect.error] Login to iSCSI target iqn.2001-05.com.equallogic:0-8a0906-d0a034d04-d6b3c92ecd050e84-vmfs001 on vmhba40 @ vmk3 failed. The iSCSI initiator could not establish a network connection to the target.

2013-04-03T16:17:44.829Z: [iscsiCorrelator] 6384104156862us: [vob.iscsi.target.connect.error] vmhba40 @ vmk3 failed to login to iqn.2001-05.com.equallogic:0-8a0906-e98c21609-84a00138bf64eb18-vmfs002 because of a network connection failure.
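
Since vobd.log pointed at a network problem, it is also worth scanning the vmkernel log in the same directory for NIC link-state or driver messages.  A quick, hedged example of the kind of search I mean:

grep -i vmnic /scratch/log/vmkernel.log | less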

Then I ran the following to verify what I had for NICs and their associations:

esxcfg-nics -l

Name    PCI           Driver      Link Speed     Duplex MAC Address       MTU    Description
vmnic0  0000:01:00.00 bnx2x       Up   1000Mbps  Full   00:22:19:9e:64:9b 1500   Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet
vmnic1  0000:01:00.01 bnx2x       Up   1000Mbps  Full   00:22:19:9e:64:9e 1500   Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet
vmnic2  0000:03:00.00 tg3         Up   1000Mbps  Full   00:22:19:9e:64:9f 1500   Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
vmnic3  0000:03:00.01 tg3         Up   1000Mbps  Full   00:22:19:9e:64:a0 1500   Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
vmnic4  0000:03:00.02 tg3         Down 0Mbps     Half   00:22:19:9e:64:a1 1500   Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
vmnic5  0000:03:00.03 tg3         Down 0Mbps     Half   00:22:19:9e:64:a2 1500   Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
vmnic6  0000:04:00.00 tg3         Up   1000Mbps  Full   00:22:19:9e:64:a3 1500   Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
vmnic7  0000:04:00.01 tg3         Up   1000Mbps  Full   00:22:19:9e:64:a4 1500   Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
vmnic8  0000:04:00.02 tg3         Down 0Mbps     Half   00:22:19:9e:64:a5 1500   Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet
vmnic9  0000:04:00.03 tg3         Down 0Mbps     Half   00:22:19:9e:64:a6 1500   Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet

Knowing which vmnics were being used for storage traffic, I took a look at the driver version for vmnic3:

ethtool -i vmnic3

driver: tg3
version: 3.124c.v50.1
firmware-version: FFV7.4.8 bc 5719-v1.31
bus-info: 0000:03:00.1
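
Another way to see which driver package is actually installed on the host (rather than what a single vmnic reports) is the VIB inventory.  A simple check for the tg3 package:

esxcli software vib list | grep tg3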

Time to check and see if there were updated drivers.

Finding and updating the drivers
The first step was to check the VMware Compatibility Guide for this particular NIC.  The good news was that there was an updated driver for this adapter: 3.129d.v50.1.  I downloaded the latest driver (VIB) for that NIC to a datastore that was accessible to the host, so that it could be installed.  The process of making the driver available for installation, as well as the installation itself, can certainly be done with VMware Update Manager, but for my example, I’m performing these steps from the command line.  Remember to go into maintenance mode first.
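
If you want to stay at the command line for that step too, maintenance mode can be entered with vim-cmd – a sketch, assuming any running VMs have already been evacuated or powered off:

vim-cmd hostsvc/maintenance_mode_enter

With the host in maintenance mode, install the new driver: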

esxcli software vib install -v /vmfs/volumes/VMFS001/drivers/broadcom/net-tg3-3.129d.v50.1-1OEM.500.0.0.472560.x86_64.vib

Installation Result
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true
VIBs Installed: Broadcom_bootbank_net-tg3_3.129d.v50.1-1OEM.500.0.0.472560
VIBs Removed: Broadcom_bootbank_net-tg3_3.124c.v50.1-1OEM.500.0.0.472560
VIBs Skipped:

The final steps will be to reboot the host and verify the results.

ethtool -i vmnic3

driver: tg3
version: 3.129d.v50.1
firmware-version: FFV7.4.8 bc 5719-v1.31
bus-info: 0000:03:00.1

Conclusion
I initially suspected that the problems were driver related, but the symptoms generated by the bad drivers gave the impression that there was a larger issue at play.  Nevertheless, I couldn’t get the new drivers loaded up fast enough, and since that time (about 3 months), the hosts have been rock solid and behaving normally.

Helpful links
Determining Network/Storage firmware and driver version in ESXi
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1027206

VMware Compatibility Guide
http://www.vmware.com/resources/compatibility/search.php?deviceCategory=io&productid=19946&deviceCategory=io&releases=187&keyword=bcm5719&page=1&display_interval=10&sortColumn=Partner&sortOrder=Asc

My VMworld “Call for Papers” submission, and getting more involved

It is a good sign that you are in the right business when you get tremendous satisfaction from your career – whether it be from the daily challenges at work, or through professional growth, learning, or sharing.  It’s been an exciting month for me, as I’ve taken a few steps to get more involved.

First, I decided to submit my application for the 2013 VMware vExpert program.  I’ve sat on the sidelines, churning out blog posts for 4 years now, but with the encouragement of a few of my fellow VMUG comrades and friends, I decided to throw my hat in the ring with others equally as enthusiastic as I am about what many of us do for a living.  The list has not been announced yet, so we’ll see what happens.  I’m also now officially part of the Seattle VMUG steering committee, contributing where I can to provide more value to the local VMUG community.

Next, I was honored to be recognized as a 2013 Dell TechCenter Rockstar.  Started in 2012, the DTC Rockstar program recognizes those Subject Matter Experts and enthusiasts who share their knowledge on the portfolio of Dell solutions in the Enterprise.  I am flattered to be in great company with the others who have been recognized for their efforts.  Congratulations to them as well.

And finally, I took a stab at submitting an abstract for consideration as a possible session at this year’s VMworld.  I can’t say I ever imagined a scenario in which I would be responding to VMware’s annual “Call for Papers”, but with real-life use cases come really interesting stories, and I had one to tell.  My session title is:

4370 – Compiling code in virtual machines: Identifying bottlenecks and optimizing performance to scale out development environments

This session was inspired by part 1 and part 2 of “Vroom! Scaling up Virtual Machines in vSphere to meet performance requirements.”  What transpired from the project was a fascinating exercise in assumptions, bottleneck chasing, and a modern virtualized infrastructure’s ability to scale up computational power immediately for an organization.  I’ve received great feedback from those posts, but they just skimmed the surface of what was learned.  What better way to demonstrate such a unique use case than to share the details with those who really care?  Take a look out at http://www.vmworld.com/cfp.jspa.  My submission is under the “Customer Case Studies” track, number 4370.  Public voting is now open.  If you don’t have a VMworld account, just create one – it’s free.  Click on the session to read the abstract, and if you like what you see, click on the “thumbs up” button to put in a vote for it.

Spend enough time in IT, and it turns out you might have an opinion or two on things.  How to make it all work, and how to keep your sanity.  I haven’t quite figured out the definitive answers to either one of those yet, but when there is an opportunity to contribute, I try my best to pay it forward to the great communities of geeks out there.  Thanks for reading.

Configuring a VM for SNMP monitoring using Cacti

There are a number of things that I don’t miss about old physical infrastructures.  Near the top of the list is a general lack of visibility into each and every system.  Horribly underutilized hardware ran happily alongside overtaxed or misconfigured systems, and it all looked the same.  Fortunately, virtualization has changed much of that nonsense, and performance trending data for VMs and hosts is a given.

Partners in the VMware ecosystem are able to take advantage of vSphere’s extensibility by offering useful tools to improve management and monitoring of other components throughout the stack.  The Dell Management Plug-in for VMware vCenter is a great example of that; it does a good job of integrating side-band management and event-driven alerting inside of vCenter.  However, in many cases you still need to look at performance trending data for devices that may not inherently have that ability on their own.  Switchgear is a great example of a resource that can be left in the dark.  SNMP can be used to monitor switchgear and other types of devices, but its use is almost always absent in smaller environments.  There are simple options, though, to help provide better visibility even for the smallest of shops, and this post will provide what you need to know to get started.

In this example, I will be setting up a general-purpose SNMP management system running Cacti to monitor the performance of some Dell PowerConnect switchgear.  Cacti leverages RRDTool’s framework to deliver time-based performance monitoring and graphing.  It can monitor a number of different types of systems supporting SNMP, but switchgear provides the best example that most everyone can relate to.  At a very affordable price (free), Cacti will work just fine in helping close these visibility gaps.

Monitoring VM
The first thing to do is to build a simple Linux VM for the purpose of SNMP management.  One would think there would be a free virtual appliance out on the VMware Virtual Appliance Marketplace for this purpose, but if there is, I couldn’t find it.  Any distribution will work, but my instructions will cater to Debian-based distributions – particularly Ubuntu, or an Ubuntu derivative like Linux Mint (my personal favorite).  Set it for 1 vCPU and 512 MB of RAM.  Assign it a static address on your network management VLAN (if you have one); otherwise, your production LAN will be fine.  While it is a single-purpose VM, you still have to live with it, so there is no need to punish yourself by leaving it bare bones.  Go ahead and install the typical packages for convenience or functionality, along the lines of the sketch below.
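
The exact package list is a matter of taste; as one example on Ubuntu (openssh-server is the package that provides the ssh daemon):

apt-get update

apt-get install vim openssh-server ntp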

Templates are an option that extends the functionality of Cacti.  In the case of the PowerConnect switches, the template will assist in providing information on CPU, memory, and temperature.  A template for the PowerConnect 6200 line of switches can be found here.  The instructions below will include how to install it.

Prepping SNMP on the switchgear

In the simplest of configurations (which I will show here), there really isn’t much to SNMP.  For this scenario, we will be providing read-only access to SNMP via a shared community name.  The monitoring VM will poll these devices and update its database accordingly.

If your switchgear is isolated, as your SAN switchgear might be, then there are a few options to make the switches visible in the right way.  Regardless of which option you use, the key is to make sure that your iSCSI storage traffic lives on a different VLAN from the management interface of the device.  I outline a good way to do this in “Reworking my PowerConnect 6200 switches for my iSCSI SAN.”

There are a couple of options for connecting the isolated storage switches in order to gather SNMP data:

Option 1:  Connect a dedicated management port on your SAN switch stack back to your LAN switch stack.

Option 2:  Expose the SAN switch management VLAN using a port group on your iSCSI vSwitch. 

I prefer option 1, but regardless, if it is iSCSI switches you are dealing with, make sure that management traffic is on a different VLAN than your iSCSI traffic to maintain proper isolation.
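
For those opting for option 2, a minimal sketch of exposing the SAN switch management VLAN as a port group on the iSCSI vSwitch from the ESXi shell might look like the following (the vSwitch name, port group name, and VLAN ID are all hypothetical – substitute your own):

esxcli network vswitch standard portgroup add --portgroup-name=SAN-Mgmt --vswitch-name=vSwitch1

esxcli network vswitch standard portgroup set --portgroup-name=SAN-Mgmt --vlan-id=100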

Once the communication is in place, just make a few changes to your PowerConnect switchgear.  Note that community names are case sensitive, so decide on a name and stick with it.

enable

configure

snmp-server location "Headquarters"

snmp-server contact "IT"

snmp-server community mycompany ro ipaddress 192.168.10.12
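
Before moving on, it is worth a quick test that the community name and access list work.  Assuming the monitoring VM has the net-snmp command line tools installed (the “snmp” package on Ubuntu), and using a hypothetical switch management address of 192.168.10.1, a hedged check run from the VM (since the community above is restricted to the VM’s IP) would be:

snmpwalk -v 2c -c mycompany 192.168.10.1 1.3.6.1.2.1.1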

Monitoring VM – Pre-Cacti configuration
Perform the following steps on the VM you will be using to install Cacti.

1.  Install SNMPD, and set aside the default configuration file:

apt-get update

apt-get install snmpd

mv /etc/snmp/snmpd.conf /etc/snmp/snmpd.conf.old

2.  Create a new /etc/snmp/snmpd.conf with the following contents:

rocommunity mycompany

syslocation Headquarters

syscontact IT

3.  Edit /etc/default/snmpd to allow snmpd to listen on all interfaces and use the config file.  Comment out the first line below and replace it with the second line:

SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -g snmp -I -smux -p /var/run/snmpd.pid 127.0.0.1'

SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -g snmp -I -smux -p /var/run/snmpd.pid -c /etc/snmp/snmpd.conf'

4.  Restart the snmpd daemon.

sudo /etc/init.d/snmpd restart
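
To confirm the new configuration took effect, query the local daemon with the community name chosen above.  The numeric OID below is sysLocation.0, so it should come back with “Headquarters”:

snmpget -v 2c -c mycompany localhost 1.3.6.1.2.1.1.6.0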

5.  Install additional perl packages:

apt-get install libsnmp-perl

apt-get install libnet-snmp-perl

Monitoring VM – Cacti Installation
6.  Install Cacti itself:

apt-get update

apt-get install cacti

During the installation process, MySQL will be installed, and the installer will ask what you would like the MySQL root password to be.  It will then ask what you would like Cacti’s MySQL password to be.  Choose passwords as desired.

The Cacti installation is now available via http://[cactiservername]/cacti with a username and password of "admin".  Cacti will then ask you to change the admin password.  Choose whatever you wish.

7.  Download the PowerConnect add-on from http://docs.cacti.net/usertemplate:host:dell:powerconnect:62xx and unpack both zip files.

8.  Import the host template via the GUI interface.  Log into Cacti, and go to Console > Import Templates, select the desired file (in this case, cacti_host_template_dell_powerconnect_62xx_switch.xml), and click Import.

9.  Copy the 62xx_cpu.pl script into the Cacti script directory on the server (/usr/share/cacti/site/scripts).  It may need executable permissions, as shown below.  If you downloaded it to a Windows machine but need to copy it to the Linux VM, WinSCP works nicely for this.
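
Setting the executable bit is a one-liner (path per the step above):

chmod +x /usr/share/cacti/site/scripts/62xx_cpu.pl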

10.  Depending on how things were copied, there might be some Windows-style (CRLF) line endings in the .pl file.  You can clean up the 62xx_cpu.pl file by running the following:

dos2unix 62xx_cpu.pl

Using Cacti
You are now ready to run Cacti so that you can connect and monitor your devices. This example shows how to add the device to Cacti, then monitor CPU and a specific data port on the switch.

1.  Launch Cacti from your workstation by browsing out to http://[cactiservername]/cacti  and enter your credentials.

2.  Create a new Graph Tree via Console > Graph Trees > Add.  You can call it something like “Switches” then click Create.

3.  Create a new device via Console > Devices > Add.  Give it a friendly description, and the host name of the device.  Enter the SNMP community name you decided upon earlier.  In my example above, I show the community name as “mycompany”, but choose whatever fits.  Remember that community names are case sensitive.

4.  To create a graph for monitoring the CPU of the switch, click Console > Create New Graphs.  In the Host box, select the device you just added.  In the “Create” box, select “Dell Powerconnect 62xx – CPU” and click Create to complete.

5.  To create a graph for monitoring a specific Ethernet port, click Console > Create New Graphs.  In the Host box, select the device you just added.  Put a check mark next to the port number desired, and select In/Out bits with total bandwidth.  Click Create > Create to complete. 

6.  To add the chart to the proper graph tree, click Console > Graph Management.  Put a check mark next to the graphs desired, and change the “Choose an action” box to “Place on a Tree [Tree name]”.

Now when you click on Graphs, you will see your two items being monitored.

By clicking on the magnifying glass icon, or by using the “Graph Filters” near the top of the screen, one can easily zoom in or out to various sampling periods to suit your needs.

Conclusion
Using SNMP and a tool like Cacti can provide historical performance data for non-virtualized devices and systems in ways you’ve grown accustomed to in vSphere environments.  How hard are your switches running?  How much internet bandwidth does your organization use?  This will tell you.  Give it a try.  You might be surprised at what you find.