Configuring a VM for SNMP monitoring using Cacti

There are a number of things that I don't miss about old physical infrastructures.  One near the top of the list is a general lack of visibility into each and every system.  Horribly underutilized hardware ran happily alongside overtaxed or misconfigured systems, and it all looked the same.  Fortunately, virtualization has changed much of that nonsense, and performance trending data for VMs and hosts is a given.

Partners in the VMware ecosystem are able to take advantage of the platform's extensibility by offering useful tools that improve management and monitoring of other components throughout the stack.  The Dell Management Plug-in for VMware vCenter is a great example of that. It does a good job of integrating side-band management and event-driven alerting inside of vCenter.  However, in many cases you still need to look at performance trending data for devices that may not inherently have that ability on their own.  Switchgear is a great example of a resource that can be left in the dark.  SNMP can be used to monitor switchgear and other types of devices, but its use is almost always absent in smaller environments.  There are, however, simple options to help provide better visibility even for the smallest of shops.  This post will provide what you need to know to get started.

In this example, I will be setting up a general purpose SNMP management system running Cacti to monitor the performance of some Dell PowerConnect switchgear.  Cacti leverages RRDTool's framework to deliver time-based performance monitoring and graphing.  It can monitor a number of different types of systems supporting SNMP, but switchgear provides the best example that most everyone can relate to.  At a very affordable price (free), Cacti will work just fine in helping with these visibility gaps.

Monitoring VM
The first thing to do is to build a simple Linux VM for the purpose of SNMP management.  One would think there would be a free virtual appliance out on the VMware Virtual Appliance Marketplace for this purpose, but if there is, I couldn't find it.  Any distribution will work, but my instructions will cater toward the Debian-based distributions – particularly Ubuntu, or an Ubuntu clone like Linux Mint (my personal favorite).  Set it for 1 vCPU and 512 MB of RAM.  Assign it a static address on your network management VLAN (if you have one); otherwise, your production LAN will be fine.  While it is a single-purpose-built VM, you still have to live with it, so no need to punish yourself by leaving it bare bones.  Go ahead and install the typical packages (e.g. vim, ssh, ntp, etc.) for convenience or functionality.
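As a quick sketch of that convenience install on an Ubuntu-based VM (package names assume the stock repositories), something like the following would cover the basics:

sudo apt-get update
sudo apt-get install vim openssh-server ntp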

Templates are an option that extends the functionality of Cacti.  In the case of the PowerConnect switches, the template will assist in providing information on CPU, memory, and temperature.  A template for the PowerConnect 6200 line of switches can be found here.  The instructions below will include how to install it.

Prepping SNMP on the switchgear

In the simplest of configurations (which I will show here), there really isn't much to SNMP.  For this scenario, you will be providing read-only SNMP access via a shared community name. The monitoring VM will poll these devices and update its database accordingly.

If your switchgear is isolated, as your SAN switchgear might be, then there are a few options to make the switches visible in the right way. Regardless of which option you use, the key is to make sure that your iSCSI storage traffic lives on a different VLAN from the management interface of the device.  I outline a good way to do this in “Reworking my PowerConnect 6200 switches for my iSCSI SAN.”

There are a couple of options for connecting to the isolated storage switches to gather SNMP data: 

Option 1:  Connect a dedicated management port on your SAN switch stack back to your LAN switch stack.

Option 2:  Expose the SAN switch management VLAN using a port group on your iSCSI vSwitch. 

I prefer option 1, but regardless, if it is iSCSI switches you are dealing with, make sure that management traffic is on a different VLAN than your iSCSI traffic to maintain the proper isolation of iSCSI traffic.  A rough sketch of option 2 is shown below.
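For option 2, the general idea is just a port group on the iSCSI vSwitch tagged with the SAN management VLAN. As a sketch only, on an ESXi 5.x host this might look something like the following, where the port group name (SAN-Mgmt), vSwitch name (vSwitch1), and VLAN ID (10) are hypothetical values to adjust for your environment:

esxcli network vswitch standard portgroup add --portgroup-name=SAN-Mgmt --vswitch-name=vSwitch1
esxcli network vswitch standard portgroup set --portgroup-name=SAN-Mgmt --vlan-id=10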

Once the communication is in place, just make a few changes to your PowerConnect switchgear.  Note that community names are case-sensitive, so decide on a name and stick with it.

enable
configure
snmp-server location "Headquarters"
snmp-server contact "IT"
snmp-server community mycompany ro ipaddress 192.168.10.12
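Before moving on, it may be worth a quick check that the switch actually answers SNMP queries.  Assuming the Net-SNMP command-line tools are installed on the monitoring VM (apt-get install snmp), and using [switchIP] as a placeholder for the switch management address, something like this should walk the system subtree:

snmpwalk -v 2c -c mycompany [switchIP] 1.3.6.1.2.1.1

If there is no response with version 2c, trying -v 1 is a reasonable next step.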

Monitoring VM – Pre Cacti configuration
Perform the following steps on the VM you will be using to install Cacti.

1.  Install and configure SNMPD

apt-get update
apt-get install snmpd
mv /etc/snmp/snmpd.conf /etc/snmp/snmpd.conf.old

2.  Create a new /etc/snmp/snmpd.conf with the following contents:

rocommunity mycompany
syslocation Headquarters
syscontact IT

3.  Edit /etc/default/snmpd to allow snmpd to listen on all interfaces and use the config file.  Comment out the first line below and replace it with the second line:

SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -g snmp -I -smux -p /var/run/snmpd.pid 127.0.0.1'
SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -g snmp -I -smux -p /var/run/snmpd.pid -c /etc/snmp/snmpd.conf'

4.  Restart the snmpd daemon.

sudo /etc/init.d/snmpd restart
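To confirm the local daemon is responding before moving on to Cacti, a quick query against localhost should return the sysLocation string you just configured ("Headquarters" in this example). This assumes the Net-SNMP client tools are installed (apt-get install snmp):

snmpget -v 2c -c mycompany localhost 1.3.6.1.2.1.1.6.0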

5.  Install additional perl packages:

apt-get install libsnmp-perl
apt-get install libnet-snmp-perl

Monitoring VM – Cacti Installation
6.  Install Cacti on the same VM:

apt-get update
apt-get install cacti

During the installation process, MySQL will be installed, and the installer will ask you to set the MySQL root password, and then the password for Cacti's MySQL user.  Choose passwords as desired.

Now the Cacti installation is available via http://[cactiservername]/cacti with a username and password of "admin".  On first login, Cacti will ask you to change the admin password.  Choose whatever you wish.

7.  Download the PowerConnect add-on from http://docs.cacti.net/usertemplate:host:dell:powerconnect:62xx and unpack both zip files.

8.  Import the host template via the GUI interface.  Log into Cacti, and go to Console > Import Templates, select the desired file (in this case, cacti_host_template_dell_powerconnect_62xx_switch.xml), and click Import.

9.  Copy the 62xx_cpu.pl script into the Cacti script directory on the server (/usr/share/cacti/site/scripts).  It may need executable permissions.  If you downloaded it to a Windows machine but need to copy it to the Linux VM, WinSCP works nicely for this.
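If the script does turn out to need executable permissions, something along these lines (using the path from the step above) should take care of it:

sudo chmod +x /usr/share/cacti/site/scripts/62xx_cpu.pl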

10.  Depending on how things were copied, there might be some stray Windows-style line endings in the .pl file.  You can clean up the 62xx_cpu.pl file by running the following:

dos2unix 62xx_cpu.pl
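If the dos2unix command isn't found, the utility can be installed first; on Ubuntu/Debian the package is typically dos2unix (some older releases provide the same functionality through the tofrodos package):

sudo apt-get install dos2unix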

Using Cacti
You are now ready to run Cacti so that you can connect and monitor your devices. This example shows how to add the device to Cacti, then monitor CPU and a specific data port on the switch.

1.  Launch Cacti from your workstation by browsing out to http://[cactiservername]/cacti  and enter your credentials.

2.  Create a new Graph Tree via Console > Graph Trees > Add.  You can call it something like “Switches” then click Create.

3.  Create a new device via Console > Devices > Add.  Give it a friendly description, and the host name of the device.  Enter the SNMP Community name you decided upon earlier.  In my example above, I show the community name as being “mycompany” but choose whatever fits.  Remember that community names are case sensitive.

4.  To create a graph for monitoring CPU of the switch, click Console > Create New Graphs.  In the host box, select the device you just added.   In the “Create” box, select “Dell Powerconnect 62xx – CPU” and click Create to complete.

5.  To create a graph for monitoring a specific Ethernet port, click Console > Create New Graphs.  In the Host box, select the device you just added.  Put a check mark next to the port number desired, and select In/Out bits with total bandwidth.  Click Create > Create to complete. 

6.  To add the chart to the proper graph tree, click Console > Graph Management.  Put a check mark next to the graphs desired, and change the "Choose an action" box to "Place on a Tree ([Tree name])".

Now when you click on Graphs, you will see your two items to be monitored.


By clicking on the magnifying glass icon, or by using the "Graph Filters" near the top of the screen, you can easily zoom in or out to various sampling periods to suit your needs.

Conclusion
Using SNMP and a tool like Cacti can provide historical performance data for non-virtualized devices and systems in ways you've grown accustomed to in vSphere environments.  How hard are your switches running?  How much internet bandwidth does your organization use?  This will tell you.  Give it a try.  You might be surprised at what you find.

Diagnosing a failed iSCSI switch interconnect in a vSphere environment

The beauty of a well constructed, highly redundant environment is that if a single point fails, systems should continue to operate without issue.  Sometimes knowing what exactly failed is more challenging than it first appears.  This was what I ran into recently, and wanted to share what happened, how it was diagnosed, and ultimately corrected.

A group of two EqualLogic arrays was running happily against a pair of stacked Dell PowerConnect 6224 switches, serving up a 7-node vSphere cluster.  The switches had been rebuilt over a year ago, and since that time they had been rock solid.  Suddenly, the arrays started spitting out all kinds of different errors.  Many of the messages looked similar to these:

iSCSI login to target ‘10.10.0.65:3260, iqn.2001-05.com.equallogic:0-8a0906-b6cc21609-d200014832f4ecfb-vmfs001’ from initiator ‘10.10.0.10:52155, iqn.1998-01.com.vmware:esx1-70a98577’ failed for the following reason:
Initiator disconnected from target during login.

Some of the earliest errors on the array looked like this:

10/1/2012 1:01:11 AM to 10/1/2012 1:01:11 AM
Warning: Member PS6000e network port cannot be reached. Unable to obtain network performance data for the member.
Warning: Member PS6100e network port cannot be reached. Unable to obtain network performance data for the member.
10/1/2012 1:01:11 AM to 10/1/2012 1:01:11 AM
Caution: Some SNMP requests to member PS6100e for disk drive information timed out.
Caution: Some SNMP requests for information about member PS6100e disk drives timed out.

VMs that had guest attached volumes were generating errors similar to this:

Subject: ASMME smartcopy from SVR001: MPIO Reconfiguration Request IPC Error – iqn.2001-05.com.equallogic:0-8a0906-bd5d27503-7ef000ed5d54a8c1-ntfs001 on host SVR001

[01:01:11] MPIO failure during reconfiguration request for target iqn.2001-05.com.equallogic:0-8a0906-476f6bd06-0c500008a0c4c41f-ntfs002 with error status 0x16000000.

[01:01:11] MPIO failure during reconfiguration request for target iqn.2001-05.com.equallogic:0-8a0906-dc0da1609-2fe0014145f4e931-ntfs001 with error status 0x80070006.

Before I had a chance to look at anything, I suspected something was wrong with the SAN switch stack, but was uncertain beyond that.  I jumped into vCenter to see if anything obvious showed up.  But vSphere and all of the VMs were motoring along just like normal.  No failed uplink errors, or anything else noticeable.  I didn’t do much vSphere log fishing at this point because all things were pointing to something on the storage side, and I had a number of tools that could narrow down the problem.  With all things related to storage traffic, I wanted to be extra cautious and prevent making matters worse with reckless attempts to resolve.

First, some background on how EqualLogic arrays work.  All arrays have two controllers, working in an active/passive arrangement.  Depending on the model of array, each controller will have between two and four ethernet ports, with each port having an IP address assigned to it.  Additionally, there will be a single IP address that defines the "group" the member array is a part of.  (The group IP is a single IP used by systems looking for an iSCSI target, letting the intelligence of the arrays figure out how to distribute traffic across interfaces.)  If some of the interfaces can't be contacted (e.g. disconnected cable, switch failure, etc.), the EqualLogic arrays will be smart enough to distribute across the active links.

The ports of each EqualLogic array are connected to the stacked SAN switches in a meshed arrangement for redundancy.  If there were a switch failure, then one wouldn't be able to contact the IP addresses of the ethernet ports connected to that switch.  But using a VM with guest attached volumes (which have direct access to the SAN), I could successfully ping all four interfaces (eth0 through eth3) on each array.  Hmm…

So then I decided to SSH into the array and see if I could perform the same test.  The idea would be to test from one IP on one of the arrays to see if a ping would be successful on eth0 through eth3 on the other array.  The key to doing this is to use an IP of one of the individual interfaces as the source, and not the Group IP.  Controlling the source and the target during this test will tell you a lot.  After connecting to the array via SSH, the syntax for testing the interfaces on the target array would be this:

ping -I "[sourceIP] [destinationIP]"  (quotes are needed!)
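As a purely hypothetical example, if 10.10.0.21 were eth0 on the array you are logged into and 10.10.0.25 were eth2 on the other array, the test would look like this:

ping -I "10.10.0.21 10.10.0.25"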

From one of the arrays, pinging all four interfaces on the second array revealed that only two of the four ports succeeded.  But the earlier test from the VM proved that I could ping all interfaces, so I changed the source IP to one of the interfaces living on the other switch.  I performed the same test, and the opposite results occurred.  The ports that failed on the last test passed on this one, and the ports that passed the last test failed this time.  This seemed to indicate that both switches were up, but the communication between the switches was down. 

While I’ve never seen these errors on switches using stacking modules, I have seen the MPIO errors above on a trunked arrangement.  One might run into these issues more with trunking, as it tends to leave more opportunity for issues caused by configuration errors.  I knew that in this case, the switch configurations had not been touched for quite some time.  The status of the switches via the serial console stated the following:

SANSTACK>show switch
                Management   Standby    Preconfig    Plugged-in   Switch       Code
SW              Status       Status     Model ID     Model ID     Status       Version
1               Mgmt Sw                 PCT6224      PCT6224      OK           3.2.1.3
2               Unassigned              PCT6224                   Not Present  0.0.0.0

The result above wasn't totally surprising, in that if the stacking module was down, the master switch wouldn't be able to gather the information from the other switch.

Dell also has an interesting little tool called "Lasso."  The Dell Lasso Tool will help grab general diagnostics data from a variety of sources (servers, switches, storage arrays).  But in this case, I found it convenient for testing connectivity from the array group itself.  The screen capture below seems to confirm what I learned through the testing above.

[Screen capture: connectivity test results from the Dell Lasso tool]

So the next step was trying to figure out what to do about it.  I wanted to reboot/reload the slave switch, but knowing both switches were potentially passing live data, I didn't want to do anything to compromise the traffic.  So I employed an often overlooked but convenient way of manipulating traffic to the arrays: turning off the interfaces on each array that are connected to the SAN switch that needs to be restarted.  If one turns off the interfaces on each array connected to the switch that needs the maintenance, then there will not be any live data passing through that switch.  Be warned that you'd better have a nice, accurate wiring schematic of your infrastructure so that you know which interfaces can be disabled.  You want to make things better, not worse.

After a restart of the second switch, the interconnect reestablished itself.  The interfaces on the arrays were re-enabled, with all errors disappearing.  I’m not entirely sure why the interconnect went down, but the primary objective was diagnosing and correcting in a safe, deliberate, yet speedy way.  No VMs were down, and the only side effect of the issue was the errors generated, and some degraded performance.  Hopefully this will help you in case you see similar symptoms in your environment.

Helpful Links

Dell Lasso Tool
http://www.dell.com/support/drivers/us/en/555/DriverDetails?driverId=4T3Y6&c=us&l=en&s=biz

Reworking my PowerConnect 6200 switches for my iSCSI SAN
https://vmpete.com/2011/06/26/reworking-my-powerconnect-6200-switches-for-my-iscsi-san/

Dell TechCenter.  A great resource all things related to Dell in the Enterprise.
http://en.community.dell.com/techcenter/b/techcenter/default.aspx

Reworking my PowerConnect 6200 switches for my iSCSI SAN

It sure is easy these days to get spoiled with the flexibility of virtualization and shared storage.  Optimization, maintenance, fail-over, and other adjustments are so much easier than they used to be.  However, there is an occasional reminder that some things are still difficult to change.  For me, that reminder was the switches I use for my SAN.

One of the many themes I kept hearing at this year's Dell Storage Forum (a great experience, I must say) throughout several of the breakout sessions I went to was "get your SAN switches configured correctly."  A nice reminder of something I was all too aware of already: my Dell PowerConnect 6224 switches had not been configured correctly since the day they replaced my slightly less capable (but rock solid) PowerConnect 5424's.  I returned from the forum committed to getting my switchgear updated and configured the correct way.  Now for the tough parts…  What does "correct" really mean when it comes to the 6200 series switches?  And why didn't I take care of this a long time ago?  Here are just a few excuses… er, reasons. 

  • At the time of initial deployment, I had difficulty tracking down documentation written specifically for the 6224's to be configured with iSCSI.  Eventually, I did my best to interpret the configuration settings of the 5424's, and apply the same principles to the 6224's.  Unfortunately, the 6224's are a different animal than the 5424's, and that showed up after I placed them into production – a task that I regretfully rushed.
  • When I deployed them into production, the current firmware was the 2.x generation.  It was my understanding after the deployment that the 2.x firmware on the 6200 series definitely had growing pains.  I also had the unfortunate timing that the next major revision came out shortly after I put them into production.
  • I had two stacked 6224 switches running my production SAN environment (a setup that was quite common for those I asked at the Dell Storage Forum). While experimenting with settings might be fun in a lab, it is no fun, and serious business when they are running a production environment. I wanted to make adjustments just once, but had difficulty confirming settings.
  • When firmware needs to be updated (a conclusion to an issue I was reporting to Technical Support), it is going to take down the entire stack.  This means that you'd better have everything that uses the SAN off, unless you like living dangerously.  Major firmware updates will also require the boot code in each switch to be updated.  A true "lights out" maintenance window that requires everything to be shut down.  The humble little 5424's LAGged together didn't have that problem.
  • The 2.x to 3.x firmware update also required the boot code to be updated.  However, you simply couldn’t run an “update bootcode” command.  The documentation made this very clear.  The PowerConnect Technical Support Team indicated that the two versions ran different algorithms to unpack the contents, which was the reason for yet another exception to the upgrade process. 

One of the many best practices recommended at the Forum was to stack the switches instead of LAGing them.  Stack, stack, stack was drilled into everyone’s head.  The reasons are very good, and make a lot of sense.

  • Stacking modules in many ways extend the circuitry of a single switch, so the stacking module doesn't have to honor, or be limited by, traditional Ethernet.
  • Managing one switch manages them all.
  • Better, more scalable bandwidth between switches
  • No messing around with LAGs

But here lies the conundrum of many administrators who are responsible for production environments.  While stacked 6224's offer redundancy against hardware failure, they offer no redundancy when it comes to maintenance.  These stacked switches are seen as one logical unit, and may be your weakest link when it comes to maintenance of your virtualized infrastructure.  Interestingly enough, when inquiring further on effective strategies for updating under this topology, I observed that many other users were stuck with this very same dilemma, and the answers provided weren't too exciting.  There were generally three answers I heard for this design decision:

  • Plan for a “lights out” maintenance window.
  • Buy another set of two switches, stack those, then trunk the two stacks together via 10GbE.
  • Buy better switches. 

See why I wasn’t too excited about my options?

Decision time.  I knew I'd suffer a bit of downtime updating the firmware and revamping the configuration no matter what I did.  Do I stack them as recommended, only to be faced with the same dilemma on the next firmware upgrade?  Or do I LAG the switches together so that I avoid this upgrade fiasco in the future?  LAGging is not perfect either, and the more arrays I add (along with the inter-array traffic increasing with new array features), the more it might compound some of the limitations of LAGs. 

What option won out?  I decided to give stacking ONE more try.  I had to keep an eye on my primary objective: correcting my configuration by way of a firmware upgrade, and building up a simple, pristine configuration from scratch.  The idea was that the configuration would initially contain the minimum set of modifications to get the switches working according to best practices.  Then, I could build off of that configuration in the future.  Also influencing my decision was finding out that recommended settings with LAGs apparently change frequently.  For instance, just recently, the recommended setting for flow control on the port channel in a LAG was changed.  These are the types of things I wanted to stay away from.  But with that said, I will continue to keep the option open to LAGging them, for the sole reason that it offers the flexibility for maintenance without shutting down your entire cluster.

So here were my minimum desired results for the switch stack after the upgrade and reconfiguration.  Pretty straightforward. 

  • Management traffic on another VLAN (VLAN 10) on port 1 (for uplinking) and port 2 (for local access).
  • iSCSI traffic on its own VLAN (VLAN 100), on all ports not including the management ports.
  • Essentially no traffic on the Default VLAN
  • Recommended global and port specific settings (flow control, spanning tree, jumbo frames, etc.) for iSCSI traffic endpoint connections
  • iSCSI traffic that was available to be routed through my firewall (for replication).

My configuration rework assumed the successful boot code and firmware upgrade to version 3.2.1.3.  I pondered a few different ways to speed this process up, but ultimately just followed the very good steps provided with the documentation for the firmware.  They were clear, and accurate.

By the way, on June 20th, 2011, Dell released their very latest firmware update (thank you, RSS feed), 3.2.1.3 A23.  This now includes their "Auto Detection" of ports for iSCSI traffic.  Even though the name implies a feature that might be helpful, the documentation did not provide enough information, and I decided to configure manually as originally planned.

For those who might be in the same boat as I was, here are the exact steps I took for building up a pristine configuration after updating the firmware and boot code.  The configuration below was definitely a combined effort by the folks from the EqualLogic and PowerConnect Teams, and me poring over a healthy amount of documentation.  It was my hope that this combined effort would eliminate some of the contradictory information I found in previous best practices articles, forum threads, and KB articles that assumed earlier firmware.  I'd like to thank them for being tolerant of my attention to detail, and of my desire to get this right the first time.  You'll see that the rebuild steps are very simple.  Getting confirmation on this was not.

Step 1:  Reset the switch to defaults (make a backup of your old config, just in case)
enable
delete startup-config
reload

 
Step 2:  When prompted, follow the setup wizard in order to establish your management IP, etc. 
 
Step 3:  Put the switch into admin and configuration mode.
enable
configure

 
Step 4:  Establish Management Settings
hostname [yourstackhostname]
enable password [yourenablepassword]
spanning-tree mode rstp
flowcontrol

 
Step 5: Add the appropriate VLAN IDs to the database and setup interfaces.
vlan database
vlan 10
vlan 100
exit
interface vlan 1
exit
interface vlan 10
name Management
exit
interface vlan 100
name iSCSI
exit
ip address vlan 10
 
Step 6: Create an Etherchannel Group for Management Uplink
interface port-channel 1
switchport mode access
switchport access vlan 10
exit
NOTE: Because the switches are stacked, port one on each switch will be configured in this channel-group, which can then be connected to your core switch or an intermediate switch for management access. Port two on each switch can be used if you need to plug a laptop into the management VLAN, etc.
 
Step 7: Configure/assign Port 1 as part of the management channel-group:
interface ethernet 1/g1
switchport access vlan 10
channel-group 1 mode auto
exit
interface ethernet 2/g1
switchport access vlan 10
channel-group 1 mode auto
exit
 
Step 8: Configure Port 2 as Management Access Switchports (not part of the channel-group):
interface ethernet 1/g2
switchport access vlan 10
exit
interface ethernet 2/g2
switchport access vlan 10
exit
 
Step 9: Configure Ports 3-24 as iSCSI access Switchports
interface range ethernet 1/g3-1/g24
switchport access vlan 100
no storm-control unicast
spanning-tree portfast
mtu 9216
exit
interface range ethernet 2/g3-2/g24
switchport access vlan 100
no storm-control unicast
spanning-tree portfast
mtu 9216
exit
NOTE:  Binding the xg1 and xg2 interfaces into a port-channel is not required for stacking. 
 
Step 10: Exit from Configuration Mode
exit
 
Step 11: Save the configuration!
copy running-config startup-config

Step 12: Back up the configuration
copy startup-config tftp://[yourTFTPip]/conf.cfg
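At this point, a quick sanity check doesn't hurt.  Assuming the stack and VLAN assignments came up as intended, the show switch output seen earlier in this post, along with a show of the VLAN table, should confirm both units participating in the stack and the expected VLAN membership:

show switch
show vlan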

In hindsight, the most time consuming aspect of all of this was trying to confirm the exact settings for the 6224’s in an iSCSI SAN.  Running in second was shutting down all of my VMs, ESX hosts, and anything else that connected to the SAN switchgear.  The upgrade and the rebuild was relatively quick and trouble-free.  I’m thrilled to have this behind me now, and I hope that by passing this information along, you too will have a very simple working example to build your configuration off of.  As for the 6224’s, they are working fine now.  I will continue to keep my fingers crossed that Dell will eventually provide a way to update firmware to a stacked set of 6200 series switches without a lights out maintenance window.

Replication with an EqualLogic SAN; Part 4

 

If you had asked me 6+ weeks ago how far along my replication project would be on this date, I would have thought I’d be basking in the glory of success, and admiring my accomplishments.

…I should have known better.

Nothing like several IT emergencies unrelated to this project to turn one's itinerary into garbage.  A failed server (an old physical storage server that I don't have room for on my SAN), a tape backup autoloader that tanked, some Exchange Server and Domain Controller problems, and a host of other odd things that I don't even want to think about.  It's easy to overlook how much work it takes to keep an IT infrastructure from losing any ground from the day before.  At times, it can make you wonder how any progress is made on anything.

Enough complaining for now.  Let's get back to it.  

 

Replication Frequency

For my testing, all of my replication is set to occur just once a day.  This is to keep it simple, and to help me understand what needs to be adjusted when my offsite replication is finally turned up at the remote site.

I'm not overly anxious to turn up the frequency even if the situation allows.  Some pretty strong opinions exist on how best to configure the frequency of the replicas: do a little bit with a high frequency, or a lot with a low frequency.  What I do know is this.  It is a terrible feeling to lose data, and one of the more overlooked ways to lose data is for bad data to overwrite your good data on the backups before you catch it in time to stop it.  Tapes, disk, simple file cloning, or fancy replication; the principle is the same, and so is the result.  Since the big variable is retention period, I want to see how much room I have to play with before I decide on frequency.  My purpose for offsite replication is disaster recovery… not to make a disaster bigger.

 

Replication Sizes

The million dollar question has always been how much changed data, as perceived from the SAN, will occur for a given period of time on typical production servers.  It is nearly impossible to know this until one is actually able to run real replication tests.  I certainly had no idea.  This would be a great feature for Dell/EqualLogic to add to their solution suite: a way for a storage group to run a simulated replication that simply collects statistics accurately reflecting the amount of data that would be replicated during the test period.  What a great feature for those looking into SAN-to-SAN replication.

Below are my replication statistics for a 30 day period, where the replicas were created once per day, after the initial seed replica was created.

Average data per day per VM

  • 2 GB for general servers (service based)
  • 3 GB for servers with guest iSCSI attached volumes.
  • 5.2 GB for code compiling machines

Average data per day for guest iSCSI attached data volumes

  • 11.2 GB for Exchange DB and Transaction logs (for a 50GB database)
  • 200 MB for a SQL Server DB and Transaction logs
  • 2 GB for SharePoint DB and Transaction logs

The replica sizes for the VMs were surprisingly consistent.  Our code compiling machines had larger replica sizes, as they temporarily write some data to the VMs during their build processes.

The guest iSCSI attached data volumes naturally varied more from day-to-day activities.  Weekdays had larger amounts of replicated data than weekends.  This was expected.
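As a rough, purely hypothetical sizing exercise: if the combined daily change across VMs and guest attached volumes came to about 20 GB, pushing that across a fully utilized 10 Mbps WAN link would take roughly 160,000 Mb / 10 Mbps = 16,000 seconds, or around 4.5 hours.  Back-of-the-envelope math like this is useful when deciding how frequently replicas can realistically be sent offsite.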

Some servers, and how they generate data, may stick out like sore thumbs.  For instance, our source code control server uses a crude (but important) form of application-layer backup.  The result is that for 75 GB worth of repositories, it would generate 100+ GB of changed data that it would want to replicate.  If the backup mechanism (which is a glorified file copy and package dump) is turned off, the amount of changed data is down to a very reasonable 200 MB per day.  This is a good example of how we will have to change our practices to accommodate replication.

 

Decreasing the amount of replicated data

Up to this point, the only step taken to reduce the amount of replicated data has been the adjustment made in vCenter to move the VMs' swap files onto another VMFS volume that will not be replicated.  That of course only affects the VMs' swap files, not the paging files controlled by the guest OS inside each VM.  I suspect that a healthy amount of the changed data on the VMs is the OS paging files.  The amount of changed data on those VMs looked suspiciously similar to the amount of RAM assigned to the VM.  There typically is some correlation between how much RAM an OS has to run with and the size of the page file.  This is pure speculation at this point, but certainly worth looking into.

The next logical step would be to figure out what could be done to reconfigure the VMs to place their paging/swap files in a different, non-replicated location.  Two issues come to mind when I think about this step. 

1.)  This adds an unknown amount of complexity (for deploying and restoring) to the systems running.  You'd have to be confident in the behavior of each OS type when it comes to restoring from a replica where it expects to see a page file in a certain location, but does not.  The scalability of this approach would also have to be considered.  It might be okay for a few machines, but how about a few hundred?  I don't know.

2.)  It is unknown how much of a payoff there would be.  If the amount of data per VM gets reduced by, say, 80%, then that would be pretty good incentive.  If it's more like 10%, then not so much.  It's disappointing that there seems to be only marginal documentation on making such changes.  I will look to test this when I have some time, and report anything interesting that I find along the way.

 

The fires… unrelated, and related

One of the first problems to surface recently were issues with my 6224 switches.  These were the switches that I put in place of our 5424 switches to provide better expandability.  Well, something wasn’t configured correctly, because the retransmit ratio was high enough that SANHQ actually notified me of the issue.  I wasn’t about to overlook this, and reported it to the EqualLogic Support Team immediately.

I was able to get these numbers under control by reconfiguring the NICs on my ESX hosts to talk to the SAN with standard frames.  Not a long term fix, but for the sake of the stability of the network, the most prudent step for now.

After working with the 6224's, they do seem to behave noticeably differently than the 5424's.  They are more difficult to configure, and the suggested configurations from the Dell documentation were more convoluted and contradictory.  Multiple documents and deployment guides had inconsistent information.  Technical Support from Dell/EqualLogic has been great in helping me determine what the issue is.  Unfortunately, some of the potential fixes can be very difficult to execute.  Firmware updates on a stacked set of 6224's will result in the ENTIRE stack rebooting, so you have to shut down virtually everything if you want to update the firmware.  The ultimate fix for this would be a revamp of the deployment guides (or better yet, just one deployment guide) for the 6224's that nullifies any previous documentation.  By way of comparison, the 5424 switches were, and are, very easy to deploy. 

The other issue that came up was some unexpected behavior regarding replication and its use of free pool space.  I don't have any empirical evidence to tie the two together, but this is what I observed.

During this past month in which I had an old physical storage server fail on me, there was a moment where I had to provision what was going to be a replacement for this box, as I wasn’t even sure if the old physical server was going to be recoverable.  Unfortunately, I didn’t have a whole lot of free pool space on my array, so I had to trim things up a bit, to get it to squeeze on there.  Once I did, I noticed all sorts of weird behavior.

1.  Since my replication jobs (with ASM/ME and ASM/VE) leverage the free pool space for the temporary replica/snapshot that is created on the source array, this caused problems.  The biggest one was that my Exchange server would completely freeze during its ASM/ME snapshot process.  Perhaps I had this coming to me, because I deliberately configured it to use free pool space (as opposed to a replica reserve) for its replication.  How it behaved caught me off guard, and made it interesting enough for me to never want to cut it close on free pool space again.

2.  ASM/VE replica jobs also seemed to behave oddly with very little free pool space.  Again, this was self-inflicted because of my configuration settings.  It left me desiring a feature that would allow you to set a threshold so that in the event of x amount of free pool space remaining, replication jobs would simply not run.  This goes for ASM/VE and ASM/ME.

Once I recovered that failed physical system, I was able to remove the VM I had set aside for emergency turn-up.  That increased my free pool space back up over 1 TB, and all worked well from that point on. 

 

Timing

Lastly, one subject came up that doesn't show up in any deployment guide I've seen.  The timing of all this protection shouldn't be overlooked.  One wouldn't want to stack several replication jobs on top of each other that use the same free pool space but haven't had time to complete.  Other snapshot jobs, replicas, consistency checks, traditional backups, etc., should be well coordinated to keep overlap to a minimum.  If you are limited on resources, you may also be able to use timing to your advantage.  For instance, set your daily replica of your Exchange database to occur at 5:00am, and your daily snapshot to occur at 5:00pm.  That way, you have reduced your maximum loss period from 24 hours to 12 hours, just by offsetting the times.