Reworking my PowerConnect 6200 switches for my iSCSI SAN

It sure is easy these days to get spoiled with the flexibility of virtualization and shared storage.  Optimization, maintenance, fail-over, and other adjustments are so much easier than they used to be.  However, there is an occasional reminder that some things are still difficult to change.  For me, that reminder was my switches I use for my SAN.

One of the many themes I kept hearing at this year’s Dell Storage Forum (a great experience I must say) throughout several of the breakout sessions I went to was “get your SAN switches configured correctly.”  A nice reminder to something I was all too aware of already; my Dell PowerConnect 6224 switches were not configured correctly since the day they replaced my slightly less capable (but rock solid) PowerConnect 5424’s.  I returned from the forum committed to getting my switchgear updated and configured the correct way.  Now for the tough parts…  What does “correct” really mean when it comes to the 6200 series switches?  And why didn’t I take care of this a long time ago?  Here are just a few excuses reasons. 

  • At the time of initial deployment, I had difficulty tracking down documentation written specifically for the 6224’s to be configured with iSCSI.  Eventually, I did my best to interpret the configuration settings of the 5424’s, and apply the same principals to the 6224’s.  Unfortunately, the 6224’s are a different animal than the 5424’s, and that showed up after I placed them into production – a task that I regretfully rushed.
  • When I deployed them into production, the current firmware was the 2.x generation.  It was my understanding after the deployment that the 2.x firmware on the 6200 series definitely had growing pains.  I also had the unfortunate timing that the next major revision came out shortly after I put them into production.
  • I had two stacked 6224 switches running my production SAN environment (a setup that was quite common for those I asked at the Dell Storage Forum). While experimenting with settings might be fun in a lab, it is no fun, and serious business when they are running a production environment. I wanted to make adjustments just once, but had difficulty confirming settings.
  • When firmware needs to be updated (a conclusion to an issue I was reporting to Technical Support), it is going to take down the entire stack.  This means that you’d better have everything that uses the SAN off unless you like living dangerously.  Major firmware updates will also require the boot code in each switch to be updated.  A true “lights out” maintenance window that required everything to be shut down.  The humble little 5424’s LAGd together didn’t have that problem.
  • The 2.x to 3.x firmware update also required the boot code to be updated.  However, you simply couldn’t run an “update bootcode” command.  The documentation made this very clear.  The PowerConnect Technical Support Team indicated that the two versions ran different algorithms to unpack the contents, which was the reason for yet another exception to the upgrade process. 

One of the many best practices recommended at the Forum was to stack the switches instead of LAGing them.  Stack, stack, stack was drilled into everyone’s head.  The reasons are very good, and make a lot of sense.

  • Stacking modules in many ways extend the circuiting of a single switch, thus the stacking module doesn’t have to honor or be limited by traditional Ethernet.
  • Managing one switch manages them all.
  • Better, more scalable bandwidth between switches
  • No messing around with LAG’s

But here lays the conundrum of many Administrators who are responsible for production environments.  While stacked 6224’s offer redundancy against hardware failure, they offer no redundancy when it comes to maintenance.  These stacked switches are seen as one logical unit, and may be your weakest link when it comes to maintenance of your virtualized infrastructure.  Interestingly enough, when inquiring further on effective strategies for updating under this topology, I observed a few things;  many other users who were stuck with this very same dilemma, and the answers provided weren’t too exciting.  There were generally three answers I heard from this design decision:

  • Plan for a “lights out” maintenance window.
  • Buy another set of two switches, stack those, then trunk the two together via 10Gbe,
  • Buy better switches. 

See why I wasn’t too excited about my options?

Decision time.  I knew I’d suffer a bit of downtime updating the firmware and revamping the configuration no matter what I did.  Do I stack them as recommended, only to be faced with the same dilemma on the next firmware upgrade?  Or do I LAG the switches together so that I avoid this upgrade fiasco in the future?  LAG’ing is not perfect either, and the more arrays I add (as well as the inter-array traffic increasing with new array features), the more it might compound some of the limitations of LAGs. 

What option won out?  I decided to give stacking ONE more try.  I had to keep the eye on my primary objective; correcting my configuration by way of firmware upgrade and build up a simple, pristine configuration from scratch.  The idea was that the configuration would initially contain the minimum set of modifications to get them working according to best practices.  Then, I could build off of the configuration in the future.  Also influencing my decision was finding out that recommended settings with LAGs apparently change frequently.  For instance, just recently, the recommended setting for flow control for the port channel in a LAG was changed.  These are the types of things I wanted to stay away from.  But with that said, I will continue to keep the option open to LAGing them, for the sole reason that it offers the flexibility for maintenance without shutting down your entire cluster.

So here was my minimum desired results for the switch stack after the upgrade and reconfiguration.  Pretty straight forward. 

  • Management traffic on another VLAN (VLAN 10) on port 1 (for uplinking) and port 2 (for local access).
  • iSCSI traffic on it’s own VLAN (VLAN 100), on all ports not including the management ports.
  • Essentially no traffic on the Default VLAN
  • Recommended global and port specific settings (flow control, spanning tree, jumbo frames, etc.) for iSCSI traffic endpoint connections
  • iSCSI traffic that was available to be routed through my firewall (for replication).

My configuration rework assumed the successful boot code and firmware upgrade to version 3.2.1.3.  I pondered a few different ways to speed this process up, but ultimately just followed the very good steps provided with the documentation for the firmware.  They were clear, and accurate.

By the way, on June 20th, 2011, Dell released their very latest firmware update (thank you RSS feed) to 3.2.1.3 A23.  This now includes their “Auto Detection” of ports for iSCSI traffic.  Even though the name implies a feature that might be helpful, the documentation did not provide enough information needed, and I decided to manually configure as originally planned.

For those who might be in the same boat as I was, here were the exact steps I did for building up a pristine configuration after updating the firmware and boot code.  The configuration below was definitely a combined effort by the folks from the EqualLogic and PowerConnect Teams, and me pouring over a healthy amount of documentation.  It was my hope that this combined effort would eliminate some of the contradictory information I found in previous best practices articles, forum threads, and KB articles that assumed earlier firmware.  I’d like to thank them for being tolerant of my attention to detail, and to get this right the first time.  You’ll see that the rebuild steps are very simple.  Getting confirmation on this was not.

Step 1:  Reset the switch to defaults (make a backup of your old config, just in case)
enable
delete startup-config
reload

 
Step 2:  When prompted, follow the setup wizard in order to establish your management IP, etc. 
 
Step 3:  Put the switch into admin and configuration mode.
enable
configure

 
Step 4:  Establish Management Settings
hostname [yourstackhostname]
enable password [yourenablepassword]
spanning-tree mode rstp
flowcontrol

 
Step 5: Add the appropriate VLAN IDs to the database and setup interfaces.
vlan database
vlan 10
vlan 100
exit
interface vlan 1
exit
interface vlan 10
name Management
exit
interface vlan 100
name iSCSI
exit
ip address vlan 10
 
Step 6: Create an Etherchannel Group for Management Uplink
interface port-channel 1
switchport mode access
switchport access vlan 10
exit
NOTE: Because the switches are stacked, port one on each switch will be configured in this channel-group which can then be connected to their core switch or intermediate switch for management access. Port two on each switch can be used if they need to plug a laptop into the management VLAN, etc.
 
Step 7: Configure/assign Port 1 as part of the management channel-group:
interface ethernet 1/g1
switchport access vlan 10
channel-group 1 mode auto
exit
interface ethernet 2/g1
switchport access vlan 10
channel-group 1 mode auto
exit
 
Step 8: Configure Port 2 as Management Access Switchports (not part of the channel-group):
interface ethernet 1/g2
switchport access vlan 10
exit
interface ethernet 2/g2
switchport access vlan 10
exit
 
Step 9: Configure Ports 3-24 as iSCSI access Switchports
interface range ethernet 1/g3-1/g24
switchport access vlan 100
no storm-control unicast
spanning-tree portfast
mtu 9216
exit
interface range ethernet 2/g3-2/g24
switchport access vlan 100
no storm-control unicast
spanning-tree portfast
mtu 9216
exit
NOTE:  Binding the xg1 and xg2 interfaces into a port-channel is not required for stacking. 
 
Step 10: Exit from Configuration Mode
exit
 
Step 11: Save the configuration!
copy running-config startup-config

Step 12: Back up the configuration
console#copy startup-config tftp://[yourTFTPip]/conf.cfg

In hindsight, the most time consuming aspect of all of this was trying to confirm the exact settings for the 6224’s in an iSCSI SAN.  Running in second was shutting down all of my VMs, ESX hosts, and anything else that connected to the SAN switchgear.  The upgrade and the rebuild was relatively quick and trouble-free.  I’m thrilled to have this behind me now, and I hope that by passing this information along, you too will have a very simple working example to build your configuration off of.  As for the 6224’s, they are working fine now.  I will continue to keep my fingers crossed that Dell will eventually provide a way to update firmware to a stacked set of 6200 series switches without a lights out maintenance window.

Follow

Get every new post delivered to your Inbox.

Join 641 other followers