Reworking my PowerConnect 6200 switches for my iSCSI SAN

It sure is easy these days to get spoiled by the flexibility of virtualization and shared storage.  Optimization, maintenance, fail-over, and other adjustments are so much easier than they used to be.  However, there is an occasional reminder that some things are still difficult to change.  For me, that reminder was the switches I use for my SAN.

One of the many themes I kept hearing at this year’s Dell Storage Forum (a great experience, I must say) throughout several of the breakout sessions I went to was “get your SAN switches configured correctly.”  A nice reminder of something I was all too aware of already: my Dell PowerConnect 6224 switches had not been configured correctly since the day they replaced my slightly less capable (but rock solid) PowerConnect 5424’s.  I returned from the forum committed to getting my switchgear updated and configured the correct way.  Now for the tough parts…  What does “correct” really mean when it comes to the 6200 series switches?  And why didn’t I take care of this a long time ago?  Here are just a few excuses… I mean, reasons.

  • At the time of initial deployment, I had difficulty tracking down documentation written specifically for configuring the 6224’s with iSCSI.  Eventually, I did my best to interpret the configuration settings of the 5424’s and apply the same principles to the 6224’s.  Unfortunately, the 6224’s are a different animal than the 5424’s, and that showed up after I placed them into production, a task that I regretfully rushed.
  • When I deployed them into production, the current firmware was the 2.x generation.  It was my understanding after the deployment that the 2.x firmware on the 6200 series definitely had growing pains.  I also had the unfortunate timing that the next major revision came out shortly after I put them into production.
  • I had two stacked 6224 switches running my production SAN environment (a setup that was quite common among those I asked at the Dell Storage Forum).  While experimenting with settings might be fun in a lab, it is no fun, and serious business, when the switches are running a production environment.  I wanted to make adjustments just once, but had difficulty confirming settings.
  • When firmware needs to be updated (the conclusion to an issue I had been reporting to Technical Support), it is going to take down the entire stack.  This means that you’d better have everything that uses the SAN off unless you like living dangerously.  Major firmware updates will also require the boot code in each switch to be updated.  A true “lights out” maintenance window that required everything to be shut down.  The humble little 5424’s LAGged together didn’t have that problem.
  • The 2.x to 3.x firmware update also required the boot code to be updated.  However, you simply couldn’t run an “update bootcode” command.  The documentation made this very clear.  The PowerConnect Technical Support Team indicated that the two versions ran different algorithms to unpack the contents, which was the reason for yet another exception to the upgrade process. 

One of the many best practices recommended at the Forum was to stack the switches instead of LAGging them.  Stack, stack, stack was drilled into everyone’s head.  The reasons are very good, and make a lot of sense.

  • Stacking modules in many ways extend the circuitry of a single switch, so the stack interconnect doesn’t have to honor, or be limited by, traditional Ethernet.
  • Managing one switch manages them all.
  • Better, more scalable bandwidth between switches
  • No messing around with LAGs

But here lies the conundrum for many administrators who are responsible for production environments.  While stacked 6224’s offer redundancy against hardware failure, they offer no redundancy when it comes to maintenance.  These stacked switches are seen as one logical unit, and may be your weakest link when it comes to maintenance of your virtualized infrastructure.  Interestingly enough, when inquiring further about effective strategies for updating under this topology, I observed a few things: many other users were stuck with this very same dilemma, and the answers provided weren’t too exciting.  There were generally three answers I heard to this design dilemma:

  • Plan for a “lights out” maintenance window.
  • Buy another set of two switches, stack those, then trunk the two stacks together via 10GbE.
  • Buy better switches. 

See why I wasn’t too excited about my options?

Decision time.  I knew I’d suffer a bit of downtime updating the firmware and revamping the configuration no matter what I did.  Do I stack them as recommended, only to be faced with the same dilemma on the next firmware upgrade?  Or do I LAG the switches together so that I avoid this upgrade fiasco in the future?  LAGging is not perfect either, and the more arrays I add (as well as the inter-array traffic increasing with new array features), the more it might compound some of the limitations of LAGs.

What option won out?  I decided to give stacking ONE more try.  I had to keep my eye on my primary objective: correcting my configuration by way of a firmware upgrade, and building up a simple, pristine configuration from scratch.  The idea was that the configuration would initially contain the minimum set of modifications to get the switches working according to best practices.  Then, I could build off of that configuration in the future.  Also influencing my decision was finding out that recommended settings for LAGs apparently change frequently.  For instance, just recently, the recommended flow control setting for the port channel in a LAG was changed.  These are the types of things I wanted to stay away from.  But with that said, I will continue to keep the option open to LAGging them, for the sole reason that it offers the flexibility for maintenance without shutting down your entire cluster.

So here were my minimum desired results for the switch stack after the upgrade and reconfiguration.  Pretty straightforward.

  • Management traffic on its own VLAN (VLAN 10), on port 1 (for uplinking) and port 2 (for local access).
  • iSCSI traffic on its own VLAN (VLAN 100), on all ports except the management ports.
  • Essentially no traffic on the Default VLAN
  • Recommended global and port specific settings (flow control, spanning tree, jumbo frames, etc.) for iSCSI traffic endpoint connections
  • iSCSI traffic that was available to be routed through my firewall (for replication).

My configuration rework assumed a successful boot code and firmware upgrade to version 3.2.1.3.  I pondered a few different ways to speed this process up, but ultimately just followed the very good steps provided with the documentation for the firmware.  They were clear and accurate.

By the way, on June 20th, 2011, Dell released their very latest firmware update (thank you, RSS feed), 3.2.1.3 A23.  This now includes “Auto Detection” of ports for iSCSI traffic.  Even though the name implies a feature that might be helpful, the documentation did not provide enough information, and I decided to configure manually as originally planned.
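For what it’s worth: my understanding is that the “iSCSI Optimization” feature behind this auto detection is toggled with a single global command on the 3.x firmware.  If you prefer to configure the ports by hand as shown below, something like the following should turn it off.  This is a sketch from my notes, so verify the exact syntax against the CLI reference for your firmware.

configure
no iscsi enable
exit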

For those who might be in the same boat as I was, here are the exact steps I took for building up a pristine configuration after updating the firmware and boot code.  The configuration below was definitely a combined effort by the folks from the EqualLogic and PowerConnect Teams, and me poring over a healthy amount of documentation.  It was my hope that this combined effort would eliminate some of the contradictory information I found in previous best practices articles, forum threads, and KB articles that assumed earlier firmware.  I’d like to thank them for being tolerant of my attention to detail and my desire to get this right the first time.  You’ll see that the rebuild steps are very simple.  Getting confirmation on this was not.

Step 1:  Reset the switch to defaults (make a backup of your old config, just in case)
enable
delete startup-config
reload

 
Step 2:  When prompted, follow the setup wizard in order to establish your management IP, etc. 
 
Step 3:  Put the switch into admin and configuration mode.
enable
configure

 
Step 4:  Establish Management Settings
hostname [yourstackhostname]
enable password [yourenablepassword]
spanning-tree mode rstp
flowcontrol

 
Step 5: Add the appropriate VLAN IDs to the database and setup interfaces.
vlan database
vlan 10
vlan 100
exit
interface vlan 1
exit
interface vlan 10
name Management
exit
interface vlan 100
name iSCSI
exit
ip address vlan 10
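A note on that last command: ip address vlan 10 tells the stack which VLAN carries the management address.  If you skipped the wizard in Step 2, or need to change the address later, it can also be set from global configuration.  The addresses below are placeholders for illustration, so adjust them and double-check the syntax against your firmware’s CLI reference.

ip address 192.168.10.5 255.255.255.0
ip default-gateway 192.168.10.1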
 
Step 6: Create an Etherchannel Group for Management Uplink
interface port-channel 1
switchport mode access
switchport access vlan 10
exit
NOTE: Because the switches are stacked, port one on each switch will be configured in this channel-group, which can then be connected to your core switch or an intermediate switch for management access. Port two on each switch can be used if you need to plug a laptop into the management VLAN, etc.
 
Step 7: Configure/assign Port 1 as part of the management channel-group:
interface ethernet 1/g1
switchport access vlan 10
channel-group 1 mode auto
exit
interface ethernet 2/g1
switchport access vlan 10
channel-group 1 mode auto
exit
 
Step 8: Configure Port 2 as Management Access Switchports (not part of the channel-group):
interface ethernet 1/g2
switchport access vlan 10
exit
interface ethernet 2/g2
switchport access vlan 10
exit
 
Step 9: Configure Ports 3-24 as iSCSI access Switchports
interface range ethernet 1/g3-1/g24
switchport access vlan 100
no storm-control unicast
spanning-tree portfast
mtu 9216
exit
interface range ethernet 2/g3-2/g24
switchport access vlan 100
no storm-control unicast
spanning-tree portfast
mtu 9216
exit
NOTE:  Binding the xg1 and xg2 interfaces into a port-channel is not required for stacking. 
 
Step 10: Exit from Configuration Mode
exit
 
Step 11: Save the configuration!
copy running-config startup-config

Step 12: Back up the configuration
copy startup-config tftp://[yourTFTPip]/conf.cfg
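With the configuration saved and backed up, a few read-only commands give a quick sanity check on VLAN membership, port state, and the management channel-group before powering the SAN back up.  These are standard 6200-series show commands, but confirm them against your firmware’s CLI guide.

show vlan
show interfaces status
show interfaces port-channel 1
show running-config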

In hindsight, the most time-consuming aspect of all of this was trying to confirm the exact settings for the 6224’s in an iSCSI SAN.  Running in second was shutting down all of my VMs, ESX hosts, and anything else that connected to the SAN switchgear.  The upgrade and the rebuild were relatively quick and trouble-free.  I’m thrilled to have this behind me now, and I hope that by passing this information along, you too will have a very simple working example to build your configuration off of.  As for the 6224’s, they are working fine now.  I will continue to keep my fingers crossed that Dell will eventually provide a way to update firmware on a stacked set of 6200 series switches without a lights out maintenance window.

55 Responses to Reworking my PowerConnect 6200 switches for my iSCSI SAN

  1. Tim A-to-Z says:

    Pete,
    Great posting. There are so few resources out there regarding configuration of the Dell 6200 series switches and iSCSI. It is good that you posted your config for others to follow in your footsteps.
    Great Job!

  2. Kevin says:

    Thank you for this. Really, it’s great that you decided to pass on this knowledge, as it has saved me countless hours of support calls to Dell. I just picked up a couple of 6224's along with an EqualLogic SAN. iSCSI SAN performance relies heavily on proper switch hardware and switch configuration. You’ve laid out the config in a logical manner, taking a lot of the guesswork out of it.

    Thank You!

    • ITforMe says:

      Great to hear it helped you, Kevin. The 6224's are some of their most popular switches, which made it all that much more ironic that it was difficult to confirm the correct settings. Good luck.

  3. Greg Moran says:

    Are you seeing any errors on the SAN after enabling jumbo frames on the switch? We are running 6224's and an EqualLogic SAN, and originally had turned on jumbo frames. Despite all of our best efforts, we could not eliminate SAN errors that Dell eventually confirmed were related to the use of jumbo frames on the switch. We have since disabled jumbo frames on the 6224's and aren’t seeing the errors. You mentioned that you installed the newest 6224 firmware: I would be curious if you are seeing any network errors on the SAN side of things now that you have a few weeks of production under your belt with the new config…

    Thanks in advance,
    Greg

    • ITforMe says:

      Thanks for the comments, Greg. Actually, while I did a complete revamp of my switch configurations, I have not changed over my vSwitches on my ESX hosts yet. So the switches are capable of receiving jumbos, but as of now my hosts are just transmitting standard frames. I haven’t changed them yet mostly because I would like to move them up from 4.0 to 4.1, then use the EqualLogic MEM and get the ideal setup from there. Only then will I be able to give a good answer on that. An interesting observation I’ve had about jumbos is that the name itself implies “jumbo” performance and CPU utilization gains, but we know that it might only be in the area of 10%. The only thing jumbo about them are some of the headaches they can induce. That is why I wanted to get my hosts on the latest ESX release before I try jumbos again.

  4. Guillaume says:

    Very interesting article, thanks!  I am currently facing a similar situation, and I am a bit lost with my PowerConnect 6224 and 6248, which are running stacked.  How do you upgrade the firmware of PC6200 Dell switches when they are stacked?  Do you use boot auto-copy-sw to propagate the firmware to the other units of the stack, or do you upgrade them manually, unit by unit?  Is a TFTP update possible, or XMODEM?  The documentation for stacked PC6200 switches is light, and sometimes says the opposite of the firmware release install notes.

    Any experiences on firmware upgrade on PC6224 6248 stacked would be greatly appreciated. Sorry if I am a bit off topic but this is a desperate call ;) .

    No issues for upgrading standalone units.

    Thanks in advance.

    Guillaume.

    • ITforMe says:

      Hi Guillaume,

      If you follow my exact steps for upgrading the firmware and the bootcode, you will be fine. The post is a reflection of my experiences while performing the update. I contemplated other methods of updating the bootcode and firmware, but ultimately, those other options were more disruptive than the ones provided in the documentation.

  5. Guillaume says:

    Thanks for this quick reply, I really appreciate it.  Do you have another post describing the way you updated the firmware on your PC 6224/48 stack?  I am still a bit dry and don’t really know how to update my production stack, so I am really interested in your exact steps for the firmware upgrade.  As I said, I have two options.  One is using boot auto-copy-sw, which should copy the firmware from the master to the entire stack.  I am not very confident about this method, and I am afraid of killing my whole stack in one shot if something goes wrong.

    The other option is to update the switches one by one: power off one member of the stack and do the firmware upgrade via serial cable (slow and painful), or is TFTP possible in such a context?

    Thanks again for your article and feedback.  I am just discovering your blog, so sorry if I have missed the requested information.

    Guillaume.

    • ITforMe says:

      First, print out the Dell white paper titled “Release 3.x Upgrade Procedure” for the Dell PowerConnect 6200 Series. This is going to be the key to everything. Read it, and read it again a few times over. Mark it up as much as you need to make the steps crystal clear. Next, based on your current firmware, follow the directions exactly as stated. Like the post describes, be sure to have everything shut down so that you have no iSCSI traffic going across the switches. You will want to have a laptop with a local serial connection attached to the console port for the work that you’ll be doing. The process essentially copies the .stk file to the switches, treating them as one logical unit. The trick is updating the bootcode (not the firmware). You will want to follow the instructions as stated in the document (and not use the “update bootcode” option from the menu listing). For updating the bootcode, you will essentially be doing the following:

      Step 1: Power down the system (the entire stack). The stand-alone unit, or the management unit and each member unit if in a stacking configuration, must be powered down.

      Step 1a: (unplug ethernet cable)

      Step 2: While logged into the serial console, power up just the management unit, stopping at the Boot Menu.
      Example:
      Boot Menu Jun 30 2009
      Select an option. If no selection in 10 seconds then
      operational code will start.
      1 – Start operational code.
      2 – Start Boot Menu.
      Select (1, 2):2

      Step 3: Select option 7 of the Boot Menu to Update Boot Code.
      NOTE: The Boot Menu options changed after Release 2.0. Option 6 is removed if upgrading from
      Release 2.2.
      Boot Menu Version: 21 November 2008
      Select an option. If no selection in 10 seconds then
      operational code will start.
      1 – Start operational code.
      2 – Start Boot Menu.
      Select (1, 2):2
      Boot Menu 21 November 2008
      Options available
      1 – Start operational code
      2 – Change baud rate
      3 – Retrieve event log using XMODEM
      4 – Load new operational code using XMODEM
      5 – Display operational code vital product data
      6 – Run flash diagnostics
      7 – Update boot code
      8 – Delete backup image
      9 – Reset the system
      10 – Restore configuration to factory defaults (delete config files)
      11 – Activate Backup Image
      12 – Password Recovery Procedure
      [Boot Menu] 7
      Do you wish to update Boot Code and reset the switch? (y/n) y

      Step 4: The switch will automatically reboot after completing the boot code update. If a switch reboot does
      not occur, reboot to the operational code (Option 9).

      Step 5: Move the serial cable to the member switch, then power up, stopping at the boot menu and choosing “2” then “7”.

      Step 6: Upgrade Procedure Completed. It will restart.

      • Guillaume. says:

        Thanks a lot.  My migration plan matches your main steps, and I am aware that the bootcode update must be done from the boot menu.  I agree with you that sticking to the firmware release documentation is the way to go, even if it is sometimes frustrating.

        I would dream of doing this via TFTP, which would be less time consuming.  But well, let’s play it safe: I will forget my idea of boot auto-copy-sw.  The Dell white paper from December is too risky (outdated?) without any prior testing, and I don’t know how it handles the bootcode update in such a case.

        Thanks a lot for all your advices and for the time you have spent on my comments.

        Guillaume.

      • ITforMe says:

        I too can think of many ways that this process could be improved, as it would be nice to offer up some more flexible ways of updating the switches. But of course, my main objective was to make the update a smooth one, so following the recommendations was the best choice for me, and I’m sure for you as well. Best of luck to you.

  6. ITforMe says:

    Hey, thanks for reading. Since I don’t have a switch to uplink to (I wish I did), I can’t tell you for sure what the configuration should be on the switch one would be uplinking to. It will probably vary a bit depending on what kind of switch it is (a great example of how this can vary is illustrated in Scott Lowe’s post here: http://blog.scottlowe.org/2010/12/02/vlan-trunking-between-nexus-5010-and-dell-powerconnect-switches/ ). The other element to consider is what type of traffic you are going to uplink. If you notice, in the configuration provided, only the Management network is being uplinked. So if you were planning to uplink other traffic, you’d have to adjust accordingly.

  7. jamie says:

    Great posting. I am configuring a Citrix environment, but it is basically the same concept. From your great post, how would you handle your VM traffic, since you have configured all the ports from 3-24 for iSCSI traffic? Secondly, how would you separate the networks using VLANs? How is the server environment going to talk to the SAN if routing is not enabled on the VLANs?

    • ITforMe says:

      Thanks for reading. As for your questions…

      The configuration described assumes that one has dedicated switches for iSCSI traffic. This is really the best practice, as opposed to letting a single switch stack handle all traffic, including iSCSI. VLANs will certainly be able to separate traffic, but they don’t remove the load on the switches coming from other kinds of traffic. So it is really best to have two fabrics: your LAN stack and your SAN stack.

      Because of that, there is really not much need to do any inter-VLAN routing for iSCSI traffic. These switches could be connected to a physical uplink if you needed to get the traffic out (e.g. for replication). I do have a stack of PowerConnect 6248’s that I use on my LAN where I do have them configured for inter-VLAN routing, and they handle my various VLANs (LAN traffic, vSphere Management, vMotion, etc.). I’ve been toying with the idea of writing a post on those. Would you be interested?

  8. jamie says:

    Thanks. I would be more than interested. I will be waiting for your post. Thanks for your quick response. Please post it soon.

  9. jamie says:

    I am looking forward to the post. Good day.

  10. jamie says:

    Still waiting for that post.

  11. Conrad says:

    Something else to consider when running these switches in a stack is that if the primary fails, the stack reboots and does a re-election.

  12. Pingback: Diagnosing a failed iSCSI switch interconnect in a vSphere environment « A glimpse into the life of IT

  13. Todd Theoret says:

    Hi Pete. Have you had good, consistent results with your 6224 stacked iSCSI configurations? Any lessons learned over the past 12 months which you care to share?

    Many thanks!

    • vmPete says:

      Hi Todd,

      Yes, I have. Not only in this environment, but in other environments that I have deployed the 6200 series switches for this very purpose (iSCSI traffic), the configuration described has proven to be extremely robust. I’ve also received a tremendous amount of feedback from others who said this has worked out very well for them. When first deploying them, I generally use that as the opportunity to put the latest firmware on them, as that can be difficult to update once they are in production. The only suggestion I might make is to uplink one of the management ports to one’s management VLAN, so that you gain visibility into what the switch is doing. Thanks for reading.

  14. Todd Theoret says:

    One more question, Pete. I have qty 3 Dell PowerEdge 610's with Broadcom 5709c NICs (8 ports each) in an HA 5.1 cluster attached to an MD3220i dual controller. Based on your experience, would you utilize vSphere’s (5.1) iSCSI software initiator or the NIC-dependent HW initiators? Based on the small number of VMs I am going to run, I predict there will be plenty of CPU cycles. Will you please briefly share your experience?

    Thanks again!

    • vmPete says:

      Without hesitation, I recommend sticking with the software based iSCSI initiator. Those who have ventured off and attempted to use some of the HW based initiators (e.g. Broadcom, etc.) have paid the price for it. You’ll ultimately have a more consistent environment as well, knowing that you won’t be depending on particular NICs populated in a host.

      By the way, you will want to read the latest MD3220i information out on TechNet for proper deployment in a vSphere environment. An ounce of prevention here makes a big difference later on.

  15. Todd Theoret says:

    Have you come across a “new” vSphere 5.1 / MD3220i best practices deployment guide? I do have a copy of the Dell PowerVault MD3200i/MD3600i Deploy Guide for VMware ESXi 5.0 Server Software.

    Thanks again for sharing your experience.

  16. BobE says:

    Did you ever get around to enabling Jumbo Frames in your VMware config with the MPIO driver? If so, how did that go?

    I’m about to set up a new EqualLogic SAN with dual 6248 switches, so I’m looking for as much info as I can get from others with basically the same environment.

    • vmPete says:

      Yes. All is good. Another link that will be helpful for you is my recent post on installing the MEM. Guides you along step by step.
      http://vmpete.com/2012/09/20/multipathing-in-vsphere-with-the-dell-equallogic-multipathing-extension-module-mem/

      Let me know if you have any other questions, and thanks for reading!

      • BobE says:

        Here’s a general question about jumbo frames implementation that I can’t seem to find the answer to anywhere… In my environment I have 6 VMware hosts. I understand that jumbo frames won’t be used until every link in the chain has them enabled, but what about other servers? My EqualLogic is already set to use them, I’ll be replacing the switches with jumbo-frame-capable switches soon, and I’ll configure them to allow the larger MTU.

        After that, if I start changing the hosts one-by-one will the other standard frame size hosts have issues talking to the same datastores? I’m not doing anything fancy like fault tolerance or otherwise linked VMs, or virtual distributed switches. I do also have a few VMs that talk to their own iSCSI volume, so I realize I’ll want to update them too, but the question is when do I have to?

        Do I have to take all my VMs down to implement jumbo frames or can it be done one host at a time while keeping the majority of the environment running? Better yet, if I have the spare RAM/CPU, can I vMotion stuff off a host, make the changes, then vMotion it back onto that newly jumbo enabled host and not have to bring down much (with the exception of the VMs that are mounting iSCSI volumes directly instead of using VMDK files for disks)?

      • vmPete says:

        You’ll want to do your own due diligence on this, but based on what you said your environment is like, your steps might look similar to this:

        1. Don’t do a thing until you have fully/properly configured switches dedicated for your iSCSI fabric. (no sharing via VLAN with your LAN based traffic). Triple check your configuration per mfr and Eql best practices before moving forward.
        2. Changing over to new switches is often the biggest hassle. Set aside a time/maintenance window for the transition. Even if they are set for jumbos, and the arrays negotiate for jumbos, no jumbos will be passed until the hosts are changed. Let the dust settle for perhaps a few days, then move onto the matter of the hosts.
        3. When the switches are in place, and everything looks good, look at your EqualLogic group manager and make sure the interfaces say an MTU size of 9000.
        4. Put one host into maintenance mode (always do this before you start making any changes like this).
        5. Change the MTU size of the iSCSI vSwitch and vmkernels from 1500 to 9000. Or better yet, just blow the vSwitch away, and recreate it with the MEM as described at: http://vmpete.com/2012/09/20/multipathing-in-vsphere-with-the-dell-equallogic-multipathing-extension-module-mem/
        6. Validate jumbos by doing a vmkping -d -s 8972 [array ips] from the ESXi host (-d prevents fragmentation, and 8972 leaves room for the IP and ICMP headers within a 9000-byte MTU). Also look at the Eql logs in the group manager to validate.
        7. Pull out of maintenance mode, and move on to the next host. Complete all hosts.

        The only other consideration is if you have VMs that use Guest attached volumes (and connected via the iSCSI initiator courtesy of the HIT/ME), with their own set of vNICs used. These would have to be set for jumbos as well. (you don’t want to mix frame types inside that vSwitch). But in order to do this successfully, those vNICs inside the guest used for guest attached volumes should be using VMXNET3 adapter types (not E1000).
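        One footnote on the vmkping validation in step 6, since the payload size trips people up: a 9000-byte MTU has to carry the 20-byte IP header and the 8-byte ICMP header, so the largest unfragmented ping payload is 8972 bytes. Assuming your build’s vmkping supports the -d (don’t fragment) flag, the stricter test looks like this (the array IP is a placeholder):

        vmkping -d -s 8972 10.0.100.10

        Without -d, an undersized path can silently fragment and the ping will still succeed, which hides the problem.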

  17. BobE says:

    Just to clarify what I meant… any idea if the frame size is negotiated host-by-host (IP by IP) when everything is on the same physical switch? As I understand it, jumbo frames sent to any device not expecting them will be seen as errors or just dropped silently, and either situation will be a pain to deal with. So I’d like to find out whether changing host1's MTU will cause any grief for host2 in the interim period before it is changed as well.

    • vmPete says:

      Hi Bob,

      With respect to jumbos, one is dealing with storage traffic only, to and from the array(s) courtesy of the switch.

      • BobE says:

        Yes, that’s my understanding too, but I’m wondering if the array will happily send jumbos to host1 but negotiate down to 1500-byte packets to host2 when they’re both on the same physical switch?

  18. BobE says:

    I worked with a great Dell EqualLogic implementer and he told me that I indeed could do one host at a time, so that’s what I did. Things went really well.

  19. Justin Lee says:

    Thank you for your post. I am also in the midst of deploying a pair of stacked 6248 switches for iSCSI/MPIO and was wondering if you have run an HA test on the switches, i.e. pull the plug on one of them and see what happens to the stack. AFAIK the stacking ports work fine as a 10GbE uplink, but when stacked they give 12GbE (in ‘Dell marketing speak’ this is 12GbE, full duplex, both ways, i.e. 12 x 2 x 2 = 48GbE, what rubbish?). I am willing to forgo the 2GbE if it means less trouble.

    • vmPete says:

      Yes, a failure of one of the switches in the stack will be just fine, assuming you have your hosts and your array meshed properly to the switch stack. At that point, the responsibility for fault tolerance is off of the host, and off of the array. (You will see uplinks in vSphere go down, and will see some links on the array go down.)

      I wouldn’t be overly concerned with the stacking ports beyond ensuring that your overall configuration is correct. If all is good, you almost forget about them.

  20. Adam says:

    Just wanted to say thanks for this.

    I’ve just setup a new iSCSI SAN with two 6224s and a PowerVault MD3200i. This configuration works perfectly and is dead simple. Just the way I like it.

    • vmPete says:

      Thanks for the great feedback on this Adam. …and thanks for reading.

      • Max says:

        Hi Pete,
        I’m in the process of implementing 2 new PowerConnect 7048's that will have 4 ports in a LAG connecting them together. We have an EQL PS5000 and 6 ESXi hosts. Our current VMware setup has two interconnected Cisco switches with 1 VLAN for iSCSI and another VLAN for vMotion, Mgmt, and VM networking traffic.
        I want to move iSCSI and vMotion off and into the isolated 7048's, and use jumbo frames for both VLANs.
        I am having a hard time finding documentation on configuring these switches for VMware Virtual Switch Tagging, particularly the switch port mode (Access, General, or Trunk) on the 7048's. Would you be able to point me in the right direction or share how your configuration is set up? Any help would be appreciated.
        Thanks

      • vmPete says:

        Hi Max,

        Moving your storage traffic off onto its own fabric will be a good first step for you. Regarding the 7048s, the settings you find in this post should be similar. For iSCSI traffic, do NOT make the ports tagged, so you won’t need to worry about anything there. As you can see from the build of the 6224s, make them untagged access ports. If you wish to include your vMotion network on your storage switch fabric, you would make those ports “general” ports. In vSphere, you’d have the vSwitch that contained the appropriate uplinks, then tag the VMkernels. Does that answer your question?

  21. Max says:

    Thanks for replying so quickly Pete.
    So would I configure the vMotion general ports as tagged or untagged on the 7048 switches?
    And just to be clear: after I configure the iSCSI ports as untagged access ports on the 7048s, the one-to-one iSCSI port bindings in vSphere should be set to VLAN ID (Optional): None (0)?
    Thanks for your help on this.

    • vmPete says:

      Good questions Max. With regard to iSCSI traffic, create a VLAN (say, 100) and assign it to the appropriate ports. (Best practice is to always avoid using the default VLAN on switches – especially for iSCSI.) Set all of the recommended iSCSI switch settings (flow control, etc.) as I describe in my post. If they are access ports, even though they are on VLAN 100, vSphere doesn’t need to know that because it is untagged traffic, so in the iSCSI vmkernels, they would be set to “0 (none)”.
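
      Something like the following is what I mean for the iSCSI side – a rough sketch only, with VLAN 100 and port 1/g5 as example values (double-check the exact syntax against the CLI guide for your firmware):

```
console(config)# vlan database
console(config-vlan)# vlan 100
console(config-vlan)# exit
console(config)# interface ethernet 1/g5
console(config-if-1/g5)# switchport mode access
console(config-if-1/g5)# switchport access vlan 100
console(config-if-1/g5)# exit
```

      Repeat the interface portion for each port that faces a host NIC or an array port.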

      The vMotion ports… you may do either way. But if they are untagged, they’d have to live on another VLAN (e.g. 101) and only be assigned to the specific ports. Then the vMotion vmkernel would also be set to untagged. However, with vMotion I tend to tag, because those ports might live on your LAN-side switch stack, and the uplinks might be shared with, say, vSphere Mgmt or something else. If those ports are set to general mode and tagged, then the vmkernel ports created in vSphere would show a VLAN ID matching the one created.
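
      For the tagged vMotion scenario, the general-mode side would look roughly like this (VLAN 101 and port 1/g10 are just examples; verify against your CLI guide):

```
console(config)# interface ethernet 1/g10
console(config-if-1/g10)# switchport mode general
console(config-if-1/g10)# switchport general allowed vlan add 101 tagged
console(config-if-1/g10)# exit
```

      The matching vMotion vmkernel in vSphere would then carry VLAN ID 101.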

      vmkping can be a nice way not only to test all of this, but also to verify that you can pass jumbos successfully. Just triple check your new storage switch stack. It’s best to get it right the first time; otherwise, things like TCP retransmits can run high.
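
      The jumbo test itself is just a vmkping with the don’t-fragment bit set and a payload sized so the frame hits 9000 bytes. The vmkernel interface and target IP below are examples, and the -I flag is only available on newer ESXi builds:

```
# 8972 = 9000-byte MTU minus 28 bytes of IP and ICMP headers
vmkping -d -s 8972 -I vmk1 10.10.100.20
```

      If that fails but a plain vmkping to the same address succeeds, something along the path isn’t passing jumbos.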

  22. Pingback: Configuring a VM for SNMP monitoring using Cacti | vmPete.com

  23. Pete says:

    Just wanted to thank you for an awesome post.
    We’re re-doing the config of our 6224 switching as the SAN performance just isn’t great. Everything else has been looked at, but according to your post our switch configs are a little wrong.

    The Dell doco suggests using the command “iscsi enable”, which does most of the manual config that you’ve done here, but also uses LLDP to let the switch(es) reconfigure themselves in the event of an array controller failure.
    According to the doco the above only applies to EqualLogic arrays, however, as the LLDP identifies the arrays as EQL units.
    For anyone who’s interested, the PDF is here:

    http://www.dell.com/downloads/global/products/pwcnt/EqualLogic_iSCSI_Optimization_with_Dell_PowerConnect.pdf
    page 9.
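
    For reference, on the 3.x firmware it amounts to just this from config mode (if I’m reading the CLI guide right, “show iscsi” then displays what it set):

```
console(config)# iscsi enable
console(config)# exit
console# show iscsi
```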

    Thanks again for the great post.

    • vmPete says:

      Thank you for reading, Peter. I’m happy that you found the post helpful. As far as your questions, yes, there have been some changes since my post that can affect how they are configured. The ability to automatically detect the endpoints connected to the switch came out shortly after my post. It was unproven at the time, and (justified or not) is still something that I can’t rely on. I know that if I manually configure the switch as desired, it is a predictable, repeatable outcome. When you use any autodiscover/configure, you are subject to known or unknown firmware issues of the switch, advertising problems of the endpoints, etc. One’s SAN fabric is important; it’s not something I like to gamble on by hoping that an autoconfigure worked.

      Now, to get official “support” from Dell, one may need to follow the latest documentation. However, I originally found inconsistencies with what was out there, which led me to write this post in the first place. I’ve had the opportunity to configure dozens of these switches in production environments, and I’ve always gone the route of manually configuring them. Perhaps I will change my practices eventually, but only after the outcomes are repeatable and predictable.

      What about your SAN performance is making you believe something isn’t correct? Is SANHQ sending you high retransmit warnings, etc.?

      • Pete says:

        Thanks for your follow up Pete.

        Sorry, I didn’t want to sound like I was nitpicking, as I didn’t realise that the “iscsi enable” command was introduced after you’d posted the guide.
        I fully agree not to trust auto-configs.

        As far as performance goes, we are just seeing sluggish performance. No errors are being logged in SAN HQ.
        For example, Veeam in SAN mode is only able to pull ~120MB/s to a new Dell 720xd repository with local disk as target. We have a shelf each of 12x600GB 15K and 12x2TB 7.2K as a group.
        I’d be hoping to get higher numbers than that in full backup mode. But perhaps I’m expecting more than our humble setup can deliver.

        Our config was set up at install time by a Dell engineer. Upon looking at the configs, we’re missing flow control and “no storm-control unicast” compared with your recommendations.
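
        If I’m reading the 62xx CLI guide right, those two missing pieces would be roughly the following (the port is an example, and flow control is a global setting):

```
console(config)# flowcontrol
console(config)# interface ethernet 1/g1
console(config-if-1/g1)# no storm-control unicast
console(config-if-1/g1)# exit
```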

        I plan to back up our config and apply a customised version of your config to see if we can get the perf as good as it can be.

        Will post results after we’ve updated.

      • vmPete says:

        No, those were very good questions! They will undoubtedly help someone else out in the future. As for your followup questions:

        What models of PS arrays are you working with? What RAID level? And those two arrays – they are in the same group, but are they in the same storage pool? If you have two arrays with 12 bays each, then I’m figuring they are the PS4xxx series. A lot of factors come into the performance characteristics, but specific to the PS41xx series, it is not unheard of for high-RPM drives to be bound by the limited NIC ports on the back. Assuming a theoretical limit of about 125MBps per connection, and the active/passive arrangement of the controllers, even with really good multipathing, along with the other traffic and activity on the arrays, that will dictate quite a bit of the final numbers you see.
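
        The 125MBps figure is nothing more than line-rate arithmetic, before protocol overhead even takes its cut – a quick sanity check:

```python
# Rough ceiling for a single GigE iSCSI connection: line rate divided
# by 8 bits/byte, before TCP/IP/iSCSI header overhead shaves off a few
# percent more.
line_rate_bps = 1_000_000_000            # 1 Gb/s
ceiling_MBps = line_rate_bps / 8 / 1_000_000
print(ceiling_MBps)                      # 125.0 MB/s per connection, best case

# A PS41xx controller has two active ports, so the theoretical aggregate
# is ~250 MB/s - which makes ~120 MB/s on one Veeam stream, alongside
# other traffic on the array, unsurprising.
```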

        The engineer was probably well intentioned, but for many years there was quite a bit of confusion over the true, supported recommendations. It’s why my post was so popular. 

        By the way, are you connecting a Veeam physical proxy to the SAN fabric for faster throughput? Just a word of warning on that. The EQLs don’t have a “read-only” connection type, so if for some reason the special registry settings on the physical proxy don’t stay persistent and Windows ends up resignaturing the VMFS volumes, it will be a very long day for you. I’ve put in several “feature requests” for them to add a “read-only” iSCSI connection type, as other manufacturers have. I’ve been told that it is coming, but am not sure when. That is one of the reasons why I don’t run a physical Veeam proxy via direct-connect.

        Yes, please post the results.

      • Pete says:

        You’re right on the money, PS4100X (15K RPM) and PS4100E (7.2K RPM).
        Fast shelf is at RAID 50, slow shelf RAID 6.
        Arrays in the same pool.

        These units only have 2x GigE connections per controller, 2 controllers in active/passive setup.

        Veeam proxy has 4x GigE NICs dedicated to iSCSI. Yes, I’m aware of the Windows server resignaturing VMFS and have my resume ready (!).
        That box has the HIT Kit installed and seems to multipath OK, I see bulk traffic on all 4 nics at various stages of a backup, just not on all at once.

        Thanks for your follow up, I really appreciate it.

      • vmPete says:

        Yeah, that is a pretty realistic traffic pattern for efficient multipathing. It’s not going to be a true aggregate, but rather an opportunity for more sessions with payload. Now, if you are seeing mediocre performance results from the hosts, then of course that could be a whole list of possibilities. You are starting out right though. Get those switches correct, and you’ll be happy that you did. It’s worth the effort.

        On a side note, Veeam has quite a few dials to tweak with regard to the repository and proxy settings – things such as the maximum number of sessions, or parallel processing. The defaults there are pretty good, and prevent it from tripping over itself, but it’s worth looking into if you feel like adjusting.

  24. Pete says:

    All firmware is right up to date on the EQLs too (7.0.1)

  25. Pete says:

    Hi there again Pete,

    Just wanted to say thanks again for your blog (and this post especially).
    After the changes we made our backups have jumped from ~150-180MB/s to well over 250MB/s.

    I managed to screw up the management config/VLAN so I can only get to the stack by serial right now, but I’m still happy :)

    Cheers

    Pete

  26. Florin B. says:

    Hi Pete,

    Thanks for this insightful article. I am 1 day old with these switches and it seems they are pretty close to Cisco switches. Here is my current config:

    configure
    hostname "roma-sansw01"
    clock timezone 2 minutes 0 zone “EET”
    stack
    member 1 1
    member 2 1
    exit
    ip address 172.17.35.40 255.255.255.0
    ip default-gateway 172.17.35.1
    no ip domain-lookup

    roma-sansw01#show vlan

    VLAN  Name     Ports          Type     Authorization
    ----  -------  -------------  -------  -------------
    1     Default  ch1-48,        Default  Required
                   1/g1-1/g24,
                   1/xg3-1/xg4,
                   2/g1-2/g24,
                   2/xg3-2/xg4

    I have two questions:
    – is it safe to assume that the existing IP configuration is assigned to VLAN 1?
    – although I know VLAN 1 is not recommended (and you also mentioned it several times), yesterday my SAN colleagues plugged a cable from one free port into one of our core switches.

    Here is the config:

    interface ethernet 1/g4
    storm-control broadcast
    storm-control multicast
    spanning-tree disable
    spanning-tree portfast
    mtu 9216
    exit

    Name: Gi1/7
    Switchport: Enabled
    Administrative Mode: dynamic auto
    Operational Mode: down
    Administrative Trunking Encapsulation: negotiate
    Negotiation of Trunking: On
    Access Mode VLAN: 1 (default)
    Trunking Native Mode VLAN: 1 (default)
    Administrative Native VLAN tagging: enabled
    Voice VLAN: none
    Administrative private-vlan host-association: none
    Administrative private-vlan mapping: none
    Administrative private-vlan trunk native VLAN: none
    Administrative private-vlan trunk Native VLAN tagging: enabled
    Administrative private-vlan trunk encapsulation: dot1q
    Administrative private-vlan trunk normal VLANs: none
    Administrative private-vlan trunk associations: none
    Administrative private-vlan trunk mappings: none
    Operational private-vlan: none
    Trunking VLANs Enabled: ALL
    Pruning VLANs Enabled: 2-1001
    Capture Mode Disabled
    Capture VLANs Allowed: ALL

    Unknown unicast blocked: disabled
    Unknown multicast blocked: disabled
    Appliance trust: none

    This connection led to a storage failure; I wasn’t aware of it at the time, so I couldn’t collect any logs, but I suspect some STP issues. Here is a summary of our Cisco switch STP config:

    Roma-cr01#show spanning-tree summary
    Switch is in pvst mode
    Root bridge for: VLAN0001, VLAN0008, VLAN0010
    Extended system ID is enabled
    Portfast Default is disabled
    PortFast BPDU Guard Default is disabled
    Portfast BPDU Filter Default is disabled
    Loopguard Default is disabled
    EtherChannel misconfig guard is enabled
    UplinkFast is disabled
    BackboneFast is disabled
    Configured Pathcost method used is short
