Accelerating storage using PernixData’s FVP. A perspective from customer #0001

Recently, I described in "Hunting down unnecessary I/O before you buy that next storage solution" my efforts to address the "technical debt" that was contributing to unnecessary I/O. The goal was to get better performance out of my storage infrastructure. It’s been a worthwhile endeavor that I would recommend to anyone, but at the end of the day, one might still need faster storage. That usually means freeing up another 3U of rack space and opening the checkbook.

Or does it?  Do I have to go the traditional route of adding more spindles, or investing heavily in a faster storage fabric?  Not too long ago the answer was an unequivocal "yes," but times are changing, and here is how I chose to tackle the problem in a radically different way.

I’ve chosen to delay any purchase of an additional storage array, or the infrastructure backing it, and opted to go with PernixData FVP.  In fact, I was customer #0001 after PernixData announced GA of FVP 1.0.  So why did I go this route?

1.  Clustered, host-based caching.  Leveraging server-side flash brings compute and data closer together, and FVP does so in a highly available, clustered fashion that aligns perfectly with the feature set of the hypervisor.

2.  Write-back caching. The ability to deliver writes to flash is really important. Write-through caching, which waits for the acknowledgement from the underlying storage, just wasn’t good enough for my environment. Rotational latencies, as well as physical transport latencies, would still be there on over 80% of my traffic. I needed true write-back caching that would acknowledge the write immediately, while eventually de-staging it down to the underlying storage. (A minimal sketch of the two policies follows this list.)

3.  Cost. The gold-plated dominoes of a storage upgrade are not fun for anyone on the paying side of the equation. Going with PernixData FVP would address my needs for a fraction of the cost of a traditional solution.

4.  It allows for a significant decoupling of "storage for capacity" from "storage for performance" when addressing additional storage needs.

5.  Another array would have been, to a certain degree, more of the same: incremental improvement, with less-than-enthusiastic results considering the amount invested.  I found myself not very excited to purchase another array. With so much volatility in the storage market, it almost seemed like an antiquated solution.

6.  Quick to implement. FVP installation consists of installing a VIB via Update Manager or the command line and installing the management services and vCenter plugin, and you are off to the races.

7.  Hardware independent.  I didn’t have to wait for a special controller upgrade or firmware update, or wonder if my hardware would work with it (a common problem with storage array solutions). Nor did I have to consider switching to a different storage vendor if I wanted to try a new technology.  It is purely a software solution, with the flexibility of working with multiple types of flash: SSDs or PCIe-based cards.
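To make the distinction in item 2 concrete, here is a minimal sketch of the two write policies. The latency figures are illustrative assumptions, and the code is in no way FVP’s implementation; it only shows why a write-back acknowledgement returns at flash speed while a write-through acknowledgement is held hostage by the array.

```python
import queue
import threading
import time

# Assumed, illustrative latencies -- not measurements from any real SSD or array.
FLASH_WRITE_LATENCY = 0.0001   # ~0.1 ms to local flash
ARRAY_WRITE_LATENCY = 0.005    # ~5 ms to the backing array

destage_queue = queue.Queue()

def write_through(block):
    """Acknowledge only after both flash and the array have the write."""
    time.sleep(FLASH_WRITE_LATENCY)    # copy to flash for future reads
    time.sleep(ARRAY_WRITE_LATENCY)    # wait for the array to acknowledge
    return "ack"                       # caller pays array latency on every write

def write_back(block):
    """Acknowledge as soon as flash has the write; de-stage asynchronously."""
    time.sleep(FLASH_WRITE_LATENCY)    # copy to flash
    destage_queue.put(block)           # flush to the array in the background
    return "ack"                       # caller pays only flash latency

def destager():
    """Background worker draining dirty blocks down to the underlying storage."""
    while True:
        destage_queue.get()
        time.sleep(ARRAY_WRITE_LATENCY)
        destage_queue.task_done()

threading.Thread(target=destager, daemon=True).start()

write_back("block-1")      # returns after roughly 0.1 ms
write_through("block-2")   # returns after roughly 5.1 ms
```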

A different way to solve a classic problem
While my write-intensive workload is pretty unique, my situation is not.  Our storage performance needs outgrew what the environment was designed for: capacity at a reasonable cost. This is an all-too-common problem, and the increased capacity of spinning disks has actually made it worse, not better.  Fewer and fewer spindles are serving up more and more data.

My goal was to deliver the results our build VMs were capable of delivering with faster storage, but couldn’t because of my existing infrastructure.  For me it was about reducing I/O contention so the build system’s CPU cycles could deliver the builds without waiting on storage.  For others it might be delivering lower latencies to their SQL-backed ERP or CRM servers.

Utilizing flash has long been an intriguing prospect.  I often found myself looking at my vSphere hosts and all of their processing goodness, disappointed that the SSDs sitting in those hosts couldn’t help augment my storage performance needs.  Being an active participant in the PernixData beta program allowed me to see how it would help my environment, and whether it would deliver on the needs of the business.

Lessons learned so far
Don’t skimp on quality SSDs.  Would you buy an ESXi host with one physical core?  Of course you wouldn’t. The same goes for SSDs: quality flash is a must! I can tell you from first-hand experience that it makes a huge difference.  I thought the Dell OEM SSDs that came with my M620 blades were fine, but by way of comparison, they were terrible. Don’t cripple a solution by going with cheap flash.  In this 4-node cluster, I went with four eMLC-based, 400GB Intel S3700s. I also had the opportunity to test some Micron P400M eMLC SSDs, which also seemed to perform very well.

While I went with 400GB SSDs in each host (giving approximately 1.5TB of cache space for the 4-node cluster), I did most of my testing using 100GB SSDs. They seemed adequate in that they were not showing a significant amount of cache eviction, but I wanted to leverage my purchasing opportunity to get larger drives. Knowing the best size can be a bit of a mystery until you get things in place, but a larger cache allows for a larger working set of data available for future reads, as well as giving headroom for the per-VM write-back redundancy setting.
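For a back-of-the-envelope feel for that sizing, here is a hypothetical helper (not an FVP calculation). It simply multiplies hosts by flash per host and sets aside an assumed fraction as headroom for the peer copies that write-back redundancy creates; in reality only un-destaged writes carry replicas, so treat the headroom figure purely as planning slack.

```python
def cluster_cache_estimate_gb(hosts, ssd_gb_per_host, redundancy_headroom=0.25):
    """Rough cluster cache sizing (hypothetical helper, not an FVP tool).

    `redundancy_headroom` is an assumed fraction of flash reserved for the
    peer copies created by per-VM write-back redundancy plus general slack;
    only un-destaged writes actually carry replicas, so this is planning
    guidance rather than an exact overhead.
    """
    raw_gb = hosts * ssd_gb_per_host
    usable_gb = raw_gb * (1 - redundancy_headroom)
    return raw_gb, usable_gb

# The cluster described above: four hosts, one 400GB SSD each.
raw, usable = cluster_cache_estimate_gb(hosts=4, ssd_gb_per_host=400)
print(f"raw flash: {raw} GB, planning estimate after headroom: {usable:.0f} GB")
# raw flash: 1600 GB, planning estimate after headroom: 1200 GB
```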

An unexpected benefit is how FVP has given me visibility into the one area of I/O monitoring that is traditionally very difficult to see: I/O patterns (see "Iometer. As good as you want to make it.").  Understanding this element of your I/O needs is critical, and the analytics in FVP have helped me discover some very interesting things about my I/O patterns that I will surely be investigating in the near future.
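As a trivial illustration of what "understanding your I/O pattern" means in practice, the sketch below summarizes a made-up trace into the two numbers I care most about: write ratio and block-size mix. The trace records are hypothetical; FVP’s analytics, or any other monitoring tool, would be the real source of this data.

```python
from collections import Counter

# Hypothetical per-I/O records: (operation, size in KB). Real data would come
# from your monitoring or analytics tooling; these values are made up.
trace = [("R", 4), ("W", 64), ("W", 64), ("R", 8), ("W", 256), ("W", 64)]

writes = sum(1 for op, _ in trace if op == "W")
print(f"write ratio: {writes / len(trace):.0%}")               # e.g. 67%
print("block-size mix (KB):", Counter(size for _, size in trace))
```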

In the read-caching world, the saying goes that the fastest storage I/O is the I/O the array never sees. With write caching, however, the data eventually needs to be de-staged to the array.  While FVP will improve delivery of writes to the array by absorbing I/O spikes and turning random writes into sequential writes, the I/O still eventually has to be delivered to the backend storage. In a more write-intensive environment, if the delta between your fast flash and your slow storage is significant, and the duty cycle of the applications driving the I/O is also significant, there is a chance the de-staging might not be able to keep up.  It might be a corner case, but it is possible.
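A toy model of that corner case follows. The rates are made-up assumptions, not measurements or a prediction of FVP’s behavior; the point is simply that when sustained write ingest exceeds what the array can drain, the dirty backlog on flash grows for as long as the burst lasts.

```python
def destage_backlog_mb(ingest_mbps, drain_mbps, burst_seconds):
    """Toy model of the write-back corner case: how much dirty data piles up
    on flash when the workload writes faster than the backend array can
    absorb. All rates are illustrative assumptions."""
    backlog = 0.0
    for _ in range(burst_seconds):
        backlog += ingest_mbps                     # new writes land on flash
        backlog = max(0.0, backlog - drain_mbps)   # de-stager drains to the array
    return backlog

# Example: flash absorbing 400 MB/s of writes, array sustaining 150 MB/s.
print(destage_backlog_mb(ingest_mbps=400, drain_mbps=150, burst_seconds=600))
# -> 150000.0 MB (~146 GB) of un-destaged data after a 10-minute burst
```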

What’s next
I’ll be posting more specifics on how running PernixData FVP has helped our environment.  So, is it really "disruptive" technology?  Time will ultimately tell.  But I chose not to purchase an array, along with new SAN switchgear, because of it.  Using FVP has led to less traffic on my arrays, with higher throughput and lower read and write latencies for my VMs.  Yeah, I would qualify that as disruptive.


Helpful Links

Frank Denneman – Basic elements of the flash virtualization platform – Part 1
http://frankdenneman.nl/2013/06/18/basic-elements-of-the-flash-virtualization-platform-part-1/

Frank Denneman – Basic elements of the flash virtualization platform – Part 2
http://frankdenneman.nl/2013/07/02/basic-elements-of-fvp-part-2-using-own-platform-versus-in-place-file-system/

Frank Denneman – FVP Remote Flash Access
http://frankdenneman.nl/2013/08/07/fvp-remote-flash-access/

Frank Denneman – Design considerations for the host local FVP architecture
http://frankdenneman.nl/2013/08/16/design-considerations-for-the-host-local-architecture/

Satyam Vaghani introducing PernixData FVP at Storage Field Day 3
http://www.pernixdata.com/SFD3/

Write-back deepdive by Frank and Satyam
http://www.pernixdata.com/files/wb-deepdive.html

5 thoughts on “Accelerating storage using PernixData’s FVP. A perspective from customer #0001”

  1. Hey Pete,

    For your workloads do you think write-around caching would work better? Just curious, as your scenario for writes is not uncommon, but as you pointed out it is somewhat of an outlier. Looking forward to seeing performance data down the road.

    1. Thanks for reading, Kurt. Testing the effect of a write-around cache could essentially be done by putting the VMs (some or all) in "write-through" mode (write-through and write-back are the only two options in FVP). While different from write-around, you are re-establishing that synchronous connection to the backing storage. The secondary benefit of having any read caching is that, in principle, it will free up the backing storage to write when it needs to. What I’m finding is a need to really understand the "dialog" of your workload, especially on any tiered storage infrastructure (inside or outside of an array), along with the resulting amount of "hot reads" versus "cold reads." The benefit of write-back or write-through versus write-around is that the data is added to the cache for potential use as a read sometime thereafter, so I would suspect write-around would offer no benefit.

      Yes, I’m looking forward to providing real performance data as well.

      1. Amen. This visibility is becoming a lot more important. The "cache is King" standpoint of most storage sales encourages a lack of visibility in a lot of ways. I was in a discussion with Brocade the other day about their new Fabric Vision capabilities coming out and was surprised to hear them say it was driven more by information needs than by performance needs. Essentially a need to look at flows. This ties into your recent post about looking at your data patterns and cleaning up. I put a lot of effort into correlating array, fabric, hypervisor, OS, and application performance data to determine Tier 0 needs. In the end, the methodology and mindset need to change. Kudos to you for guiding folks.
