vSAN in cost effective independent environments

Old habits in data center design can be hard to break.  New technologies are introduced that process data faster, and move data more quickly.  Yet all too often, the thought process for data center design remains the same, and environments are inevitably constructed and managed in ways that reflect conventional wisdom and familiar practices.  Unfortunately, these common practices often stem from the constraints of preceding technologies, rather than from aligning current business objectives with new technologies and capabilities.

Historically, no component of an infrastructure dictated design and operation more than storage.  The architecture of traditional shared storage often meant that the storage infrastructure was the oddball of the modern data center.  Given enough capacity, performance, and physical ports on a fabric, a monolithic array could serve up several vSphere clusters, and therein lies the problem.  The storage was not seen or treated as a clustered resource by the hypervisor like compute.  This centralized way of storing data invited connectivity by as many hosts as possible in order to justify the associated costs. Unfortunately it also invited several problems.  It placed limits on data center design because in part, it was far too impractical to purchase separate shared storage for every use case that would benefit from an independent environment isolated from the rest of the data center.  As my colleague John Nicholson (blog/twitter) has often said, "you can’t cut your array in half."  It’s a humorous, but cogent way to describe this highly common problem.

While VMware vSAN has proven to be extremely well suited for converging all applications into the same environment, business requirements may dictate a need for self contained, independent environments isolated in some manner from the rest of the data center.  In "Cost Effective Independent Environments using vSAN" found on VMware’s StorageHub, I walk through four examples that show how business requirements may warrant a cluster of compute and storage dedicated for a specific purpose, and why vSAN is an ideal solution.  The examples provided are:

  • Independent cluster management
  • Development/Test environments
  • Application driven requirements
  • Multi-purpose Disaster Recovery

Each example listed above details how traditional storage can fall short in delivering results efficiently, then compares how vSAN addresses and solves those specific design and operational challenges. Furthermore, learn how storage related controls are moved into the hypervisor using Storage Policy Based Management (SPBM), VMware’s framework that delivers storage performance and protection policies to VMs, and even individual VMDKs, all within vCenter.  SPBM is the common management framework used in vSAN and Virtual Volumes (VVols), and is clearly becoming the way to manage software defined storage.  Each example wraps up with a number of practical design tips for that specific scenario in order to get you started in building a better data center using vSAN.

Clustering is an incredibly powerful concept, and vSphere clusters in particular bring capabilities to your virtualized environment that are simply beyond comparison.  With VMware vSAN, the power of clustering resources is taken to the next level, forming the next logical step in the journey of modernizing your environment in preparation for a fully software defined data center.

This published use case is the first of many to come, each focused on practical scenarios that reflect the common needs of organizations large and small, and how vSAN can help deliver results quickly and effectively.  Stay tuned!

– Pete

How CPU related metrics in vSphere may be misinterpreted

Most Data Center Administrators are accustomed to looking for high CPU utilization rates on VMs, and the hosts in which they reside. This shouldn’t be a big surprise. After all, vCenter, and other monitoring tools have default alarms to alert against high CPU usage statistics. Features like DRS, or products that claim DRS-like functionality factor in CPU related metrics as a part of their ability to redistribute VMs under periods of contention. All of these alerts and activities suggest that high CPU values are bad, and low values are good. But what if conventional wisdom on the consumption of CPU resources is wrong?

Why should you care
Infrastructure metrics can certainly be a good leading indicator of a problem. Over the years, high CPU usage alarms have helped correctly identify many rogue processes on VMs ("Hey, who enabled the screen saver via GPO?…"). But a CPU alarm trigger assumes that high CPU usage is always bad. It also implies that the absence of an alarm condition means that there is not an issue. Both assumptions can be incorrect, which may lead to bad decision making in the Data Center.

The subtleties of performance metrics can reveal problems somewhere else in the stack – if you know how and where to look. Unfortunately, when metrics are looked at in isolation, the problems remain hidden in plain sight. This post will demonstrate how a few common metrics related to CPU utilization can be misinterpreted. Take a look at the post Observations with the Active Memory metric in vSphere to see how this can happen with other metrics as well.

The testing
There are a number of CPU related metrics to monitor in the hypervisor, and at least a couple of different ways to look at them (vCenter, and esxtop). For brevity, let’s focus on two metrics that are readily visible in vCenter: CPU Usage and CPU Ready. This doesn’t dismiss the importance of other CPU related metrics, or the various ways to gather them, but it is a good start to understanding the relationship between metrics. As a quick refresher, CPU Usage as it relates to vCenter has two definitions. From the host, the usage is the percentage of CPU cycles in use against the total CPU cycles available on the host. On the VM, usage shows the percent of CPU resources in use against the total available CPU cycles of the vCPUs visible to the VM. CPU Ready in vCenter measures, in summation form, the amount of time that the virtual machine was ready, but could not get scheduled to run on the CPU.
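Because CPU Ready is reported as a summation (milliseconds accumulated over the chart's sampling interval), it is often easier to reason about as a percentage. Below is a minimal sketch of the commonly used conversion, assuming the 20 second real-time interval in vCenter. The function name and the sample values are hypothetical and are not taken from the testing in this post.

```python
def cpu_ready_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU Ready summation value (ms) into a percentage.

    ready_ms   -- summation value shown in the chart, in milliseconds
    interval_s -- length of the sampling interval in seconds
                  (assumed to be 20 s, the real-time chart interval)
    vcpus      -- divide by the vCPU count to get a per-vCPU figure
                  when the value is the VM-level aggregate
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(cpu_ready_percent(1000))           # 1000 ms of ready time in 20 s
    print(cpu_ready_percent(1000, vcpus=4))  # same value spread over 4 vCPUs
```

Charts that roll data up over longer intervals need the matching interval length passed in, otherwise the percentage will be overstated.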

A few notes about the test conditions and results:

  • The tests here consist of activities that are scheduled inside each guest, and are repeated 5 times over a 1 hour period.
  • There are no synthetic tools used here to generate storage I/O load or consume CPU cycles. (iometer, StressLinux, etc.)
  • The activities performed are using processes that are only partially multithreaded. This approach is most reflective of real world environments.
  • The "slower" storage depicted in the testing was actually SSD based, while the "faster" storage was achieved by leveraging PernixData FVP and distributed fault tolerant memory (DFTM) as a storage acceleration tier.
  • The absolute numbers are not necessarily important for this testing. The focus is more about comparing values when a variable like storage performance changes.
  • No shares, reservations, or limits were used on the test VMs.

The complex demands of real world environments may exhibit a much greater impact than what the testing below reveals. I reference a few actual cases of production workloads later on in the post. Synthetic load generators were not used here because they cannot properly simulate a pattern of activity that is reflective of a real environment. Synthetic load generators are good at stressing resources – not simulating real world workloads, or the time it takes for those workloads to complete their tasks.

Interpreting impacts on CPU usage and CPU Ready with changing storage performance
Looking at CPU utilization can be challenging because not all applications, nor the workloads they generate are the same. Most applications are a complex mix of some processes being multithreaded, while others are not. Some processes initiate storage I/O, while others do not. It is for this reason that we will look at CPU Usage and CPU Ready over a task that is repeated on the same sets of VMs, but using storage that performs differently.

For all practical purposes, CPU Ready doesn’t become meaningful until a host is running a large number of single vCPU VMs concurrently, or a number of multiple vCPU VMs concurrently. CPU Ready can sometimes be terribly tricky to decipher because it can be influenced in so many ways. Sometimes it may align with CPU utilization, while other times it may not. It may be affected by other resources, or it may not. It really depends on the environmental conditions. I find it a good supporting metric, but definitely not one that should stand on its own merit, without proper context of other metrics. We are measuring it here because it is generally regarded as important, and one that may contribute to load distribution activities.

Test 1: Single vCPU VM on a Host with no other activity
First let’s look at one of the very simplest of comparisons: a single vCPU VM with no other activity occurring on the host, where one test is using slower storage (blue), and the other test is using faster storage (orange). A task was completed 5 times over the course of one hour. The image below shows that from the host perspective, peak CPU utilization increased by 79% when using the faster storage. CPU Ready demonstrated very little change, which was as expected due to the nature of this test (no other VMs running on the host).


When we look at the individual VMs, the results are similar. The images below show that CPU usage maximums for the VM increased by 24% when using the faster storage. CPU Ready demonstrated very little change here because there were no other VMs to contend with on that host. The "Storage Latency" column shows the average storage latency the VM was seeing during this time period.


You might think that higher latency may not be realistic of today’s storage technologies. The "slower" storage in this case did in fact come from SSD based storage. But remember that Flash of any kind can suffer in performance when committing larger block I/O which is quite common with real workloads. Take a look at "Understanding block sizes in a virtualized environment" for more information.

But wait… how long did the task, set to run 5 times over the period of one hour, take? Well, the task took just half the time to run with the faster storage. The same number of cycles processed the same number of I/Os, just over a shorter period of time. This faster completion of a task frees up those CPU cycles for other VMs. It is also the primary reason why the averages for CPU Usage and CPU Ready changed very little. Looking at this data in timeline form in vCenter illustrates it quite clearly. The characteristics of the task are clearly distinguishable on the fast storage, and much more difficult to decipher on the run with slower storage.


Test 2: Multiple vCPU VM on a host with other activity
Now let’s let the same workload run on VMs assigned multiple (4) vCPUs, along with other multi-vCPU VMs running in the background. This is to simulate a bit of "chatter" or activity that one might experience in a production environment.

As we can see from the images below, on the host level, both CPU usage and CPU Ready values increased as storage performance increased. CPU usage maximums increased by 39% on the host. CPU Ready maximums increased by 34% on the host, which was a noticeable difference from the testing without any other systems running.


When we look at the individual VMs, the results are similar. The images below show that CPU usage maximums increased by 39% with the faster storage. CPU Ready maximums increased by 51% while running on the faster storage. Considering the typical VM to host consolidation ratio, the effects can be profound.


Now let’s take a look at the timeline in vCenter to get an appreciation of how those CPU cycles were used. On the image below, you can see that like the single vCPU VM testing, the VM running on faster storage allowed for much higher CPU usage than when running on slower storage, but that it was for a much shorter period of time (about half). You will notice that in this test, the CPU Ready measurements generally increased as the CPU usage increased.


Real world examples
This all brings me back to what I witnessed years ago while administering a vSphere environment consisting of extremely CPU and storage I/O intensive workloads: dozens of resource intensive VMs built for the purpose of compiling code. These were systems that could multithread to near perfection, assuming storage performance was sufficient.


Now let’s look at what CPU utilization rates looked like on that same VM, running the same code compiling job where the storage environment wasn’t able to satisfy reads and writes fast enough. The same job took 46% longer to complete, all because the available CPU cycles couldn’t be used.


Still not a believer? Take a look at a presentation at the OpenStack summit by Charter Communications in April 2016, where they demonstrate exactly the effect I describe. They show their Cassandra cluster deployed with VMware Integrated OpenStack, and the effect on CPU utilization of providing lower latency, higher performing storage (key information beginning at 17:10). Their more freely breathing storage allowed CPU cycles related to storage I/O to be committed more quickly, thereby finishing the tasks much more quickly. High CPU usage was a desired result of theirs.

You might be thinking to yourself, "Won’t I have more CPU contention with faster storage?"   Well, yes and no. Faster storage will give power back to the Administrator to control the usage of resources as needed, and deliver the SLAs required. And moving the point of contention to the CPU allows for what it does best; time slicing processes to complete the tasks as quickly as possible.

Sample what?
The rate at which telemetry data is sampled is a factor that can dramatically change your impression of the behavior of these resources used in the Data Center. It’s a big topic, and one that will be touched on in an upcoming post, but there is one thing to note here. When leveraging faster, lower latency storage, there are many times where CPU utilization and CPU Ready will stay the same. Why? In a real workload that involves CPU cycles executing to commit storage I/O, a workflow may consist of a given amount of those I/Os, regardless of how long it takes. If that process took 18 seconds on slow storage, but 5 seconds on faster storage, the 20 second sampling interval within vCenter may render it in the same way. One often has to employ other tools to see these figures at a higher sampling rate. Tools such as vscsiStats and esxtop are good examples of this.
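To make the sampling effect concrete, here is a minimal sketch. It assumes a hypothetical task that needs a fixed amount of CPU work (expressed in CPU-seconds) and spreads that work evenly over however long the task takes. The numbers are illustrative only and are not taken from the tests above.

```python
def per_second_usage(cpu_seconds, duration_s, window_s=20):
    """Per-second CPU usage (%) for one sampling window.

    The task consumes cpu_seconds of CPU time, spread evenly over
    duration_s whole seconds, and the vCPU is idle for the rest of
    the window.
    """
    if not 0 < duration_s <= window_s:
        raise ValueError("duration_s must fit inside the window")
    busy = 100.0 * cpu_seconds / duration_s
    return [busy] * duration_s + [0.0] * (window_s - duration_s)


def summarize(samples):
    """Return (peak, average) for a list of per-second usage values."""
    return max(samples), sum(samples) / len(samples)


if __name__ == "__main__":
    work = 4.5  # hypothetical CPU-seconds needed by the task
    for label, duration in (("slow storage", 18), ("fast storage", 5)):
        peak, avg = summarize(per_second_usage(work, duration))
        print(f"{label}: peak {peak:.1f}%  20 s average {avg:.1f}%")
```

With these made-up inputs, both runs average 22.5% over the 20 second window, while the one-second peaks are 25% and 90%. A chart built only from the 20 second average cannot tell the two runs apart.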

The testing, and examples above should make it easy to imagine a scenario in which a storage system is upgraded, and CPU related alarms are tripped more frequently, even though the processes that support a workflow have completed much more quickly. So with that, it’s good to keep the following in mind.

  • Slow storage will suppress CPU utilization rates – giving you the impression that from a host, or VM perspective, everything is fine.
  • Conversely, fast storage will allow those CPU cycles related to storage I/O to execute, thereby increasing utilization rates – albeit for a shorter period of time.  High CPU statistics are not necessarily a bad thing.
  • Averages and peaks can be misleading because increased utilization rates may not be recognizable in the vCenter CPU charts if the work completes within the smallest sampling interval (20 seconds).
  • Traditional methods of monitoring and balancing host resources can be misleading.
  • Higher CPU utilization rates may not be a leading indicator of an issue. They are often a trailing indicator of well-designed processes, or free breathing storage. Again, high CPU can be a good thing!!!
  • Application behavior, and the results are what counts. If a batch job in SQL takes 30 minutes, defining success should be around the desired time of that batch job. Infrastructure related metrics should help you diagnose issues and assist with achieving a desired result, but not be the one and only KPI.
  • Storage performance will generally impact every VM and host accessing the cluster, whereas host based resource contention will only impact other VMs living on that same host.

Thanks for reading

– Pete

What does your infrastructure analytics really tell you?

There is no mistaking the value of data visualization combined with analytics.  Data visualization can help make sense of the abstract, or of information not easily conveyed by numbers.  Data analytics excels at turning discrete data points that make no sense on their own into findings that have context and relevance.  The two together can present findings in a meaningful, insightful, and easy to understand way.  But what are your analytics really telling you?

The problem for modern IT is that there can be an overabundance of data, with little regard to the quality of data gathered, how it relates to each other, and how to make it meaningful.  All too often, this "more is better" approach obfuscates the important to such a degree that it provides less value, not more.  It’s easy to collect data.  The difficulty is to do something meaningful with the right data.  Many tools collect metrics not in order of importance, but according to what can be easily provided.

Various solutions with the same problem
Modern storage solutions have increased the sophistication of their analytics offerings for storage.  In principle this can be a good thing, as storage capacity and performance are such common problems in today’s environments.  Storage vendors have joined the "we do that too" race of analytics features.  However, feature list checkboxes can easily mask the reality – that the quality of insight is not what you might think it is.  Creative license gets a little, well, creative.

Some storage solutions showcase their storage I/O analytics as a complete solution for understanding storage usage and performance of an environment.  They advertise an extraordinary number of data points collected, and sophisticated methods for collecting that data that are impressive by anyone’s standards.  But these metrics are often taken at face value.  Tough questions need to be asked before important decisions are made off of them.  Is the right data being measured?  Is the data being measured from the right location?  Is the data being measured in the right way?  And is the information conveyed of real value?

Accurate analytics requires that the sources of data are of the right quality and completeness.  No amount of shiny presentation can override the result of using the wrong data, or using it in the wrong way.

What is the right data?
The right data has a supporting influence on the questions that you are trying to answer.  Why did my application slow down after 1:18pm? How did a recent application modification impact other workloads?  In Infrastructure performance, I’ve demonstrated how block sizes have historically been ignored when it came to storage design, because they could not be easily seen or measured.  Having metrics around fan speed of a storage array might be helpful for evaluating the cooling system in your Data Center, but does little to help you understand your workloads.  The right data must also be collected at a rate that accurately reflects the real behavior.  If your analytics offerings sample data once every 5 or 10 minutes, how can they ever show spikes of contention in resources that impact what your systems experience?  The short answer is, they can’t.

The importance of location
Measuring the data at the right location is critical to accurately interpreting the conditions of your VMs, and the infrastructure in which they live.  We perceive much more than we see.  This is demonstrated most often with a playful optical illusion, but can be a serious problem with understanding your environment.  The data gathered is often incomplete, and assuming it is all the data you need can lead to the wrong conclusion.  Let’s consider a common scenario where the analytics of a storage system shows great performance of a storage array, yet the VM may be performing poorly.  This is the result of measuring from the wrong location.  The array may have shown the latency of the components inside the device, but cannot account for latency introduced throughout the storage stack.  The array metric might have been technically accurate for what it was seeing, but it was not providing you the correct and complete metric.  Since storage I/O always originates on the VMs and the infrastructure in which they live, it simply does not make sense to measure it from a supporting component like a storage array.
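As a rough illustration of why the measurement point matters, the sketch below sums hypothetical per-layer latencies for a single I/O. The layer names and values are assumptions made up for the example and are not measurements. The point is only that an array can see its own portion of the latency and nothing above it.

```python
# Hypothetical latency contributions (ms) for one I/O, ordered from the
# VM down to the array. None of these values are measurements.
STACK_MS = [
    ("guest and virtual SCSI queue", 1.5),
    ("hypervisor kernel queue", 2.0),
    ("host HBA and fabric", 1.0),
    ("array controller and media", 0.5),
]


def latency_seen_at(layer_name, stack=STACK_MS):
    """Latency visible from a layer: that layer plus everything below it."""
    names = [name for name, _ in stack]
    start = names.index(layer_name)  # raises ValueError for unknown layers
    return sum(ms for _, ms in stack[start:])


if __name__ == "__main__":
    print("array view:", latency_seen_at("array controller and media"), "ms")
    print("VM view:   ", latency_seen_at("guest and virtual SCSI queue"), "ms")
```

In this made-up case the array would report 0.5 ms while the VM experiences 5.0 ms, and both numbers would be "accurate" from where they were taken.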

Measuring data inside the VM can be equally challenging.  An Operating System’s method of data collection assumes it is the sole proprietor of resources, and may not always accurately account for the fact that it is time slicing CPU clock cycles with other VMs.  While the VM is the end "consumer" of resources, it also does not understand it is virtualized, and cannot see the influence of performance bottlenecks throughout the virtualization layer, or any of the physical components in the stack that support it.

VM metrics pulled from inside the guest OS may measure things in different ways depending on the Operating System.  Consider the differences in how disk latency in Windows "Perfmon" is measured versus Linux "top."  This is the problem with data collector based solutions that aggregate metrics from different sources.  A lot of data is collected, but none of it means the same thing.

This disparate data leaves users attempting to reconcile what these metrics mean, and how they impact each other.  It is even worse when supposedly similar metrics from two different sources show different data.  This can occur with storage array solutions that hook into vCenter to augment the array based statistics.  Which one is to be believed?  One over the other, or neither?

Statistics pulled solely from the hypervisor kernel avoid this nonsense.  They provide a consistent method for gathering meaningful data about your VMs and the infrastructure as a whole.  The hypervisor kernel is also capable of measuring this data in such a way that it accounts for all elements of the virtualization stack.  However, determining the location for collection is not the end-game.  We must also consider how it is analyzed.

Seeing the trees AND the forest
Metrics are just numbers.  More is needed than numbers to provide a holistic understanding for an environment.  Data collected that stands on its own is important, but how it contributes to the broader understanding of the environment is critical.  One needs to be able to get a broad overview of an environment to drill down and identify a root cause of an issue, or be able to start out at the level of an underperforming VM and see how or why it may be impacted by others.

Many attempt to distill down this large collection of metrics to just a few that might help provide insight into performance, or potential issues.  Examples of these individual metrics might include CPU utilization, Queue depths, storage latency, or storage IOPS.  However, it is quite common to misinterpret these metrics when looked at in isolation.

Holistic understanding provides its greatest value when attempting to determine the impact of one workload over a group of other workloads.  A VM’s transition to a new type of storage I/O pattern can often result in lower CPU activity; the exact opposite of what most would look for.  The weight of impact between metrics will also vary.  Think about a VM consuming large amounts of CPU.  This will generally only impact other VMs on that host.  In contrast, a storage based noisy neighbor can impact all VMs running on that storage system, not just the other VMs that live on that host.

Whether your systems are physical, virtualized, or live in the cloud, analytics exist to help answer questions, and solve problems.  But analytics are far more than raw numbers.  The value comes from properly digesting and correlating numbers into a story providing real intelligence.  All of this is contingent on using the right data in the first place.   Keep this in mind as you think about ways that you currently look at your environment.

Understanding block sizes in a virtualized environment

Cracking the mysteries of the Data Center is a bit like space exploration. You think you understand what everything is, and how it all works together, but struggle to understand where fact and speculation intersect. The topic of block sizes, as they relate to storage infrastructures, is one such mystery. The term is familiar to some, but elusive enough that many remain uncertain as to what it is, or why it matters.

This inconspicuous, but all too important characteristic of storage I/O has often been misunderstood (if not completely overlooked) by well-intentioned Administrators attempting to design, optimize, or troubleshoot storage performance. Much like the topic of Working Set Sizes, block sizes are not of great concern to an Administrator or Architect because of this lack of visibility and understanding. Sadly, myth turns into conventional wisdom – in not only what is typical in an environment, but how applications and storage systems behave, and how to design, optimize, and troubleshoot for such conditions.

Let’s step through this process to better understand what a block is, and why it is so important to understand its impact on the Data Center.

What is it?
Without diving deeper than necessary, a block is simply a chunk of data. In the context of storage I/O, it would be a unit in a data stream; a read or a write from a single I/O operation. Block size refers to the payload size of a single unit. We can blame a bit of the confusion about what a block is on some overlap in industry nomenclature. Commonly used terms like block sizes, cluster sizes, pages, latency, etc. may be used in disparate conversations, but what is being referred to, how it is measured, and by whom may often vary. Within the context of discussing file systems, storage media characteristics, hypervisors, or Operating Systems, these terms are used interchangeably, but do not have universal meaning.

Most who are responsible for Data Center design and operation know the term as an asterisk on a performance specification sheet of a storage system, or a configuration setting in a synthetic I/O generator. Performance specifications on a storage system are often the result of a synthetic test using the most favorable block size (often 4K or smaller) for an array to maximize the number of IOPS that an array can service. Synthetic I/O generators typically allow one to set this, but users often have no idea what the distribution of block sizes is across their workloads, or if it is even possible to simulate that with synthetic I/O. The reality is that many applications draw a unique mix of block sizes at any given time, depending on the activity.

I first wrote about the impact of block sizes back in 2013 when introducing FVP into my production environment at the time. (See section "The IOPS, Throughput & Latency relationship")  FVP provided a tertiary glimpse of the impact of block sizes in my environment. Countless hours with the performance graphs, and using vscsiStats, provided new insight about those workloads, and the environment in which they ran. However, neither tool was necessarily built for real time analysis or long term trending of block sizes for a single VM, or across the Data Center. I had always wished for an easier way.

Why does it matter?
The best way to think of block size is as how much storage payload is contained in a single unit.  The physics of it becomes obvious when you think about the size of a 4KB payload, versus a 256KB payload, or even a 512KB payload. Since we refer to them as a block, let’s use a square to represent their relative capacities.


Throughput is the product of IOPS and the block size of each I/O being sent or received. It’s not just the fact that a 256KB block has 64 times the amount of data that a 4KB block has; it is the amount of additional effort throughout the storage stack it takes to handle that, whether it be bandwidth on the fabric, the protocol, or processing overhead on the HBAs, switches, or storage controllers. And let’s not forget the burden it has on the persistent media.
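The relationship between IOPS, block size, and throughput is simple arithmetic, sketched below. The IOPS figure is hypothetical and is held constant only to isolate the effect of block size. Real workloads mix block sizes, so treat this as an illustration rather than a sizing tool.

```python
def throughput_mb_s(iops, block_size_kb):
    """Throughput in MB/s for a given IOPS rate and block size in KB."""
    return iops * block_size_kb / 1024.0


if __name__ == "__main__":
    iops = 1000  # hypothetical, held constant to isolate block size
    for size_kb in (4, 256, 512):
        print(f"{size_kb:>3} KB blocks at {iops} IOPS = "
              f"{throughput_mb_s(iops, size_kb):.1f} MB/s")
```

At the same 1,000 IOPS, 4KB blocks move roughly 3.9 MB/s, while 256KB blocks move 250 MB/s. An IOPS number quoted without a block size says very little about the load placed on the storage stack.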

This variability in performance is more prominent with Flash than traditional spinning disk.  Reads are relatively easy for Flash, but the methods used for writing to NAND Flash can prevent writes from achieving the same performance as reads, especially when writing large blocks. (For more detail on the basic anatomy and behavior of Flash, take a look at Frank Denneman’s post on Flash wear leveling, garbage collection, and write amplification. Here is another primer on the basics of Flash.)  A very small number of writes using large blocks can trigger all sorts of activity on the Flash devices that keeps effective performance from matching what is seen with smaller block I/O. This volatility in performance is a surprise to just about everyone when they first see it.

Block size can impact storage performance regardless of the type of storage architecture used. Whether it is a traditional SAN infrastructure, or a distributed storage solution used in a Hyper Converged environment, the factors, and the challenges, remain. Storage systems may be optimized for different block sizes that may not necessarily align with your workloads. This could be the result of design assumptions of the storage system, or limits of their architecture.  The abilities of storage solutions to cope with certain workload patterns vary greatly as well.  The difference between a good storage system and a poor one often comes down to its ability to handle large block I/O.  Insight into this information should be a part of the design and operation of any environment.

The applications that generate them
What makes the topic of block sizes so interesting are the Operating Systems, the applications, and the workloads that generate them. The block sizes are often dictated by the processes of the OS and the applications that are running in them.

Unlike what many might think, there is often a wide mix of block sizes that are being used at any given time on a single VM, and it can change dramatically by the second. These changes have profound impact on the ability for the VM and the infrastructure it lives on to deliver the I/O in a timely manner. It’s not enough to know that perhaps 30% of the blocks are 64KB in size. One must understand how they are distributed over time, and how latencies or other attributes of those blocks of various sizes relate to each other. Stay tuned for future posts that dive deeper into this topic.
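To show why a single percentage is not enough, here is a minimal sketch that groups I/O records into time windows and block size buckets. The trace format (timestamp in seconds, size in bytes), the power-of-two buckets starting at 4KB, and the sample records are all assumptions for illustration. They are not the output of any particular tool.

```python
from collections import Counter, defaultdict


def size_bucket_kb(size_bytes):
    """Round an I/O size up to the next power-of-two bucket, in KB."""
    kb = 4
    while kb * 1024 < size_bytes:
        kb *= 2
    return kb


def distribution_over_time(ios, window_s=1):
    """Group (timestamp_s, size_bytes) records into per-window histograms.

    Returns {window_start_s: Counter({bucket_kb: io_count})}.
    """
    windows = defaultdict(Counter)
    for timestamp_s, size_bytes in ios:
        start = int(timestamp_s // window_s) * window_s
        windows[start][size_bucket_kb(size_bytes)] += 1
    return dict(windows)


if __name__ == "__main__":
    # Hypothetical trace: small I/O in the first second, then a burst of
    # large writes in the next one.
    trace = [(0.1, 4096), (0.4, 4096), (0.7, 8192),
             (1.2, 262144), (1.5, 262144), (1.9, 4096)]
    for start, hist in sorted(distribution_over_time(trace).items()):
        print(start, dict(sorted(hist.items())))
```

Summed over the whole trace, a third of the I/Os are 256KB. Broken out by second, every one of them lands in the same one second window, which is the detail a summary figure hides.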

Traditional methods capable of visibility
The traditional methods for viewing block sizes have been limited. They provide an incomplete picture of their impact – whether it be across the Data Center, or against a single workload.

1. Kernel statistics courtesy of vscsiStats. This utility is a part of ESXi, and can be executed via the command line of an ESXi host. The utility provides a summary of block sizes for a given period of time, but suffers from a few significant problems.

  • Not ideal for anything but a very short snippet of time, against a specific vmdk.
  • Cannot present data in real-time.  It is essentially a post-processing tool.
  • Not intended to show data over time.  vscsiStats will show a sum total of I/O metrics for a given period of time, but only for a single sample period.  It has no way to track this over time.  One must script this to create results for more than a single period of time.
  • No context.  It treats that workload (actually, just the VMDK) in isolation.  It is missing the context necessary to properly interpret the data.
  • No way to visually understand the data.  This requires the use of other tools to help visualize the data.

The result, especially at scale, is a very labor intensive exercise that is an incomplete solution. It is extremely rare that an Administrator runs through this exercise on even a single VM to understand their I/O characteristics.
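As a rough idea of the scripting involved in tracking a histogram over time, the sketch below turns periodic histogram snapshots into per-interval histograms. It assumes the snapshots have already been collected and parsed into dictionaries, and that the counters are cumulative between samples (that is, not reset); both are assumptions made for illustration, and the collection and parsing steps are deliberately left out.

```python
def interval_histograms(snapshots):
    """Convert cumulative histogram snapshots into per-interval histograms.

    `snapshots` is a list of (timestamp, {bucket_limit_bytes: cumulative_count}).
    The structure is hypothetical; it stands in for whatever a collection
    script has already parsed from the tool's output.
    """
    intervals = []
    for (t0, prev), (t1, curr) in zip(snapshots, snapshots[1:]):
        # I/Os seen in this interval = current counter minus previous counter.
        delta = {bucket: curr.get(bucket, 0) - prev.get(bucket, 0) for bucket in curr}
        intervals.append((t0, t1, delta))
    return intervals

# Made-up snapshots taken 60 seconds apart.
snapshots = [
    (0,   {4096: 10, 65536: 2}),
    (60,  {4096: 15, 65536: 30}),
    (120, {4096: 16, 65536: 30}),
]

intervals = interval_histograms(snapshots)
for start, end, histogram in intervals:
    print(f"{start}-{end}s: {histogram}")
```

Even with this in place, the result still covers a single VMDK in isolation, which is the larger limitation described above.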

2. Storage array. This would be a vendor specific "value add" feature that might present some simplified summary of data with regards to block sizes, but this too is an incomplete solution:

  • Not VM aware.  Since most intelligence is lost the moment storage I/O leaves a host HBA, a storage array would have no idea what block sizes were associated with a VM, or what order they were delivered in.
  • Measuring at the wrong place.  The array is simply the wrong place to measure the impact of block sizes in the first place.  Think about all of the queues storage traffic must go through before the writes are committed to the storage, and reads are fetched. (It also assumes no caching tiers outside of the storage system exist).  The desire would be to measure at a location that takes all of this into consideration; the hypervisor.  Incidentally, this is often why an array can show great performance on the array, but suffer in the observed latency of the VM.  This speaks to the importance of measuring data at the correct location. 
  • Unknown and possibly inconsistent method of measurement.  Showing any block size information is not a storage array’s primary mission, and doesn’t necessarily provide the same method of measurement as where the I/O originates (the VM, and the host it lives on). Therefore, how it is measured, and how often it is measured is generally of low importance, and not disclosed.
  • Dependent on the storage array.  If different types of storage are used in an environment, this doesn’t provide adequate coverage for all of the workloads.

The Hypervisor is an ideal control plane to analyze the data. It focuses on the results of the VMs without being dependent on nuances of in-guest metrics or a feature of a storage solution. It is inherently the ideal position in the Data Center for proper, holistic understanding of your environment.

Eyes wide shut – Storage design mistakes from the start
The flaw with many design exercises is we assume we know what our assumptions are. Let’s consider typical inputs when it comes to storage design. This includes factors such as

  • Peak IOPS and Throughput.
  • Read/Write ratios
  • RAID penalties
  • Perhaps some physical latencies of components, if we wanted to get fancy.

Most who have designed or managed environments have gone through some variation of this exercise, followed by a little math to come up with the correct blend of disks, RAID levels, and fabric to support the desired performance. Known figures are used when they are available, and the others might be filled in with assumptions.  Yet block sizes, and everything they impact, are nowhere to be found. Why? Lack of visibility, and understanding.
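For reference, the "little math" usually looks something like the sketch below. The write penalties are the commonly quoted rule-of-thumb values, and the per-disk IOPS figure and workload numbers are hypothetical inputs chosen for illustration.

```python
import math

# Commonly quoted rule-of-thumb write penalties per RAID level.
RAID_WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID10": 2, "RAID5": 4, "RAID6": 6}

def backend_iops(peak_iops, read_pct, raid_level):
    """Front-end IOPS translated to back-end IOPS using a RAID write penalty."""
    reads = peak_iops * read_pct / 100
    writes = peak_iops - reads
    return reads + writes * RAID_WRITE_PENALTY[raid_level]

def disks_needed(peak_iops, read_pct, raid_level, iops_per_disk):
    """Number of disks required to serve the back-end IOPS."""
    return math.ceil(backend_iops(peak_iops, read_pct, raid_level) / iops_per_disk)

# Hypothetical workload: 10,000 peak IOPS, 70% reads, RAID5, 180 IOPS per disk.
print(backend_iops(10000, 70, "RAID5"))       # 7000 reads + 3000 writes * 4
print(disks_needed(10000, 70, "RAID5", 180))
```

Notice that no parameter for I/O size appears anywhere in the calculation.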

If we know that block sizes can dramatically impact the performance of a storage system (as will be shown in future posts) shouldn’t it be a part of any design, optimization, or troubleshooting exercise?  Of course it should.  Just as with working set sizes, lack of visibility doesn’t excuse lack of consideration.  An infrastructure only exists because of the need to run services and applications on it. Let those applications and workloads help tell you what type of storage fits your environment best. Not the other way around.

Is there a better way?
The ideal approach for measuring the impact of block sizes will always include measuring from the location of the hypervisor, as this will provide these measurements in the right way, and from the right location.  vscsiStats and vCenter related metrics are an incredible resource to tap into, and will provide the best understanding on impacts of block sizes in a storage system.  There may be some time investment to decipher block size characteristics of a workload, but the payoff is generally worth the effort.

Working set sizes in the Data Center

There is no shortage of mysteries in the data center. These stealthy influencers can undermine performance and consistency of your environment, while remaining elusive to identify, quantify, and control. Virtualization helped expose some of this information, as it provided an ideal control plane for visibility. But it does not, and cannot properly expose all data necessary to account for these influencers. The hypervisor also has a habit of presenting the data in ways that can be misinterpreted.

One such mystery as it relates to modern day virtualized data centers is known as the "working set." This term certainly has historical meaning in the realm of computer science, but the practical definition has evolved to include other components of the Data Center; storage in particular. Many find it hard to define, let alone understand how it impacts their data center, and how to even begin measuring it.

We often focus on what we know, and what we can control. However, lack of visibility of influencing factors in the data center does not make them unimportant. Unfortunately this is how working sets are usually treated. They are often not a part of a data center design exercise because they are completely unknown. They are rarely written about for the very same reason. Ironic considering that every modern architecture deals with some concept of localization of data in order to improve performance. Cached content versus its persistent home. How much of it is there? How often is it accessed? All of these types of questions are critically important to know.

What is it?
For all practical purposes, a working set refers to the amount of data that a process or workflow uses in a given time period. Think of it as the hot, commonly accessed data of your overall persistent storage capacity. But that simple explanation leaves a handful of terms that are difficult to qualify, and quantify. What is recent? Does "amount" mean reads, writes, or both? And does it define if it is the same data written over and over again, or is it new data? Let’s explore this more.

There are several traits of working sets that are worth reviewing.

  • Working sets are driven by the workload, the applications driving the workload, and the VMs that they run on.  Whether the persistent storage is local, shared, or distributed, it really doesn’t matter from the perspective of how the VMs see it.  The size will be largely the same.
  • Working sets always relate to a time period.  However, it’s a continuum.  And there will be cycles in the data activity over time.
  • Working sets comprise both reads and writes.  The amount of each is important to know because reads and writes have different characteristics, and demand different things from your storage system.
  • Working set size refers to an amount, or capacity, but what and how many I/Os it took to make up that capacity will vary due to ever changing block sizes.
  • Data access type may be different.  Is one block read a thousand times, or are a thousand blocks read one time?  Are the writes mostly overwriting existing data, or is it new data?  This is part of what makes workloads so unique.
  • Working set sizes evolve and change as your workloads and data center change.  Like everything else, they are not static.

A simplified, visual interpretation of data activity that would define a working set, might look like below.


If a working set is always related to a period of time, then how can we ever define it? Well in fact, you can. A workload often has a period of activity followed by a period of rest. This is sometimes referred to as the "duty cycle." A duty cycle might be the pattern that shows up after a day of activity on a mailbox server, an hour of batch processing on a SQL server, or 30 minutes compiling code. Taking a look over a larger period of time, duty cycles of a VM might look something like below.


Working sets can be defined at whatever time increment desired, but the goal in calculating a working set will be to capture at minimum, one or more duty cycles of each individual workload.
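One way to picture the calculation is as a count of unique blocks touched within a chosen window, split by reads and writes. The sketch below assumes a hypothetical trace of (timestamp, operation, offset, length) tuples and a fixed 4KB tracking granularity; it is an illustration of the concept rather than a description of how any product measures working sets.

```python
def working_set_bytes(trace, start, end, block_size=4096):
    """Return (read_bytes, write_bytes, total_bytes) of unique blocks touched in [start, end).

    `trace` is an iterable of (timestamp, op, offset_bytes, length_bytes),
    where op is "R" or "W". The format is hypothetical.
    """
    read_blocks, write_blocks = set(), set()
    for ts, op, offset, length in trace:
        if not (start <= ts < end):
            continue
        first = offset // block_size
        last = (offset + length - 1) // block_size
        target = read_blocks if op == "R" else write_blocks
        target.update(range(first, last + 1))

    # A block that is both read and written counts once toward the total.
    total_blocks = read_blocks | write_blocks
    return (len(read_blocks) * block_size,
            len(write_blocks) * block_size,
            len(total_blocks) * block_size)

# Made-up trace: repeated reads and an overwrite do not grow the working set.
trace = [
    (0, "R", 0, 8192),
    (1, "R", 0, 4096),       # same data read again
    (2, "W", 4096, 4096),    # overwrites a block that was also read
    (3, "W", 40960, 4096),   # new data
    (100, "R", 81920, 4096), # outside the window below
]

print(working_set_bytes(trace, start=0, end=10))
```

Choosing `start` and `end` to span at least one duty cycle is what ties the result back to the workload's actual behavior.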

Why it matters
Determining working set sizes helps you understand the behaviors of your workloads in order to better design, operate, and optimize your environment. For the same reason you pay attention to compute and memory demands, it is also important to understand storage characteristics, which include working sets. Understanding and accurately calculating working sets can have a profound effect on the consistency of a data center. Have you ever heard about a real workload performing poorly, or inconsistently on a tiered storage array, hybrid array, or hyper-converged environment? This is because all of them are extremely sensitive to right sizing the caching layer. Not accurately accounting for working set sizes of the production workloads is a common reason for such issues.

Classic methods for calculation
Over the years, this mystery around working set sizes has resulted in all sorts of sad attempts at trying to calculate. Those attempts have included:

  • Calculate using known (but not very helpful) factors.  These generally consist of looking at some measurement of IOPS over the course of a given time period.  Maybe dress it up with a few other factors to make it look neat.  This is terribly flawed, as it assumes one knows all of the various block sizes for that given workload, and that block sizes for a workload are consistent over time.  It also assumes all reads and writes use the same block size, which is also false.
  • Measure working sets defined on a storage array, as a feature of the array’s caching layer.  This attempt often fails because it sits at the wrong location.  It may know what blocks of data are commonly accessed, but there is no context to the VM or workload imparting the demand.  Most of that intelligence about the data is lost the moment the data exits the HBA of the vSphere host.  Lack of VM awareness can even make an accurately guessed cache size on an array be insufficient at times due to cache pollution from noisy neighbor VMs.
  • Take an incremental backup, and look at the amount of changed data.  This sounds logical, but this can be misleading because it will not account for data that is written over and over, nor does it account for reads.  The incremental time period of the backup may also not be representative of the duty cycle of the workload.
  • Guess work.  You might see "recommendations" that say a certain percentage of your total storage capacity used is hot data, but this is a more formal way to admit that it’s nearly impossible to determine.  Guess large enough, and the impact of being wrong will be less, but this introduces a number of technical and financial implications on data center design. 

Since working sets are collected against activity that occurs on a continuum, calculating a typical working set with a high level of precision is not only impossible, but largely unnecessary.  When attempting to determine working set size of a workload, the goal is to come to a number that reflects the most typical behavior of a single workload, group of workloads, or a total sum of workloads across a cluster or data center.

A future post will detail approaches that should give a sufficient level of understanding on active working set sizes, and help reduce the potential of negative impacts on data center operation due to poor guesswork.

Thanks for reading

Understanding PernixData FVP’s clustered read caching functionality

When PernixData debuted FVP back in August 2013, for me there was one innovation in particular that stood out above the rest.  The ability to accelerate writes (known as “Write Back” caching) on the server side, and do so in a fault tolerant way.  Leverage fast media on the server side to drive microsecond write latencies to a VM while enjoying all of the benefits of VMware clustering (vMotion, HA, DRS, etc.).  Give the VM the advantage of physics by presenting a local acknowledgement of the write, but maintain all of the benefits of keeping your compute and storage layers separate.

But sometimes overlooked with this innovation is the effectiveness that comes with how FVP clusters acceleration devices to create a pool of resources for read caching (known as “Write Through” caching with FVP). For new and existing FVP users, it is good to get familiar with the basics of how to interpret the effectiveness of clustered read caching, and how to look for opportunities to improve the results of it in an environment. For those who will be trying out the upcoming FVP Freedom edition, this will also serve as an additional primer for interpreting the metrics. Announced at Virtualization Field Day 5, the Freedom Edition is a free edition of FVP with a few limitations, such as read caching only, and a maximum of 128GB tier size using RAM.

The power of read caching done the right way
Read caching alone can sometimes be perceived as a helpful way to improve performance, but temporary, and only addressing one side of the I/O dialogue. Unfortunately, this assertion tells an incomplete story. It is often criticized, but let’s remember that caching in some form is used by almost everyone, and everything.  Storage arrays of all types, Hyper Converged solutions, and even DAS.  Dig a little deeper, and you realize its perceived shortcomings are most often attributed to how it has been implemented. By that I mean:

  • Limited, non-adjustable cache sizes in arrays or Hyper Converged environments.
  • Limited to a single host in server side solutions.  (operations like vMotion undermining its effectiveness)
  • Not VM or workload aware.

Existing solutions address some of these shortcomings, but fall short in addressing all three in order to deliver read caching in a truly effective way. FVP’s architecture addresses all three, giving you the agility to quickly adjust the performance tier while letting your centralized storage do what it does best; store data.

Since FVP allows you to choose the size of the acceleration tier, this impact alone can be profound. For instance, current NVMe based Flash cards are 2TB in size, and are expected to grow dramatically in the near future. Imagine a 10 node cluster that would have perhaps 20-40TB of an acceleration tier that may be serving up just 50TB of persistent storage. Compare this to a hybrid array that may only put in a few hundred GB of flash devices in an array serving up that same 50TB, and funneling through a pair of array controllers. That is flash that the I/Os would still have to traverse the network and storage stack to get to, and cached data that is arbitrarily evicted for new incoming hot blocks.

Unlike other host side caching solutions, FVP treats the collection of acceleration devices on each host as a pool. As workloads are being actively moved across hosts in the vSphere cluster, those workloads will still be able to fetch the cached content from that pool using a light weight protocol. Traditionally host based caching would have to re-warm the data from the backend storage using the entire storage stack and traditional protocols if something like a vMotion event occurred.

FVP is also VM aware. This means it understands the identity of each cached block – where it is coming from, and going to – and has many ways to maintain cache coherency (See Frank Denneman’s post Solving Cache Pollution). Traditional approaches to providing a caching tier meant that they were largely unaware of who the blocks of data were associated with. Intelligence was typically lost the moment the block exits the HBA on the host. This sets up one of the most common but often overlooked scenarios in a real environment. One or more noisy neighbor VMs can easily pollute, and force eviction of hot blocks in the cache used by other VMs. The arbitrary nature of this means potentially unpredictable performance with these traditional approaches.

How it works
The logic behind FVP’s clustered read caching approach is incredibly resilient and efficient. Cached reads for a VM can be fetched from any host participating in the cluster, which allows for a seamless leveraging of cache content regardless of where the VM lives in the cluster. Frank Denneman’s post on FVP’s remote cache access describes this in great detail.

Adjusting the charts
Since we will be looking at the FVP charts to better understand the benefit of just read caching alone, let’s create a custom view. This will allow us to really focus on read I/Os and not get them confused with any other write I/O activity occurring at the same time.



Note that when you choose a "Custom Breakdown", the same colors used to represent both reads and writes in the default "Storage Type" view will now be representing ONLY reads from their respective resource type. Something to keep in mind as you toggle between the default "Storage Type" view, and this custom view.


Looking at Offload
The goal for any well designed storage system is to deliver optimal performance to the applications.  With FVP, I/Os are offloaded from the array to the acceleration tier on the server side.  Read requests will be delivered to the VMs faster, reducing latency, and speeding up your applications. 

From a financial investment perspective, let’s not forget the benefit of I/O “offload,” or in other words, read requests that were satisfied from the acceleration tier. With FVP, that work is offloaded from the storage arrays serving the persistent storage tier, from the array controllers, from the fabric, and from the HBAs. The more offload there is, the less work for your storage arrays and fabric, which means you can target more affordable backend storage. The hero numbers showcase the sum of this offload nicely.

Looking at Network acceleration reads
Unlike other host based solutions, FVP allows for common activities such as vMotions, DRS, and HA to work seamlessly without forcing any sort of rewarming of the cache from the backend storage. Below is an example of read I/O from 3 VMs in a production environment, and their ability to access cached reads on an acceleration device on a remote host.


Note how latency remains low on those read requests that came from a remote acceleration device (the green line).

How well is my read caching working?
Regardless of which write policy (Write Through or Write Back) is being used in FVP, the cache is populated in the same way.

  • All read requests from the backing array will place the data into the acceleration tier as it fetches it from the backing storage.
  • All write I/O is placed in the cache as it is written to the physical storage.

Therefore, it is easy to conclude that if read I/Os did NOT come from the acceleration tier, it is for one of three reasons.

  • A block of data had been requested that had never been requested before.
  • The block of data had not been written recently, and thus was not residing in cache.
  • A block of data had once lived in the cache (via a read or write), but had been evicted due to cache size.

The first two items reflect the workload characteristics, while the last one is a result of a design decision – that being the cache size. With FVP you get to choose how large the devices are that make up the caching tier, so you can determine ultimately how much the solution will benefit you. A larger cache can have a dramatic impact on performance because there is less pressure to evict data that has already been cached to make room for new data.
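The relationship between cache size and read hits can be illustrated with a toy simulation. The sketch below populates a cache on both reads and writes, as described above, and evicts the least recently used block when full. The LRU policy is an assumption chosen for simplicity; the eviction algorithm FVP actually uses is not described here, and the access pattern is made up.

```python
from collections import OrderedDict

def read_hit_rate(accesses, cache_blocks):
    """Simulate a cache populated by reads and writes; return the read hit rate.

    `accesses` is a sequence of (op, block_id) with op "R" or "W".
    Eviction is least-recently-used, which is an illustrative assumption.
    """
    cache = OrderedDict()
    hits = reads = 0
    for op, block in accesses:
        if op == "R":
            reads += 1
            if block in cache:
                hits += 1
        # Both reads and writes place (or refresh) the block in the cache.
        cache[block] = True
        cache.move_to_end(block)
        if len(cache) > cache_blocks:
            cache.popitem(last=False)  # evict the least recently used block
    return hits / reads if reads else 0.0

# A workload that reads the same 8 blocks three times over.
accesses = [("R", block) for _ in range(3) for block in range(8)]

print(read_hit_rate(accesses, cache_blocks=8))  # working set fits in cache
print(read_hit_rate(accesses, cache_blocks=4))  # working set is twice the cache size
```

With a cache large enough to hold all 8 blocks, every read after the first pass is a hit. With half that capacity, this particular access pattern evicts each block just before it is needed again, and nothing is served from cache.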

Visualizing the read cache usage
This is where the FVP metrics can tell the story. When looking at the "Custom Breakdown" view described earlier in this post, you can clearly see on the image below that while a sizable amount of reads were being serviced from the caching tier, the majority of reads (3,500+ IOPS sustained) in this time frame (1 week) came from the backing datastore.


Now, let’s contrast this to another environment and another workload. The image below clearly shows a large amount of data over the period of 1 day that is served from the acceleration tier. Nearly all of the read I/Os and over 60MBps of throughput that never touched the array.


When evaluating read cache sizing, this is one of the reasons why I like this particular “Custom Breakdown” view so much. Not only does it tell you how well FVP is working at offloading reads, it tells you the POTENTIAL of all reads that *could* be offloaded from the array.  You get to choose how much offload occurs, because you decide on how large your tier size is, or how many VMs participate in that tier.

Hit Rate will also tell you the percentage of reads that are coming from the acceleration tier at any point in time. This can be an effective way to view cache hit frequency, but to gain more insight, I often rely on this "Custom Breakdown" to get better context of how much data is coming from the cache and backing datastores at any point in time. Eviction rate can also provide complementary information if it shows the eviction rate creeping upward.  But there can be cases where even low eviction percentages evict enough cached data over time to affect whether the data a VM needs is still in cache.  This is why this particular "Custom Breakdown" is my favorite for evaluating reads.

What might be a scenario for seeing a lot of reads coming from a backing datastore, and not from cache? Imagine running 500 VMs in an acceleration tier size of just a few GB. The working set sizes are likely much larger than the cache size, and will result in churning through the cache and not show significant demonstrable benefit. Something to keep in mind if you are trying out FVP with a very small amount of RAM as an acceleration resource. Two effective ways to make this more efficient would be to 1.) increase the cache size or 2.) decrease the number of VMs participating in acceleration. Both will achieve the same thing; providing more potential cache tier size for each VM accelerated. The idea for any caching layer is to have it large enough to hold most of the active data (aka "working set") in the tier. With FVP, you get to easily adjust the tier size, or the VMs participating in it.
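The arithmetic behind that scenario is simple enough to sketch. The function below estimates what fraction of the combined working set an acceleration tier could hold; the per-VM working set figure and tier sizes are hypothetical values picked for illustration.

```python
def tier_coverage(tier_size_gb, working_sets_gb):
    """Fraction of the aggregate working set that fits in the tier, capped at 1.0."""
    total_working_set = sum(working_sets_gb)
    if total_working_set == 0:
        return 1.0
    return min(1.0, tier_size_gb / total_working_set)

# Hypothetical: every VM has a 2GB working set.
print(tier_coverage(4, [2] * 500))     # 500 VMs sharing a 4GB tier
print(tier_coverage(4, [2] * 2))       # same tier, only 2 VMs accelerated
print(tier_coverage(2000, [2] * 500))  # 500 VMs, much larger tier
```

Either lever, a larger tier or fewer participating VMs, moves the coverage toward 1.0.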

Don’t know what your working set sizes are?  Stay tuned for PernixData Architect!

Once you have a good plan for read caching with FVP, and arrange for a setup with maximum offload, you can drive the best performance possible from clustered read caching. On its own, clustered read caching implemented the way FVP does it can change the architectural discussion of how you design and spend those IT dollars.  Pair this with write-buffering with the full edition of FVP, and it can change the game completely.

Interpreting Performance Metrics in PernixData FVP

In the simplest of terms, performance charts and graphs are nothing more than lines with pretty colors.  They exist to provide insight and enable smart decision making.  Yet, accurate interpretation is a skill often trivialized, or worse, completely overlooked.  Depending on how well the data is presented, performance graphs can amaze or confuse, with hardly a difference between the two.

A vSphere environment provides ample opportunity to stare at all types of performance graphs, but often lost are techniques in how to interpret the actual data.  The counterpoint to this is that most are self-explanatory. Perhaps a valid point if they were not misinterpreted and underutilized so often.  To appeal to those averse to performance graph overload, many well intentioned solutions offer overly simplified dashboard-like insights.  They might serve as a good starting point, but this distilled data often falls short in providing the detail necessary to understand real performance conditions.  Variables that impact performance can be complex, and deserve more insight than a green/yellow/red indicator over a large sampling period.

Most vSphere Administrators can quickly view the “heavy hitters” of an environment by sorting the VMs by CPU in order to see the big offenders, and then drill down from there.  vCenter does not naturally provide good visual representation for storage I/O.  Interesting because storage performance can be the culprit for so many performance issues in a virtualized environment.  PernixData FVP accelerates your storage I/O, but also fills the void nicely in helping you understand your storage I/O.

FVP’s metrics leverage VMkernel statistics, but in my opinion make them more consumable.  These statistics reported by the hypervisor are particularly important because they are the measurements your VMs and applications feel.  Something to keep in mind when your other components in your infrastructure (storage arrays, network fabrics, etc.) may advertise good performance numbers, but don’t align with what the applications are seeing.

Interpreting performance metrics is a big topic, so the goal of this post is to provide some tips to help you interpret PernixData FVP performance metrics more accurately.

Starting at the top
In order to quickly look for the busiest VMs, one can start at the top of the FVP cluster.  Click on the “Performance Map” which is similar to a heat map. Rather than projecting VM I/O activity by color, the view will project each VM on their respective hosts at different sizes proportional to how much I/O they are generating for that given time period.  More active VMs will show up larger than less active VMs.


From here, you can click on the targets of the VMs to get a feel for what activity is going on – serving as a convenient way to drill into the common I/O metrics of each VM; Latency, IOPS, and Throughput.


As shown below, these same metrics are available if the VM on the left hand side of the vSphere client is highlighted, and will give a larger view of each one of the graphs.  I tend to like this approach because it is a little easier on the eyes.


VM based Metrics – IOPS and Throughput
When drilling down into the VM’s FVP performance statistics, it will default to the Latency tab.  This makes sense considering how important latency is, but I find it most helpful to first click on the IOPS tab to get a feel for how many I/Os this VM is generating or requesting.  The primary reason why I don’t initially look at the Latency tab is that latency is a metric that requires context.  Often times VM workloads are bursty, and there may be times where there is little to no IOPS.  The VMkernel can sometimes report latency against little or no I/O activity a bit inaccurately, so looking at the IOPS and Throughput tabs first brings context to the Latency tab.

The default “Storage Type” breakdown view is a good view to start with when looking at IOPS and Throughput. To simplify the view even more, tick the boxes so that only the “VM Observed” and the “Datastore” lines show, as displayed below.


The predefined “read/write” breakdown is also helpful for the IOPS and Throughput tabs as it gives a feel of the proportion of reads versus writes.  More on this in a minute.

What to look for
When viewing the IOPS and Throughput in an FVP accelerated environment, there may be times when you see large amounts of separation between the “VM Observed” line (blue) and the “Datastore” (magenta). Similar to what is shown below, having this separation where the “VM Observed” line is much higher than the “Datastore” line is a clear indication that FVP is accelerating those I/Os and driving down the latency.  It doesn’t take long to begin looking for this visual cue.


But there are times when there may be little to no separation between these lines, such as what you see below.


So what is going on?  Does this mean FVP is no longer accelerating?  No, it is still working.  It is about interpreting the graphs correctly.  Since FVP is an acceleration tier only, cached reads come from the acceleration tier on the hosts – creating the large separation between the “Datastore” and the “VM Observed” lines.  When FVP accelerates writes, they are synchronously buffered to the acceleration tier, followed by destaging to the backing datastore as soon as possible – often within milliseconds.  Because of the rate at which data is sampled and rendered onto the graph, the “VM Observed” and “Datastore” statistics for those writes are reported at very nearly the same time, so the two lines track each other closely.

By toggling the “Breakdown” to “read/write” we can confirm in this case that the change in appearance in the IOPS graph above came from the workload transitioning from mostly reads to mostly writes.  Note how the magenta “Datastore” line above matches up with the cyan “Write” line below.


The graph above still might imply that the performance went down as the workload transitioned from reads to writes. Is that really the case?  Well, let’s take a look at the “Throughput” tab.  As you can see below, the graph shows that in fact there was the same amount of data being transmitted in both phases of the workload, yet the IOPS graph shows far fewer I/Os at the time the writes were occurring.


The most common reason for this sort of behavior is OS file system buffer caching inside the guest VM, which will assemble writes into larger I/O sizes.  The amount of data read in this example was the same as the amount of data that was written, but measuring that by only IOPS (aka I/O commands per second) can be misleading. I/O sizes are not only underappreciated for their impact on storage performance, but this is a good example of how often the I/O sizes can change, and how IOPS can be a misleading measurement if left on its own.

If the example above doesn’t make you question conventional wisdom on industry standard read/write ratios, or common methods for testing storage systems, it should.
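The relationship between IOPS and throughput described above comes down to one multiplication: throughput equals IOPS times I/O size. The short sketch below uses made-up figures (not the values from the graphs) to show how two phases of a workload can move the same amount of data with very different IOPS.

```python
def throughput_mbps(iops, io_size_kb):
    """Throughput in MB/s for a given IOPS rate and I/O size in KB."""
    return iops * io_size_kb / 1024

# Hypothetical phases: many small reads versus fewer, larger writes.
read_phase = throughput_mbps(iops=8000, io_size_kb=8)
write_phase = throughput_mbps(iops=250, io_size_kb=256)

print(read_phase, write_phase)
```

Judged by IOPS alone, the write phase looks 32 times less busy, even though both phases transfer data at the same rate.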

We can also see from the Write Back Destaging tab that FVP destages the writes as aggressively as the array will allow.  As you can see below, all of the writes were delivered to the backing datastore in under 1 second.  This ties back to the previous graphs that showed the “VM Observed” and the “Datastore” lines following each other very closely during periods with many writes.


The key to understanding the performance improvement is to look at the Latency tab.  Notice on the image below how that latency for the VM dropped way down to a low, predictable level throughout the entire workload.  Again, this is the metric that matters.


Another way to think of this is that the IOPS and Throughput performance charts can typically show the visual results for read caching better than write buffering.  This is because:

  • Cached reads never come from the backing datastore, whereas buffered writes always hit the backing datastore.
  • Reads may be smaller I/O sizes than writes, which visually skews the impact if only looking at the IOPS tab.

Therefore, the ultimate measurement for both reads and writes is the latency metric.

VM based Metrics – Latency
Latency is arguably one of the most important metrics to look at.  This is what matters most to an active VM and the applications that live on it.  Now that you’ve looked at the IOPS and Throughput, take a look at the Latency tab. The “Storage type” breakdown is a good place to start, as it gives an overall sense of the effective VM latency against the backing datastore.  Much like the other metrics, it is good to look for separation between the “VM Observed” and “Datastore” where “VM Observed” latency should be lower than the “Datastore” line.

In the image above, the latency is dramatically improved, which again is the real measurement of impact.  A more detailed view of this same data can be viewed by selecting a “Custom Breakdown”.  Tick the checkboxes as shown below.


Now take a look at the latency for the VM again. Hover anywhere on the chart that you might find interesting. The pop-up dialog will show you the details that matter:

  • Where the latency would have come from if it had originated from the datastore (datastore read or write).
  • What has contributed to the effective “VM Observed” latency.


What to look for
The desired result for the Latency tab is to have the “VM Observed” line as low and as consistent as possible.  There may be times where the VM observed latency is not quite as low as you might expect.  The causes for this are numerous, and a subject for another post, but FVP will provide some good indications as to some of the sources of that latency.  Switching over to the “Custom Breakdown” described earlier, you can see this more clearly.  This view can be used as an effective tool to help better understand any causes related to an occasional latency spike.

Hit & Eviction rate
Hit rate is the percentage of reads that are serviced by the acceleration tier, and not by the datastore.  It is great to see this measurement high, but it is not the exclusive indicator of how well the environment is operating.  It is a metric that is complementary to the other metrics, and shouldn’t be looked at in isolation.  It is only focused on displaying read caching hit rates, and conveys that as a percentage, whether there are 2,000 IOPS coming from the VM or 2 IOPS coming from the VM.

There are times where this isn’t as high as you’d think.  Some of the causes of a lower than expected hit rate include:

  • Large amounts of sequential writes.  The graph is measuring read “hits” and will see a write as a “read miss.”
  • Little or no I/O activity on the VM monitored.
  • In-guest activity that you are unaware of.  For instance, an in-guest SQL backup job might flush out the otherwise good cache related to that particular VM.  This is a leading indicator of such activity.  Thanks to the new Intelligent I/O profiling feature in FVP 2.5, one has the ability to optimize the cache for these types of scenarios.  See Frank Denneman’s post for more information about this feature.

Let’s look at the Hit Rate for the period we are interested in.


You can see from above that the period of activity is the only part we should pay attention to.  Notice in the previous graphs that outside of the active period we were interested in, there was very little to no I/O activity.

A low hit rate does not necessarily mean that a workload hasn’t been accelerated. It simply provides an additional data point for understanding.  In addition to looking at the hit rate, a good strategy is to look at the amount of reads from the IOPS or Throughput tab by creating the custom view settings of:


Now we can better see how many reads are actually occurring, and how many are coming from cache versus the backing datastore.  It puts much better context around the situation than relying entirely on Hit Rate.
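The reason the raw read counts add context can be shown with a small sketch.  This assumes hit rate is calculated as reads from cache divided by total reads, which matches the definition given earlier.  The function, the read counts, and the choice to report 0% when there are no reads at all are my own assumptions for illustration, not how FVP necessarily computes or displays it.

```python
def hit_rate(reads_from_cache: int, reads_from_datastore: int) -> float:
    """Percentage of reads served by the acceleration tier."""
    total = reads_from_cache + reads_from_datastore
    if total == 0:
        # Assumption: report 0 when idle. With no reads, the
        # percentage carries no real meaning either way.
        return 0.0
    return 100.0 * reads_from_cache / total


# Hypothetical read counts over the same sampling interval.
busy_vm = hit_rate(reads_from_cache=1800, reads_from_datastore=200)
quiet_vm = hit_rate(reads_from_cache=9, reads_from_datastore=1)

print(f"Busy VM:  {busy_vm:.0f}% of 2000 reads")
print(f"Quiet VM: {quiet_vm:.0f}% of 10 reads")
```

Both VMs show the same 90% hit rate, but only one of them is doing enough reads for that number to matter.  That is the context the custom view provides.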


Eviction Rate will tell us the percentage of blocks that are being evicted at any point in time.  A very low eviction rate indicates that FVP is lazily evicting data on an as-needed basis to make room for new incoming hot data, and is a good sign that the acceleration tier is large enough to handle the general working set of data.  If this ramps upward, then that tells you that otherwise hot data will no longer be in the acceleration tier.  Eviction rates are a good indicator to help you determine if your acceleration tier is large enough.

The importance of context and the correlation to CPU cycles
When viewing performance metrics, context is everything.  Performance metrics are guilty of standing on their own far too often.  Or perhaps, it is human nature to want to look at these in isolation.  In the previous graphs, notice the relationship between the IOPS, Throughput, and Latency tabs.  They all play a part in delivering storage payload.

Viewing a VM’s ability to generate high IOPS and Throughput is good, but this can also be misleading.  A common but incorrect assumption is that once a VM is on fast storage, it will start doing some startling number of IOPS.  That is simply untrue. It is the application (and the OS that it is living on) that dictates how many I/Os it will be pushing at any given time. I know of many single threaded applications that are very I/O intensive, and several multithreaded applications that aren’t.  Thus, it’s not about chasing IOPS, but rather, the ability to deliver low latency in a consistent way.  It is that low latency that lets the CPU breathe freely, and not wait for the next I/O to be executed.

What do I mean by “breathe freely?”  With real world workloads, the difference between fast and slow storage I/O is that CPU cycles can satisfy the I/O request without waiting.  A given workload may be performing some defined activity.  It may take a certain number of CPU cycles, and a certain number of storage I/Os to accomplish this.  An infrastructure that allows those I/Os to complete more quickly will let more CPU cycles take part in completing the request, but in a shorter amount of time.
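A rough back-of-the-envelope model makes this concrete.  The sketch below assumes a single threaded task that alternates between CPU work and synchronous I/O, so that elapsed time is simply CPU time plus I/O wait time.  Real workloads overlap I/O and compute, so treat this as a simplification.  The function name and all of the numbers are hypothetical.

```python
def elapsed_and_cpu_util(cpu_seconds: float, io_count: int, latency_ms: float):
    """Return (elapsed seconds, CPU utilization %) for a serial task.

    Simplified model: every I/O is synchronous, so the CPU sits idle
    for the full latency of each one.
    """
    io_wait_seconds = io_count * latency_ms / 1000.0
    elapsed = cpu_seconds + io_wait_seconds
    return elapsed, 100.0 * cpu_seconds / elapsed


# Same work in both cases: 10 seconds of CPU time and 10,000 I/Os.
slow_elapsed, slow_util = elapsed_and_cpu_util(10, 10_000, latency_ms=10)
fast_elapsed, fast_util = elapsed_and_cpu_util(10, 10_000, latency_ms=1)

print(f"10 ms latency: {slow_elapsed:.0f}s elapsed, CPU {slow_util:.0f}% busy")
print(f" 1 ms latency: {fast_elapsed:.0f}s elapsed, CPU {fast_util:.0f}% busy")
```

The amount of CPU work is identical, but with lower latency the task finishes in 20 seconds instead of 110, and the CPU is busy for half of that time instead of roughly 9% of it.  Higher CPU utilization over a shorter window is exactly what you would hope to see.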


Looking at CPU utilization can also be a helpful indicator of your storage infrastructure’s ability to deliver the I/O. A VM’s ability to peak at 100% CPU is often a good thing from a storage I/O perspective.  It means that VM is less likely to be storage I/O constrained.

The only thing better than a really fast infrastructure for your workloads is understanding how well it is performing.  Hopefully this post offers up a few good tips when you look at your own workloads leveraging PernixData FVP.

Getting the big IT purchase approved

IT organizations are faced with a tantalizing array of options when it comes to hardware and software solutions. But long before anything can ever be deployed, it has to be purchased, which means at some point it had to be approved. Sometimes deploying a solution is easy compared to getting it approved. But how does one go about getting the big ticket item through? Well, here is my attempt at demystifying the process.

First, let’s just say that "big purchase" is without a doubt a relative term. For an SMB, $10,000 might be a show stopper, while seven figures for a large enterprise may be part of the routine. Both offer unique challenges, but share similar tactics. Getting a big IT purchase approved typically calls for a unique set of skills and experience. A mix of preparation, clarity, delivery, timing, and attitude make up the chaotic formula that, when done well, will improve the odds of success. It is a skill that can be just as important as anything in your technical arsenal.

You will serve yourself well if you think and deliver like a consultant. Life in Ops can get muddied down by internal strife, whack-a-mole fire fighting, and the occasional "look at this new feature" deployment even though nobody asked for it. Take notice of how a good consultant does things. Step back to understand the desired result, then build out your own statement defining the typical design inputs like requirements, constraints, assumptions and risks.

At some point, you will need to prioritize your own wants, and pick your battles. You typically can’t have everything, so start from the ground up of what IT’s mission statement is, and work from there. Start with bet-the-business elements like high availability, and data/system protection that won’t be spoken up for by anyone but IT. Then, if there are other needs, they may in fact be a departmental need that impacts productivity and revenue. While IT may be the enabler of the request, make sure the identity of the requester is clear.

It’s not uncommon for an SMB to have very little money allocated to IT, but this isn’t an excuse for lack of diligence in preparation. Large organizations have more money, but proportionally much more complex problems to solve, SLAs to adhere to, and regulations to comply with. If you have no idea how your organization’s IT spending compares to peers in your industry, it is time to learn, and communicate that as a part of your presentation if your funds are abnormally low.

This is also an opportunity for you to project yourself as the "solution provider" in your organization. Embrace this. Help them understand why technology costs have increased over the past 10 years. If someone says, "Why don’t we just use the cloud for this?" Rather than let smoke pour out of your ears, respond with "That is a great question, Joe. IT is constantly looking for the best ways to deliver services that meet the requirements of the organization." And then go into an appropriate level of detail on why it may or may not be a good fit. (If it is a good fit, then say so!). The point here is to embrace the solution provider role for the organization.

Your biggest competitor to your proposal will be, you guessed it, doing nothing. But there is a cost of doing nothing. The key stakeholders might look at this proposed expenditure and compare it to $0. In most cases, this is completely wrong, and it is up to you to help them understand what the real cost comparison is.

One opportunity sometimes overlooked is the power of a cost deferral. Does the unbudgeted solution you are proposing delay a much larger budgeted purchase until perhaps next year? Showcase this. Good proposals typically show a TCO of 3 to 5 years. But do not underestimate the allure an immediate cost deferral has to your friendly CFO.

Get input on defining the "what" of a problem, and its impacts. The "how" is usually reserved for the Subject Matter Expert (e.g. you). This will minimize silly ideas from others suggesting your storage capacity issues can be solved by the Friday flier for Best Buy.

Learn to prime the pump. Do a little one-on-one campaigning. This is a common method suggested in many books on successful leadership. It is your chance to win over your constituents before any formal proposal. Try holding an internal "Lunch and Learn" about trends in technology. Share a little about how amazing virtualization is, and help them understand some basic challenges of IT. These techniques will engage key personnel, and help in establishing a trusting relationship with IT.

The presentation – IT Shark Tank
I’m a big fan of the show, ‘Shark Tank.’ If you aren’t familiar with it, four very successful investors hear pitches by would-be entrepreneurs who are looking for investment funds in exchange for a stake in equity. The investors bring their own wealth, smarts and competitive nature to the table, and can be quite tough on prospective entrepreneurs. A few things can be gleaned from this, and applied directly to your ability to deliver a successful proposal.

  • Come prepared. Nothing kills a proposal like lack of preparation, and not knowing your facts. Let’s say you are requesting more storage: You’d better believe some of the simplest questions will be asked, including many that you may overlook when entering the room. "How much storage do we have?" "How much do we have left?" "How much do we need?" "Why does it cost so much?" "What are the alternatives?"
  • Clearly state the problem, the impacts to the business, the options, and your recommendations.
  • Learn to answer the simplest of questions in the simplest of ways. "Does this proposal save us money?" "Is there a less expensive way to do this?"
  • Craft your message to your audience and appeal to their sensibilities. Flog yourself upside the head if you use any IT acronyms, or assume that technical gymnastics is going to impress them. It won’t. What will is being concise. Every word has a purpose.
  • Provide a little (but not too much) context to the problem that you are trying to solve. Leverage an analogy if you need to.
  • Know the counterpoints, and how to respond. Know how you are going to answer a question you don’t know the answer to.
  • Seek to understand their position. What might they dislike (e.g. unpredictable expenses, obligated debt, investments they don’t understand, etc.)?
  • Respect everyone’s time. Make it quick, make it concise, and if they would like more detail, you can certainly do that, but don’t make it a part of the pitch.

How to deal with everyone else in the food chain
Be honest with your vendors. They have a job to do, and are trying to help you. If you show interest in a solution that is 10x more than what you can afford, it isn’t going to do anyone good to bring them in for an onsite demonstration. They will appreciate your honesty so they can perhaps focus on more cost appropriate solutions. Believe it or not, most want the right solution for you in the first place, as repeat business is the most important value they can bring back to their own organization.

If you are someone who doesn’t have deep-dive knowledge on the solution you are proposing, take advantage of the SE for the VAR or channel partner as a resource. Many of my friends in the industry are SEs and are some of the best and the brightest folks I know, and they all came from the Ops side at some point. Use them as a resource to learn about the solutions they are proposing, and ask them challenging questions.

Be honest with your organization. This isn’t about what you want. Your value will increase when you can demonstrate repeatedly that you have their best interests in mind.

After the decision
If the proposal was approved, focus on delivering at least some results fast. Then showcase the win and how IT can help solve organizational challenges. This may sound like self promotion, but it is not if done right. The wins are for the organization, not you. This establishes trust, and lays the groundwork for the future. Use company newsletters, or establish a monthly IT Review to share updates.

If it was denied, don’t take it personally. It is great to show passion, but don’t confuse passion with what you are really trying to do: helping your organization make the best strategic and financial decision for them. Would it be gratifying to get a new Datacenter revamp through only to realize it was the financial tipping point of the organization just a few months later? Keep it all in perspective. Besides, some of the best purchasing decisions I’ve been involved with were the ones that were ultimately rejected, which gave solutions a chance to mature, and me an opportunity to find a different way to solve a problem.

Try doing your own proposal or presentation retrospective. What went well and what didn’t? Ask for feedback on how it went. You might be surprised at the responses you get.

You have the unique opportunity to be the technology advocate for the organization rather than simply a burden to the budget.  Do I get everything approved?  Of course I don’t, but a well prepared proposal will allow you, and your organization to make the smartest decisions possible, and help IT deliver great results.

Software that helps make life in IT a little easier


In IT, rarely is one truly developing something from the ground up.  In many ways, IT is about making solutions work – disjointed as they may be.  Large enterprise class solutions such as Email and messaging platforms, Content Management Systems, CRMs, Directory Services, and Security Solutions are all massively complex, even if they are well designed.  Those of us who are faced with the responsibility to “make it work” must possess the knack to be a deep-dive expert on any number of subjects, while having the big picture perspective of the IT Generalist.  It can be a complex mix of factors that determine how well solutions end up working out.  It’s usually an assorted mix of experience, technical and organizational skillsets, ingenuity, a lot of hard work, and a little bit of luck.  This is how the seasoned IT veteran separates themselves from those less experienced.

Then, every once in a while a piece of software comes along to make your life in IT easier.  Software that helps bridge the gaps that may exist in cross platform integration, connectivity, management, monitoring, or procedural tasks.  These are applications that don’t make deploying or managing complex systems easy.  They just make it a little easier.  Sometimes you stumble upon helpful applications like these almost by accident, as I have.  Others you knew of, but just never got around to trying out.  So I thought I’d take a brief time-out from my recent focus on all things related to Virtualization, and take a moment to share a few of those applications that are currently making my life in IT a little easier.  Some of these listed below are worthy of their own posts, which I hope to get around to.  It is a list that is neither complete, nor appropriate for every environment, and the importance of each really depends on how much you need it.  Only time will tell which solutions become obsolete, and which ones stand up over time.

Scribe Insight
This may be the best product you’ve never heard of.  If you ever need to transform, manipulate, or convert data from disparate systems, this is the product for you.  No, it’s not a “utility” but an enterprise class solution that demands a commitment in time to learn.  The results are stunning.  Data sources that had no earthly intention of being able to talk to another system can share the same data.   Example:  Your Sales Department uses a CRM running on SQL, but an ERP or Finance system runs on Oracle, and you need those records to interact on a transaction by transaction basis.  Scribe can do that, and much more.  Are those systems running on separate networks?  No problem.  Scribe simplifies the communication channels between autonomous systems.  It can insulate you from the complexity of convoluted database tables, and in some cases will completely eliminate the need for you to use an application’s SDK for data integration.  Database Administrators would love this tool, but its power extends well beyond just database integration.  It’s a true gem.

Tree Size Pro
You have a choice. Spend weeks and weeks trying to get PowerShell or vb scripts to analyze and manipulate your large flat-file storage contents, or spend a few bucks for Tree Size Pro.  This product delivers.  I’ve used it to generate reports on storage usage, and to automate flat file storage cleanup tasks.  When I think about what it would have taken to do it programmatically, I’d still be working on it.

OneNote
I’ve written about OneNote before, and how it can be utilized in IT.  Since that time, I’ve learned how to exploit it even more, and it goes with me everywhere.  It could be 10 times the price it is, and I’d still pay for it myself if needed.  It’s the pocket knife that should be in every Administrator’s tool chest.  The larger your team, the better it works.  Design documentation, troubleshooting active issues, project planning, research, etc.  It will help you become a better Administrator.

Likewise
This software allows for Unix, Linux, and Mac systems to authenticate against Active Directory.  It will allow for centralized management of these systems using Group Policy Objects in the same way you manage your Windows machines.  I was one of their first customers, and have been thrilled to see it mature over the years.  Their Open Source edition is OEM integrated into Linux Distributions such as Ubuntu, SUSE, and other products like VMware vSphere.  The free/Open source edition allows for you to join these systems to AD, while the commercial edition allows for centralized management.

PuTTY
If you need a solid Windows based SSH client to connect to your Linux clients, this is it.  One version (.56b) also supports the “Generic Security Services API” or GSSAPI.  This means that if your Linux machines are domain joined using Likewise, you can leverage Active Directory to log in to that Linux system, inheriting your credentials so that it is all passwordless.  Included with it is “plink” which gives you the ability to run a *nix command remotely from the Windows system.  Great for routines initiated from a Windows workstation.  “Pscp” is the PuTTY SCP client for getting files to and from that connected *nix system.

CionSystems AD Change Notifier
One of the interesting aspects of Active Directory is that there are object changes all the time, but as an Administrator, you have no way of knowing it. AD Change Notifier helps with that.  Simple, yet effective.  It sends you an email notification of object changes in AD.  You can select whether you want all types of changes (modifies, creates, deletes), as well as particular object types (users, machines, OUs, GPOs, etc.). You learn a little about how objects change in AD, and if you delegate AD responsibility, how and what is being changed in AD.

Wyse Pocket Cloud for the iPhone and iPad
Not unique in its purpose, but this RDP (and optionally PCoIP) client for the iPhone and iPad does what it’s supposed to do flawlessly.  Any app that can let you reboot a critical server from the golf course is good in my book.  Any app that lets you do that on the golf course, in front of the VP of the company is even better. (True story)

Long before the wonders of virtualization, there were byte-level disk imaging solutions to help you with your system protection and recovery needs.  This was like magic at the time, especially as it was becoming obvious file based backups of system partitions were never any good in the first place.  While it may not be needed in the Enterprise like it once was, there are still a few good use cases for it.  It’s also pretty handy to have on your home system, and every one of your neighbors’ home systems.  …Or the ones that know you’re in IT, and think you are their personalized technical support.

CionSystems AD Self Service
Yet another tool from CionSystems.  It takes the burden off of IT for user account related activities.  Does the user need to change their cell phone number or their home address?  Does a Department Manager need to change the Title of someone’s position?  AD Self Service can do this, without ever giving these end users privileges.  Updating AD related attributes is especially important if you use other solutions that leverage AD information (Exchange, SharePoint, CRM, etc.).  AD Self Service also allows for a secure way for the user to unlock their locked out account.  The more users you manage, the more this product will help take the burden off of IT.

SolarWinds Subnet Calculator
Some networking purists would flog me on the side of the head for recommending such a cheater app.  But the fact is, I need a quick and easy way to review subnetting options in order to make the right decision.  I can subnet manually much like I can do arithmetic manually.  I just choose not to.  I have other projects to allocate my time to, and I need the speed of a calculator to help me visit those options more quickly.  Subnet calculators like SolarWinds offer one other ability often overlooked: the ability to visualize the sizing of your subnetting.  You can create problems by making subnets too small, or too large.  Tools like this give a great visual representation of how you want to split networks.  It doesn’t excuse the requirement that every Administrator should fully understand how subnetting works.  (I still marvel at how brilliant IP subnetting is).  It’s that once they do, an Administrator should be able to use a tool to make it easier and faster for them to make the correct decision.
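For a quick sanity check without a GUI, the same kind of comparison can be sketched with Python’s standard `ipaddress` module.  This is not how the SolarWinds tool works, just a way to show the trade-off between the number of subnets and the hosts in each.  The 10.10.0.0/22 block and the candidate prefix lengths are made-up examples.

```python
import ipaddress

# Hypothetical address block to carve up.
network = ipaddress.ip_network("10.10.0.0/22")

# Compare a few candidate prefix lengths side by side.
for new_prefix in (23, 24, 26):
    subnets = list(network.subnets(new_prefix=new_prefix))
    # Subtract the network and broadcast addresses to get usable hosts.
    usable_hosts = subnets[0].num_addresses - 2
    print(
        f"/{new_prefix}: {len(subnets)} subnets, "
        f"{usable_hosts} usable hosts each, "
        f"first {subnets[0]}, last {subnets[-1]}"
    )
```

Splitting this /22 into /24s yields 4 subnets of 254 usable hosts, while /26s yield 16 subnets of 62.  Seeing the options listed together is the point, and it is the same reason the visual layout of a subnet calculator is so useful.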

FileZilla
For as long as FTP has been around, and ubiquitous as it may seem, one might conclude that it all works the same.  Not true.  FTP Servers will have their own unique behaviors, just as FTP clients will have their own quirks.  The firewalls that the FTP traffic passes through add another variable that can frustrate end users and Administrators alike.  FileZilla seems to offer the most flexibility when working with remote FTP servers, and is what I use to handle a variety of different FTP needs.  FileZilla won’t eliminate inherent complexities with the FTP protocol as it traverses multiple networks.  It just makes it easier to negotiate.
