If you had a storage bottleneck and wanted to resolve it with software in less than 2 hours
….or….
If you wanted to have a Swiss army knife of performance-enhancing techniques at your disposal, enabling you to leverage the most cost-effective and constantly evolving (think Moore's law) components of the commodity hardware industry
….then….
PernixData was built for you. Like any technology it is not without its share of challenges and constraints, but it does what it is advertised to do as easily as PernixData advertises it will!
Summary
Leveraging 24GB of RAM across 10 VMware hosts and 160 VMs, we were able to offload ~600 IOPS from our SAN at sub-millisecond (<0.1 ms) response times. Prior to PernixData, the read response times from our hybrid SAN were approaching 20 ms more and more frequently as our VMs kept growing in both size and number. The only traditional solution was going to be adding more SSD to the SAN or purchasing a new SAN altogether. However, using server-side RAM, PernixData was able to offload enough read IO to bring the SAN back within normal latency thresholds.
The PernixData solution provides the best of the hyper-convergence world (extremely low latency), the best of the commodity hardware world (low-cost SSD, PCI flash and multi-core CPUs) and the best of traditional SAN architecture (scaling capacity separately from scaling compute and memory) without the headaches: scaling performance is the challenge with modern SAN architecture, while marrying capacity, performance and compute is the challenge for hyper-convergence.
We will be testing server-side SSD as cache, rather than RAM, in part 2.
Background
The story of how we ended up testing PernixData starts off like many of our adventures exploring new technology: with me waiting longer than I should to make our next hardware investment and scrambling to find the best way to delay further. You see, the read latency degradation in our hybrid SAN was really a self-inflicted situation, because as we added more and more VMs to our environment over the past year (we are at 160 right now), read times slowly increased as the cache hit rate slowly decreased. We had the tools in place to make us aware of this trend, but we didn't want to invest in new hardware until we were full on capacity (all our disk space used up). Putting our situation into the context of the PernixData value proposition, you could say we wanted to hold off on hardware investments because we wanted to buy disk capacity and disk performance at the same time. Although this sounds like a simple example of us being cheap, there were a lot of compounding issues that made delaying a hardware purchase seem like a good judgment call. The first is that the large majority of our VMs could easily tolerate read latencies spiking into the 20 ms range. The other is that our best gauge of user experience is customers using hosted desktops (XenApp), and none of them were complaining. Not only were they not complaining, some commented on how fast their desktop applications were. So although our tools were telling us latency was slowly increasing, our users were telling us everything was fine. Our environment was providing a great user experience for the majority of our customers, and we really didn't think the rising read latencies were going to cause problems.
In fact, only one customer actually complained. That customer had a series of latency-sensitive databases (not MS SQL) that were not getting the performance they needed. This had always been a good customer and one we wanted to keep….so we looked at our existing bag of tricks and realized the only fix we knew required more hardware, more rack space and more power. In addition, purchasing new hardware meant that all the performance gains would be shared among customers who were already operating just fine.
During the troubleshooting process with the customer, we also realized that we didn't have a way of ruling out whether the customer's issue was really caused by IO latency. We reasoned that if we had a switch we could flip to give their virtual machines better performance, we could quickly rule latency in or out as the cause and work with the customer to fix their application/code…or, if adding performance solved the problem, we could charge them more for the higher tier of performance.
We realized we were missing a critical tool. We thought about looking at all-flash arrays, but they are expensive for general-purpose workloads, and if we wanted to go the new-hardware route we already had a hybrid SAN vendor that we liked and trusted and that gave us the right mix of performance and capacity for most of our needs.
So we looked at software to solve this problem, specifically software that allows distributed caching in RAM. We looked at Infinio (too immature), Atlantis (too many marketing spins made it seem complicated), SoftLayer (we don't trust SanDisk's long-term intentions) and PernixData*. PernixData was elegant (because of its use of a VMware kernel driver), powerful (because of the ability to use RAM or SSD or both), and simple (because it took less than 2 hours to solve our problem). Our first tests were with RAM caching and without the write-back feature, as we did not have any SSDs in our blades to test SSD as read cache. We will be adding SSDs in the next month and writing about our experience.
*Note that we did not do a complete evaluation of each of these products; my analysis is based on limited research and/or my own personal views after doing preliminary research.
Technical Tests
I have some very specific technical tests I like to run on my storage, and the exact tests depend on what I am trying to measure. In this case, I was not trying to test the performance of the SAN in general but rather to understand the benefits of PernixData. Accordingly, the test was designed to ensure I understood the benefit of PernixData as compared to the SAN. I was not testing the maximum performance of PernixData or anything like that, just trying to understand whether PernixData is faster than the SAN for a given workload.
IOMeter – rule 1 when testing IOMeter against a SAN is don't use the default IO file, as it doesn't represent realistic data patterns. However, as I said above, I wasn't testing the SAN. I wanted to see if PernixData serving from RAM would deliver better results than the same data set delivered from our SAN…and it did…by a lot. I set the queue depth to 50 for both tests and ran a 100% read test at 16KB blocks. The first screenshot shows the results from the VM accelerated by PernixData and the second shows the same VM with no acceleration; both tests used the same parameters on the same VM. You can see the PernixData-accelerated test shows much lower average latency, much lower maximum latency and much higher throughput. The reality is that the hybrid SAN can handle a lot more than this test shows, both in IOPS and throughput…however, it can't achieve its best with only one VM at a queue depth of 50, because the variance in latency prevents that single VM from consuming as much bandwidth as the same VM accelerated with PernixData (PernixData provides both a lower average latency and less variance in latency). To summarize, this test shows that RAM is orders of magnitude faster than a hybrid SAN. Technically speaking, it is RAM located on the same host as the guest. These are the fastest read times possible…faster than InfiniBand, faster than Violin Memory, faster than an SSD SAN, faster than hyper-convergence, faster than any DAS setup, faster than PCI flash. There is no reasonable way to get a read request to your guest faster than a cache hit in local RAM.
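To make the latency/throughput relationship concrete, here is a minimal back-of-the-envelope sketch (not part of the original tests) using Little's Law. The latency figures in it are illustrative assumptions, not the measured values from the screenshots.

```python
# A minimal sketch using Little's Law to show why lower, more consistent
# latency lets a single VM at a fixed queue depth push more throughput.
# The latency figures below are illustrative assumptions only.

QUEUE_DEPTH = 50      # outstanding IOs, as configured in IOMeter
BLOCK_SIZE_KB = 16    # 100% read test at 16KB blocks

def max_throughput(avg_latency_ms: float) -> tuple[float, float]:
    """Upper bound on (IOPS, MB/s) at a given average latency.

    Little's Law: outstanding IOs = IOPS * latency, so
    IOPS <= queue_depth / avg_latency (in seconds).
    """
    iops = QUEUE_DEPTH / (avg_latency_ms / 1000.0)
    mb_per_s = iops * BLOCK_SIZE_KB / 1024.0
    return iops, mb_per_s

for label, latency_ms in [("SAN read (assumed ~5 ms)", 5.0),
                          ("Local RAM cache hit (assumed ~0.1 ms)", 0.1)]:
    iops, mbps = max_throughput(latency_ms)
    print(f"{label}: up to ~{iops:,.0f} IOPS, ~{mbps:,.0f} MB/s")
```

Even with identical queue depth and block size, the ceiling on what one VM can drive scales directly with how fast (and how consistently) each IO completes, which is why the accelerated VM pulls ahead so dramatically.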
To further illustrate my point about RAM being extremely fast, take a look at the screenshot below from the PernixData GUI. It shows the RAM on each VM host achieving <0.1 ms response times. Note that 0.1 ms is 10 times faster than 1 ms and, in my experience, only achievable by reads from RAM. In this screenshot you can also see the cache hit rates and data eviction rates. However, I spread the RAM over such a large number of VMs that the hit rate can only get so high, so don't treat the hit rate in this screenshot as representative of a normal cache hit rate.
The screenshot below shows the latency over a 7-day period on our SAN. The red line marked with a P is when we turned on PernixData acceleration; latency is lower from that point on.
What I really like about PernixData is its ability to quickly give performance to all my VMs, some of my VMs or any combination. It is like having a dial to crank up performance as needed for whichever VMs need it. I can provide better-than-SSD performance by accelerating with RAM, SSD-level performance by accelerating with commodity SSD, or no acceleration at all.
Room for Improvement with PernixData
The compression algorithm brings our RAM cache from 328GB of physical capacity to 408GB of effective capacity…which is a low (bad) data reduction ratio of roughly 1.24x. PernixData seems proud of it because the compression algorithm they use doesn't impact latency, but in my mind it would be nice to get the 5x compression and deduplication gains that other vendors are touting in their all-flash arrays, even at the cost of a couple tenths of a millisecond. The screenshot below shows our compression.
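For what it's worth, here is the quick arithmetic behind that ratio, using only the figures quoted above; the 5x comparison is hypothetical, not something we measured.

```python
# Effective capacity divided by physical capacity gives the data reduction ratio.
physical_gb = 328   # RAM assigned to the cache
effective_gb = 408  # usable cache after PernixData's compression

print(f"Measured data reduction: {effective_gb / physical_gb:.2f}x")  # ~1.24x
print(f"At a hypothetical 5x: {physical_gb * 5} GB effective cache")  # 1640 GB
```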
There are many cases in which we use an iSCSI initiator from within the Windows guest. PernixData can't accelerate these volumes, as they are LUNs on the SAN that don't pass through the VMware kernel driver. Although this limitation stems from the very nature of PernixData's design, it is still a limitation. Since most of my heavy-hitting VMs are built this way, I now need to rethink how we build servers if we have a customer wanting the extreme performance PernixData is capable of.
PernixData doesn't allow populating the cache (RAM or flash) unless the guest is on the same host as the cache resource. Once the cache is populated, it can be read from other hosts, which is what allows vMotion to keep working, but you can't populate the cache unless you are on a host with a RAM or SSD resource. The general concept of a global pool of RAM cache replicated across hosts seems powerful for VDI and XenApp environments. With that type of setup it seems like my OS-level files would never hit disk again.
PernixData also doesn't let you put a resource from a host into more than one cache pool (called a cluster in PernixData). These two limitations mean you will have to think about how you set up your clusters in VMware. To give you an example, let's say I want to take 8GB of RAM on all my hosts and give it to my XenApp servers, and then take another 24GB of RAM on all my hosts and give it only to my heavy-hitter servers. This type of configuration is not possible. These two facts mean that PernixData may not be as transparent as you would like from a VMware administration point of view, as your PernixData design potentially needs to be understood by the people doing vMotion and provisioning servers. This is really the biggest flaw I see with this model. If I just add more SSD to my SAN and my technical staff don't need to think about a software layer, then maybe the capital purchase of the SAN is cheaper than the extra CPU cycles my people spend thinking about software?
The easiest way to get an ROI with PernixData is to use commodity hardware. This means buying non-HP/Dell SSD drives, which might not be comfortable for some people. You can use just the RAM cache feature like we did by buying extra RAM (which is relatively cheap for how well it performs), but if you want to use SSDs then you need to understand that branded SSDs are still expensive and be willing to purchase non-branded SSDs.
Conclusion
Hopefully documenting our experience helps some people wrap their heads around what is a great product. PernixData seems like the most cost-effective way to selectively deliver all-flash-like performance without an all-flash array. Not including the cost of the PernixData software, I can add 10TB of SSD flash to my environment for about $10,000 without taking up any more rack units or power.
What's Next
We haven't tested the SSD functionality of PernixData. Once we do, we will write about it in part 2.