Enmotus FuzeDrive has Arrived!

It’s been a long while with lots of heads-down time, but we finally launched FuzeDrive™ Server for Windows and Linux at Enmotus in the last week or so. We can finally come up for air!

Thanks so much to the team that worked so hard to get it out there, and to the early adopter customers that helped get it to production level. Going back to hands-on software development for the past 2-3 years has left me with a whole new level of respect for the folks who labor every day in front of a computer screen, and we’re all reminded at @enmotus just how much effort goes into getting a great product from inception to shipping.

I try not to promote any particular product at Tekinerd, but given this one is near and dear, I’ll do my best to discuss it objectively; be aware, though, that I’m biased. That said, I do use FuzeDrive as my primary boot volume (Pro version – still in beta) on my primary desktops (Samsung Pro SSD and Toshiba HDD), my laptop (WD Black 2 dual drive #WDBlack2) and my Windows home server, so it’s definitely part of my enthusiast stable here at home and has been for some time.

With that out of the way, let’s take a look at FuzeDrive. Here’s a summary of some of its features:

  • A fully automated mapping based tiering software solution for Windows Server and Linux (i.e. not a cache)
  • Operates at the full performance of the solid state disk (SSD) for reads and writes
  • Easily combine fast SSDs (SATA or PCIe) with hard drives (HDDs) to create a virtual disk drive that operates with the equivalent performance of a pure SSD
  • SSD capacity is additive – now you can use that 512G SSD without wasting any capacity or carving it up into pieces
  • File pinning integrated into Windows Explorer with simple right click to lock files to either SSD or HDD (or command line)
  • Visual mapping and activity monitor to see at a glance if your SSD is actually in the right place!

For the true techies out there, FuzeDrive is implemented as a kernel-mode virtual Storport driver on Windows and a modular block driver on Linux, with user-space apps for management, file pinning and the visual maps.

Let’s Start with Visibility

One of the first things we tout about FuzeDrive is the visibility that mapping, rather than caching, gives the end user. You can see exactly where the SSD is being applied versus the storage activity using a simple at-a-glance tool we call eLiveMonitor. As we talked to end users out there, their first reaction was “at last – I have a way to see how effectively my SSD is working for me”. An example for the PC I’m typing on right now is shown here.

The eLive tool is a great way to show how the software keeps track of which files (or more correctly, which of the files’ blocks) I’m using on my FuzeDrive virtual disk and makes sure the most active of them automatically live on the SSD. You can see instantly whether activity and SSD mapping align. This all happens automatically, and the software continuously adapts to new or different activity profiles: if a bunch of new files are copied to the FuzeDrive virtual disk and are then accessed more than the ones already on there, it simply swaps all or part of the inactive files or file chunks out to the HDD in favor of the active ones.
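
To make the idea a little more concrete, here is a minimal, hypothetical sketch (not Enmotus code; the region size, threshold and names are assumptions for illustration only) of how per-region access counters might drive promotion decisions and feed an activity display like eLive:

```python
from collections import defaultdict

REGION_SIZE = 4 * 1024 * 1024   # assumed 4 MB relocation unit, purely illustrative
PROMOTE_THRESHOLD = 1000        # assumed access count before a region is considered "hot"

access_counts = defaultdict(int)   # region index -> accesses seen so far
on_ssd = set()                     # regions currently mapped to the SSD tier

def record_io(offset_bytes, length_bytes):
    """Count accesses per fixed-size region -- the raw data behind an activity map."""
    first = offset_bytes // REGION_SIZE
    last = (offset_bytes + length_bytes - 1) // REGION_SIZE
    for region in range(first, last + 1):
        access_counts[region] += 1

def pick_promotions(max_moves=8):
    """Return the hottest HDD-resident regions that have crossed the threshold."""
    candidates = [(count, region) for region, count in access_counts.items()
                  if region not in on_ssd and count >= PROMOTE_THRESHOLD]
    candidates.sort(reverse=True)
    return [region for _, region in candidates[:max_moves]]
```

A real engine would also age the counters and demote cold regions; the point is simply that a map of counters like this is what an activity monitor can visualize.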

Versus Caching

So why is this any different from run-of-the-mill Intel SRT or other SSD caching tools, or even a hybrid drive, you may ask?

  • Performance is significantly faster. Mapping is much simpler and provides direct access to your SSD with minimal overhead: it takes only a few microseconds to turn virtual requests into actual physical disk IO requests and doesn’t require a cache hit-miss search algorithm to work out where data lives on the SSD (or HDD). (See the sketch just after this list.)
  • Capacity handling is significantly better for larger drives. Caching consumes capacity and adds CPU overhead to manage itself, which limits the practical size and hit rate of a cache approach. FuzeDrive takes less than 1% of the CPU, even with a 10TB fast tier! Furthermore, it doesn’t require SSD capacity to manage itself, only a small sliver (a few hundred MB to 1-2GB for an average configuration). Hence, if you take a 1TB SSD and combine it with a 6TB HDD, you get the full 7TB to use for applications.
  • Programmable options allow the behavior to be altered from its defaults, e.g. you can decide whether relocation to the SSD is driven by the size of requests (MBytes) or the number of requests (IOPs), by read traffic or read+write traffic, and so on. FuzeDrive gathers a lot of statistics it can use to make more intelligent relocation decisions, and while it’s best left in its automated mode, having that additional tweak capability often helps tune it for specific applications that need it.
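
As a rough illustration of the first bullet above, here is a hypothetical sketch of what virtual-to-physical mapping looks like: a direct table lookup per fixed-size chunk rather than a hit/miss search. The chunk size and table contents are made up for the example.

```python
CHUNK = 1024 * 1024  # assumed 1 MB mapping granularity, purely illustrative

# Virtual chunk index -> (device, physical chunk index). Every chunk is always
# mapped somewhere, so there is no concept of a "miss".
mapping = {
    0: ("ssd", 0),
    1: ("hdd", 41),
    2: ("ssd", 7),
}

def translate(virtual_offset):
    """O(1) table lookup: no search and no hit/miss path on the IO fast path."""
    vchunk, offset_in_chunk = divmod(virtual_offset, CHUNK)
    device, pchunk = mapping[vchunk]
    return device, pchunk * CHUNK + offset_in_chunk

print(translate(1 * CHUNK + 4096))   # -> ('hdd', physical offset 41 MB + 4 KB)
```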

[MicroTiering diagram]

As a word of caution, and as covered in my other blogs on tiering vs. caching, FuzeDrive relies more heavily on the reliability of the SSD because data is relocated, not copied. Today’s SSDs are much more reliable than they used to be, so this is no worse than a desktop or laptop that doesn’t have redundancy, but for servers it does matter. Hence, a RAID configuration for both the SSD and HDD tiers is definitely recommended (and supported) for those that require device-level redundancy. For more budget-conscious readers, you can also use the mirror capability of Microsoft Storage Spaces or Linux MD RAID.

Now for Performance

For a basic test, I used Crystal Mark 3.0.1 x64 on my Intel Z97 system to see how my Samsung 840 Pro “fuzed” with a Toshiba 3TB DT01ACA300 drive fared. I ran the test three times on a volume that had been installed and running as a bootable FuzeDrive for 3-4 weeks. Note, this is the first time I’ve run Crystal Mark on this system. I’m not a huge fan of benchmarks, as they are all biased toward a single media type and don’t always handle a mixed virtual media disk like a FuzeDrive well, since its performance differs depending on where you hit the volume. I was pleasantly surprised, however.

Here are the results.

Run 1

[CrystalMark Run 1 screenshot]

Run 2

[CrystalMark Run 2 screenshot]

Run 3

[CrystalMark Run 3 screenshot]

We see from the first run that I’m not quite getting the maximum rate out of my SSD, and I did notice the eLive monitor flash its SSD bars to indicate it was re-balancing the drive. By the second run, however, the software had figured out where the benchmark was hitting and the result moved much closer to the SSD’s native performance.

Where it really shines, however, is writes, which explains why overall system performance in Windows (which is always doing background writes) is so good. This also helps tremendously with virtual machines, which can end up being much more write intensive due to their large RAM cache flushes back to disk.

Where You Can Use FuzeDrive

I personally have FuzeDrive running on several home machines:

  • Dell Inspiron N4110 laptop with the WD Black 2 drive (2.5” 120G + 1TB dual drive #WDBlack2) using the Beta FuzeDrive Pro client version
  • Intel Z97 gaming/music station desktop machine using a Samsung 840Pro and Toshiba 3TB HDD and again the Beta FuzeDrive Pro client version
  • Windows Storage Server 2012 R2 box running the released FuzeDrive Server for Windows, which I’m experimenting with as a replacement for my existing Windows Home Server

Other test configurations running in the Tekinerd lab include an AMD-based Ubuntu 14.04 setup. I’m also experimenting with Windows Home Server 2010.

Apple Fusion Drive, At Last A True Tiered Drive

We’ve been talking about the merits of SSD caching vs. tiering for a while now, but at last someone else did it right. Enter Apple’s Fusion drive announced earlier today.

For those needing a technical refresher on caching vs. tiering, see my earlier blog posts SSD Tiering versus Caching: Part 2 and SSD Caching versus Tiering. Unfortunately, confusion still reigns in many of the blog posts we’ve seen today and most people still don’t get what the real difference is. Yes, they achieve something very similar, and no, Apple’s approach is not like Intel’s caching or anyone else’s caching. It’s an approach like the one we take at Enmotus with our MicroTiering, i.e. move the data you care about the most onto the faster SSD and make sure the stuff you don’t use a whole lot lives on the slower hard drive. For the technical folks out there, the main difference is that Apple does it at file level and we do it at block level.

For the less technical, the best everyday parallel I can think of is that caching is like staying in a hotel room, whereas tiering is more like living in a home. A hotel serves as a temporary place to hang out for a specific occasion, and you leave after a relatively short stay. If the hotel is busy, you generally have to move out after your pre-allotted time with no possibility of staying on. If it’s not so busy, you may be able to extend your stay. A tier, on the other hand, is more like a permanent home you’ve moved to: you live there, have all your belongings with you, and can stay a whole lot longer, often for the rest of your life. Furthermore, your home is a whole lot nearer the everyday places you go most of the time, and it will still be there and available to you every day for as long as you need it.

So what’s the fuss about? Apple are the first guys to openly admit and support the notion that tiering is better than caching where SSDs are concerned, which I and the team at Enmotus wholeheartedly agree with. Why? Here are a few reasons we’ve discovered along the way.

  • Tiering delivers superior performance as it provides the full bandwidth of the SSD to the user for the active portions of your data
  • Almost all of the SSD’s capacity is usable storage, not hidden as it is with caching. In other words, I paid for this SSD, it visibly adds to my overall capacity, and it’s not hidden behind a disk drive.
  • SSDs wear out the more you write to them. Caching increases the number of writes to the SSD and wears it out faster. Tiering is far friendlier: at Enmotus we’ve measured extremely low write overhead with tiering, on the order of 0.01% to 1% above and beyond what the host wrote over a few hours of usage (a quick worked example follows this list).
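
For the curious, the overhead figure in the last bullet is just the extra SSD writes caused by relocation, expressed as a fraction of what the host itself wrote. A quick back-of-the-envelope check with made-up numbers:

```python
host_writes_gb = 200.0        # assumed: data the host wrote during the measurement window
relocation_writes_gb = 0.5    # assumed: extra writes caused by tier relocations

overhead_pct = 100.0 * relocation_writes_gb / host_writes_gb
print(f"tiering write overhead: {overhead_pct:.2f}%")   # 0.25%, inside the 0.01%-1% range quoted
```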

Why should we care about tiering? Even the professional vendors of PCIe SSDs agree – skip caching and put all the user data on one large SSD if you can. The problem is, this is just too expensive for us everyday users who are not trying to replace large numbers of disk drives. Tiering strikes that perfect balance between performance and cost while still providing enough capacity to store those large videos, photos and games.

The future is definitely bright for SSD tiering.

SSDs and World of Warcraft

For enthusiasts and gamers, combining several hard drives in a striped RAID configuration has, over the past several years, provided reasonable benefits over a single hard drive. With my new SSD insight, however, it’s now apparent that RAID never really provided improvements sufficient to justify the additional cost, time and effort involved, other than for a few applications. It was really the HD video editing users who benefited the most: they were able to increase the size and performance of a single storage volume and merge two or more HD video streams in real time, something a single disk drive on its own just wasn’t fast enough to do, i.e. they were forced to RAID. Since then, compression techniques have improved to the point where you don’t need to RAID disks even for video in many situations (except for high-end professional rigs).

Why were there no real improvements observed from RAID in gaming applications? Talking to a few game developers, it’s pretty apparent that, given the slow response times of hard disks, a lot of development hours have been spent ensuring the software avoided hitting that slow rotating media at all costs, relying instead on RAM-based caching. Then along came games like Blizzard’s World of Warcraft and StarCraft 2 (ok – full disclosure, two of my favorites) that had to load a significant amount of data from the hard drives as you loaded or transitioned to new areas of the online world. This load time gets worse as customizations in the form of add-on utilities are added and as expansion packs with richer media content are introduced.

Then Along Came Enthusiast Class SSDs

SSDs changed the situation considerably. We are now able to get a simple add-on component that operates just like a regular hard drive, but with 5x or more the data streaming capability directly off the drive and 100x+ the random IO performance (more for higher-end models) versus a regular hard drive. They are also small, compact and lower power, just like a laptop drive, so there are no upgrades needed to your computer case or PSU as was typical in the RAID days.

We did a little experiment of our own to see how well SSDs behaved, using a Monster Digital LeMans series SSD, a SATA 6Gbps model based on the newer SandForce chipsets. The same Intel Core i5 system was set up with a regular hard drive for one configuration and then installed onto an SSD for the other. In both cases, the entire operating system and the World of Warcraft game were loaded onto the single storage device, with no other applications or antivirus installed other than Fraps for capturing video.

 

[Video: World of Warcraft game load time on hard disk, followed by SSD]

 

Configured with a 7200 RPM 500G disk drive, the time taken to load World of Warcraft with a significant number of add-on helper applications was nearly 2 minutes 30 seconds, including the time taken for all the neighboring characters and players to load. The exact same configuration on the SSD took a little over 7 seconds. To see how much of this was the SSD itself versus the 6G SATA interface (the Intel motherboard we used had two AHCI SATA 6G ports), we repeated the test with a 3G SATA device and it took only around 8 seconds, i.e. unless your application really needs that extra second or so, 3G SATA works pretty well too.

The system configuration we used for this test was:

  • Intel Core i5 3.1GHz
  • Asus P8 Z68-V/Gen3 Motherboard
  • 4G RAM
  • AMD Radeon HD 6770 1G Graphics Card
  • WD 500G 7200 RPM SATA drive for HDD based testing
  • Monster Digital LeMans 480G SSD SATA 6G drive for SSD based testing

Conclusion

If you can afford it, take a serious look at an SSD. It will, without a doubt, speed up your system in gaming applications, and the system just works better overall. Professional gamers have already made the transition. The only thing to be concerned about is reinstalling your operating system and applications to get the maximum possible performance, which is fine for most enthusiasts; for the adventurous, there are solutions coming to help address this, or you can try migrating your existing system over using some of the software tools available.

Overall, very happy with the change to SSD.

SSD Tiering versus Caching: Part 2

A while back I wrote about some of the differences between caching and tiering when using solid state disk (SSD) drives in a PC or server.

Having just returned from the 2011 Flash Memory Summit in Santa Clara, I feel compelled to add some additional color around the topic given the level of confusion clearly evident at the show. Also, I’d like to blatantly plug an upcoming evolution in tiering called MicroTiering, from our own company, Enmotus, which emerged from stealth at the show.

The simplest high-level clarification that emerged from the show, I’m glad to say, matched what we described in our earlier blog (SSD Caching versus Tiering): caching makes a copy of frequently accessed data from a hard drive and places it in the SSD for future reads, whereas tiering moves the data permanently to the SSD so it is no longer stored on the hard drive. Caching speeds up reads only at this point, using a modified caching algorithm to account for SSD behavior versus RAM-based schemes, whereas tiering simply maps the host reads and writes to the appropriate storage tier with no additional processing overhead. So with tiering you get the write advantage and, of lesser benefit, the incremental capacity of the SSD, which becomes available to the host as usable storage (minus some minor overhead to keep track of the mapping tables).

Why the confusion? One RAID vendor in particular, along with several caching companies, is calling its direct attached storage (DAS) caching solution “tiering”, even though it is only caching the data to speed up reads and the data isn’t moved. Sure, write-based caching is coming, but it’s still fundamentally a copy of the data that lives on the hard drive, not a move, and SSD caching algorithms still apply.

Where Caching is Deployed

SSD caching has a strong and viable place in the world of storage and computing at many levels, so it’s not a case of tiering versus caching, but more a question of when to use either or both. Also, caching is relatively inexpensive and will most likely end up bundled for free with the SSD you purchase for PC desktop or Windows applications, simply because this is how all caching ends up, i.e. “free” with some piece of hardware, an SSD in this case. A case in point is Intel’s Matrix RAID, which has now been enhanced with its own caching scheme called Smart Response Technology (SRT), currently available for Z68-flavor motherboards and systems.

In the broader sense, we are now seeing SSD caching deployed in a number of environments:

  • Desktops (eventually notebooks with both SSD and hard drives) bundled with SSDs or as standalone software e.g. Intel SRT and Nvelo (typically Windows only)
  • Server host software based caching e.g. FusionIO, IOturbine, Velobit (Windows and VMware)
  • Hardware PCIe adapter based server RAID SSD caching e.g. LSI’s CacheCade (most operating systems)
  • SAN based SSD caching software, appliances or modules within disk arrays e.g. Oracle’s ZFS caching schemes (disk arrays) or specialist appliances that transparently cache data into SSDs in the SAN network.

Where Data Tiering is Deployed

Tiering is still fundamentally a shared, SAN-based storage technology used with large data sets. In its current form, it’s really an automated way to move data between slow, inexpensive bulk storage (e.g. SATA drives, possibly even tape drives) and fast, expensive storage based on its frequency of access or “demand”. Why? So data managers can keep expensive storage costs to a minimum by taking advantage of the fact that typically less than 20% of data is accessed in any given period of time. YouTube is a perfect example. You don’t want to keep a newly uploaded video stored on a large SSD disk array just in case it becomes highly popular relative to the numerous other uploads. Tiering identifies that the file (or more correctly, a file’s associated low-level storage ‘blocks’) is starting to increase in popularity and moves it up to the fast storage for you automatically. Once on the higher-performance storage, it can handle a significantly higher level of hits without causing excessive end-user delays and the infamous video player ‘spinning wheel’. Once demand dies down, it moves the data back, making way for other content that may be on the rise in popularity.

Tiering Operates Like A Human Brain

The thing I like about tiering is that it’s more like how we think as humans, i.e. pattern recognition over a large data set, with an almost automatic and instant response to a trend, rather than looking at independent and much smaller slices of data as caching does. A tiering algorithm observes data access patterns on the fly, determines how often and, more importantly, what type of access is going on, and adapts accordingly. For example, it can determine whether an access pattern is random or sequential and allocate the data to the right type of storage media based on its characteristics. A great “big iron” example is EMC’s FAST, or the now defunct Atrato.
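
As a hypothetical sketch of that kind of pattern check (not EMC’s or anyone else’s actual algorithm), you can classify a window of recent requests as sequential or random by counting how many start exactly where the previous one ended; the window size and ratio threshold are assumptions:

```python
def classify_pattern(requests, sequential_ratio=0.8):
    """requests: list of (offset, length) tuples in arrival order."""
    if len(requests) < 2:
        return "unknown"
    sequential_hits = 0
    for (prev_off, prev_len), (off, _) in zip(requests, requests[1:]):
        if off == prev_off + prev_len:   # next request starts where the last one ended
            sequential_hits += 1
    ratio = sequential_hits / (len(requests) - 1)
    return "sequential" if ratio >= sequential_ratio else "random"

# A streaming-style read pattern vs. scattered, database-style reads
print(classify_pattern([(0, 64), (64, 64), (128, 64), (192, 64)]))   # -> sequential
print(classify_pattern([(0, 4), (9000, 4), (12, 4), (777, 4)]))      # -> random
```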

Tiering can also scale better across multiple types of storage. Whereas caching is limited to RAM, a single SSD, or being tied to a RAID adapter, tiering can operate on multiple tiers of storage from a much broader set, up to and including cloud storage (i.e. a very slow tier), for example.

MicroTiering

At the show, I introduced the term MicroTiering, one of the solutions our company Enmotus will be providing in the near future. MicroTiering is essentially a direct attached storage version of its SAN cousin, applied to the much smaller subset of storage that sits inside the server itself. It’s a hardware-accelerated approach to tiering at the DAS level that doesn’t tax the host CPU and enables a much broader set of operating systems and hypervisors to be supported, versus the narrow host-based SSD caching offerings we see today that are confined to just a few environments.

Tiering and Caching Together

The two technologies are not mutually exclusive. In fact, it is more than likely that tiering and caching involving SSDs will be deployed together, as they provide different benefits. For example, caching tends to favor less expensive MLC SSDs: the data is only copied and the cache handles mostly read-only, transient or non-critical data, so loss of the SSD cache itself is non-critical. It’s also the easiest way to add a very fast, direct attached SSD cache to your server, provided your operating system or VM environment can handle it.

On the other hand, since tiering relocates the data to the SSD, SLC is preferable for its higher read and write performance, higher resilience and better data retention characteristics. In the case of DAS-based tiering solutions like MicroTiering, tiering may also be better suited to virtual machine environments and databases due to its inherent and simpler write advantage, low-to-zero host software layers and VMware’s tendency to shift the read-write balance closer to 50/50.

What’s for sure is that there is lots of innovation and excitement still going on in this space, with lots more to come.

Snapshot and Full Image Backup for Windows Home Server

One of the shortcomings of the original 32-bit Windows Home Server (WHS) for me was the lack of any built-in tools to back up the primary WHS boot drive. While using a RAID 1 boot drive would protect me from a disk drive crash, it didn’t offer the capability to “rewind” back to a former backup copy to fix a system drive corruption issue. To compound the problem, my favorite Windows disk image backup utility doesn’t support server-based operating systems, presumably because the vendor has a higher cost, enterprise-class version sold into the classic server markets.

So for most of us, the problem remains: how to easily back up and restore the primary boot volume of the WHS server, or upgrade the entire server hardware without losing the current WHS configuration. The good news is that backup and restore of a primary boot drive becomes significantly easier when you are running WHS as a virtual machine. Better still, these capabilities come for free if you implement WHS on VMware’s ESXi hypervisor, as illustrated in At Home with ESXi posted earlier.

Additional Backup Options Provided by a WHS ESXi Setup

With VMware ESXi, you automatically get two ways to create backups of your primary home server boot drive via the vSphere utility run from your regular PC:

  • Snapshot – takes a point-in-time copy of the complete WHS virtual machine on the same physical drive as the primary WHS boot image. Total time is around 3 minutes, and you can take as many as you have room for on the disk.
  • Full Image Copy – the complete WHS virtual machine is copied over the network to your local PC or a network drive. Total time will be several hours depending on your network speed (70GBytes takes a while to copy across a home network).

Read more at System Drive Backup Options for Windows Home Server on VMware ESXi.

Loving Personal Cloud Storage with SugarSync

I love simple-to-use technology, being a great advocate of the KISS principle. I could write a thesis on how many times I’ve seen engineers and marketers create funky steps in products to do the most basic things, usually because not enough attention was paid before the design stage to figuring out the multiple ways different end users might use a product out of the box.

So I was delighted when, within minutes, I was using my first real cloud storage application and syncing files on my laptop to a virtual storage drive at cloud storage provider SugarSync. The first 5G is free, so there’s plenty of room for an average user’s photos and music files as a starter. My first step was selecting my non-personal files, such as my downloaded iTunes music directory (warnings were provided that this was for backup purposes only) and photos. The easy-to-set-up configuration utility started syncing my photo and music files almost instantly after I identified which directories to use, and it took a few hours for everything to be fully copied over via my cable modem connection. The SugarSync folks provide a nice little task bar popup menu and applet to access the online files and see how far your sync has progressed if you get curious. Once complete, it only copies over changes, additions or deletions, helping keep the ongoing traffic to an absolute minimum.

I then downloaded the iPhone SugarSync app. That’s when I got really excited. I could clearly see my computer’s files and, more importantly, it was very simple to get to them and display the photos directly on my iPhone. What I liked even better was that I could select a file and send it to someone while out roaming about. For all intents and purposes, it was as though I was emailing it directly from my laptop. Very handy if you have specific documents that you may want to send someone instantly while travelling. Adding another PC was also easy: from the SugarSync web site, after logging in, you can download the sync application to connect the new PC to the cloud storage drive I’d created earlier, and add new share areas from that PC as well.

Viewing which files were up on the cloud drive was just a matter of clicking on the installed task bar applet (popup menu shown above) and selecting the View Files in SugarSync File Manager option, which opens up their file-manager-like application. From there, it was pretty simple to see what you had up on the drive, as well as select files to share or send to others if required.

[SugarSync File Manager screenshot]

Alternatively, you can get to a similar view by visiting the web site and logging in from any PC. Nice if you need to use someone else’s computer to get to your data.

So far, I really do recommend the SugarSync approach for those of you looking for a simple file backup and sharing mechanism for photos and music. The free 5G certainly helps draw you in, especially as they don’t ask for your credit card details for that “just in case you go over” situation – a nice change. I’m not quite there on storing more personal or sensitive data, as I’m still in wait-and-see mode with respect to trusting that my data is stored securely. We just don’t have enough history to know how secure this type of approach is yet, but hopefully time will demonstrate it.

That said, nice job SugarSync on a great and easy-to-use product!

Will Traditional Disk Array Vendors Survive?

The more you look at how SAN disk array systems are evolving, the more you start to wonder how traditional disk array vendors can survive long term without a significant change in their business model. What has been a business primarily focused on building solid, high-availability RAID arrays is now facing a serious commoditization challenge as standard Intel-compatible server hardware is adapted to storage applications. Furthermore, the recent or pending acquisitions of 3PAR, Isilon and now Compellent put significant pressure on the last few independent disk array vendors to start beefing up their virtualization or unified storage play.

There are a few industry trends that should raise concerns for traditional players:

  • There are fewer margin dollars available for incremental R&D for pure RAID disk arrays, not helped by the continuing drop in disk drive prices
  • Standard Intel-compatible, off-the-shelf CPUs, memory, chipsets and RAID components developed for higher volume servers and workstations are now being utilized in storage platforms running standard operating systems or open storage software based on Linux or ZFS (e.g. OpenFiler)
  • We continue to see new hybrid storage-server hardware product offerings from white box equipment builders, including dual controller SBB solutions, that facilitate easier disk array and storage virtualization integration
  • SSD device performance increases are leading to newer approaches to storage systems that may not lend themselves to traditional disk array controllers and slow speed SANs
  • We are seeing many storage management features and thin provisioning making their way into server hypervisors, in many cases as “free” standard components
  • Large scale data intensive environments, such as those used by search engines, are favoring non-RAID, non-SAN architectures (e.g. Hadoop) utilizing low cost networked PC technology and Linux (JBOS – just a bunch of servers)

Does this spell the end for the traditional RAID array guys? If you read much of the press around RAID today, the outlook does not look good. However, in the short term there is still business to be had. The good news is that the storage equipment market, along with infrastructure upgrades in traditional corporate markets, takes many years to transition to new technologies. You just cannot switch out storage when something new comes along. It’s like trying to upgrade the train carriages while the train is moving: not practical, dangerous, and liable to upset customers if you try it. You have to migrate non-disruptively to any new storage system to avoid significant service or application disruption. But this protection will not last forever.

They literally have to start thinking outside the box. What is clear is that these traditional disk array vendors really need to take note of the highly successful Dell EqualLogic effect, i.e. storage that is scalable at the network system level and simple and cost-effective from an operating cost standpoint, as opposed to the device-centric model they currently use. The market has certainly rewarded this virtualized, scalable, networked approach in spades, given Dell’s reported revenue successes with this product family over the past few years, even though it is more costly on a dollar-per-gigabyte basis.

Some pure-play disk array vendors or divisions, such as LSI, Dot Hill and Xyratex (who OEM their products to larger companies such as Dell, HP and IBM), are already migrating upward by acquiring or building storage virtualization appliance companies, but it will take time for them to adjust to the new world, as developing network-level, virtualized storage requires a significant shift in the types of engineers involved. Then there is the issue of doubling investment for a while as you sustain one approach and develop another. Selling styles also change, as system-level offerings need to target a different type of buyer in many cases. In their favor, they have the robustness, service and know-how advantage, i.e. they know how to make, sell and support storage extremely well.

It wouldn’t be the first time an industry had to reinvent itself in order to survive and it will be interesting to see how it pans out over the next several years, and most importantly, who survives until the next round of commoditization.

SSD Caching versus Tiering

In some recent discussions, I sensed there is some confusion around solid state disk (SSD) storage used as a storage tier vs. a cache. While there are some similarities, and both are intended to achieve the same end result, i.e. acceleration of data accesses from slower storage, there are some definite differences which I thought I’d try to clarify here. This is my working viewpoint, so please do post comments if you feel differently.

Firstly, SSD caching is temporary storage of data in an SSD cache, whereas true data tiering is a semi-permanent movement of data to or from an SSD storage tier. Both are based on algorithms or policies that ultimately result in data being copied to, or removed from, SSDs. To clarify further: if you were to unplug or remove your SSDs, in the caching case the user data would still be stored in the primary storage behind the SSD cache and would still be served from the original source (albeit more slowly), whereas in a tiered environment the user data (and capacity) would no longer be available, because the data was physically moved to the SSDs and most likely removed from the original storage tier.

Another subtle difference between caching and tiering is whether the SSD capacity is visible or not. In cached mode, the SSD capacity is totally invisible, i.e. the end application simply sees the data accessed much faster if it has been accessed before and is still in the cache store (a cache hit). So if a 100G SSD cache exists in a system with, say, 4TB of hard disk drive (HDD) storage, the total capacity is still only 4TB, i.e. that of the hard disk array: 100% of the data always lives on the 4TB, with only copies of it held in the SSD cache according to the caching algorithm used. In a true data tiering setup using SSDs, the total storage is 4.1TB, and though this may be presented to a host computer as one large virtual storage device, part of the data exists on the SSD and the remainder on the hard disk storage. Typically such a small amount of SSD would not be implemented as a dedicated tier, but you get the idea if, say, 1TB of SSD storage were used in a storage area network with 400TB of hard drive based storage, creating 401TB of usable capacity.
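
The capacity difference is easy to put into numbers; a trivial worked example using the figures above:

```python
hdd_tb = 4.0
ssd_tb = 0.1   # the 100G device, used either as a cache or as a tier

usable_with_cache = hdd_tb            # cache capacity is hidden: it only ever holds copies
usable_with_tier = hdd_tb + ssd_tb    # tier capacity is additive (minus a small mapping overhead)

print(usable_with_cache, usable_with_tier)   # 4.0 TB vs 4.1 TB
```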

So how does data make it into a cache versus a tier? Cache controllers and block-level automated data tiering controllers both monitor and operate on statistics gathered from the stream of storage commands, in particular the addresses of the storage blocks being accessed.

SSD Caching Simplified

Caching models typically employ a lookup table keyed on the block-level address (or range of blocks) to establish whether the data the host is requesting has been accessed before and may already exist in the SSD cache. Data typically moves into an SSD cache more quickly than with tiering, where more analysis of the longer-term trend is employed, which can span hours if not days in some cases. Unlike DRAM-based caches, however, where it is possible to cache all reads, a little more care and time is taken with SSDs to avoid excessive writing to the cache, given the finite number of writes an SSD can tolerate. Most engines use some form of “hot-spot” detection algorithm to identify frequently accessed regions of storage and move data into the cache area only once a definite frequent-access trend has been established.

Traditional caching involves one of several classic caching algorithms, which result in either read-only or read-and-write caching. Cache algorithms and approaches vary by vendor and dictate how a read from the HDD storage results in a copy of the original data entering the cache table, and how long it “lives” in the cache itself. Subsequent reads of that same data, whose original location is on the hard drive, can now be served from the SSD cache instead of the slower HDD, i.e. a cache hit (determined using an address lookup in the cache tables). If this is the first time data is being accessed from a specific location on the hard drive(s), then the data must first be read from the slower drives and a copy made in the SSD cache if the hot-spot checking algorithm deems it necessary (triggered by the cache miss).
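
Here is a minimal, hypothetical read-path sketch of that lookup-then-admit flow (not any particular vendor’s algorithm): a hit is served from the SSD copy, a miss goes to the HDD and is only admitted to the cache once a crude hot-spot counter crosses a threshold. The threshold and data structures are assumptions.

```python
cache = {}          # block address -> cached data (the copy that lives on the SSD)
miss_counts = {}    # block address -> misses seen so far, a very crude hot-spot detector
ADMIT_AFTER = 3     # assumed: admit a block to the cache after this many misses

def read_block(addr, read_from_hdd):
    """read_from_hdd is a callable that fetches the block from the slow tier."""
    if addr in cache:                       # cache hit: serve from the SSD copy
        return cache[addr]
    data = read_from_hdd(addr)              # cache miss: go to the hard drive
    miss_counts[addr] = miss_counts.get(addr, 0) + 1
    if miss_counts[addr] >= ADMIT_AFTER:    # hot enough: keep a copy on the SSD
        cache[addr] = data
    return data
```

A production cache would also bound its size and evict stale entries, but the hit/miss decision point is the same.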

Caching algorithms often use more sophisticated models to pre-fetch data based on a trend and store it in the cache if there is a high probability it will be accessed soon, e.g. in sequential video streaming or VMware virtual machine migrations, where it is beneficial to pull data from the next sequential addresses into the cache at the same time as the initial access. After some period of time, or when new data needs to displace older or stale data in the cache, a cache flush cleans out the old data. This may also be triggered by the hot-spot detection logic determining that the data is now “cold”.

The measure of a good cache is how many hits it gets versus misses. If data access is very random and scattered over the entire addressable range of storage, with infrequent accesses back to the same locations, then the cache’s effectiveness is significantly lower and sometimes detrimental to overall performance, as there is an overhead in attempting to locate data in the cache on every access.

SSD Auto Tiering Basics

An automated data tiering controller treats the SSD and HDDs as two separate physical islands of storage, even if presented to the host application (and hence the user) as one large contiguous storage pool (a virtual disk). A statistics gathering or scanning engine collects data over time and looks for data access patterns and trends that match a pre-defined set of policies or conditions. These engines use a mix of algorithms and rules that indicate how and when a particular block (or group of blocks) of storage is to be migrated or moved.

The simplest “caching-like” approach used by a data tiering controller is based on frequency of access. For example, it may monitor data blocks being accessed from the hard drives, and if a block exceeds a pre-defined number of accesses per hour “N” for a period of time “T”, a rule may be employed that says: when N>1000 AND T>60 minutes, move the data up to the next logical tier. So if data is being accessed a lot from the hard drive and there are only two tiers defined, SSD being the faster of the two, the data will be copied to the SSD tier (i.e. promoted) and the virtual address map that converts real-time host addresses to physical ones is updated to point to the new location in SSD storage. All of this, of course, happens behind a virtual interface, so the host itself has no idea the storage just moved to a new physical location. Depending on the tiering algorithm and vendor, the data may be discarded on the old tier to free up capacity. The converse is also true: if data living on the SSD tier is infrequently accessed, it may be demoted to the HDD tier based on similar rules.
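
That N-and-T rule can be written down almost directly; a hypothetical, simplified sliding-window reading of it (a real engine would keep compact counters rather than timestamp lists):

```python
import time

N_ACCESSES = 1000     # "N" from the example above
T_SECONDS = 60 * 60   # "T" from the example above: a 60-minute window

access_times = {}     # block range -> recent access timestamps

def note_access(block_range, now=None):
    now = time.time() if now is None else now
    history = access_times.setdefault(block_range, [])
    history.append(now)
    # Forget anything older than the window so the count reflects the last T seconds only.
    access_times[block_range] = [t for t in history if now - t <= T_SECONDS]

def should_promote(block_range):
    """Promote to the SSD tier once more than N accesses land inside the T-second window."""
    return len(access_times.get(block_range, [])) > N_ACCESSES
```

Demotion is just the mirror image: a range whose window empties out becomes a candidate to move back to the HDD tier.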

More sophisticated tiering models exist of course, some that work at file layers and look at the specific data or file metadata to make more intelligent decisions about what to do with data.

Where is SSD Caching or Tiering Applied?

Typically, SSD caching is implemented as a single SATA or PCIe flash storage device along with operating system driver-layer software in a direct attached storage (DAS) environment to speed up Windows or other operating system accesses. In much larger data center storage area networks (SANs) and cloud server-storage environments, there are an increasing number of dedicated rackmount SSD storage units that can act as a transparent cache at LUN level, where the caching is all done in the storage network layer, again invisible to the host computer. The benefit of cache-based systems is that they can be added transparently and often non-disruptively (other than the initial install). Unlike with tiering, there is no need to set up dedicated pools or tiers of storage, i.e. they can be overlaid on top of an existing storage setup.

Tiering is more often found in larger storage area network environments, with several disk array and storage appliance vendors offering the capability to tier between different disk arrays based on their media type or configuration. Larger tiered systems often also use other backup storage media such as tape or virtual tape systems. Automated tiering can substantially reduce the management overhead associated with backup and archival of large amounts of data by fully automating the movement process, or help meet the data accessibility requirements of government regulations. In many cases, it is possible to tier data transparently between different media types within the same physical disk array, e.g. a few SSD drives in RAID 1 or 10, 4-6 SAS drives in RAID 10 and 6-12 SATA drives in a RAID group, i.e. three distinct tiers of storage. Distributed or virtualized storage environments also offer either manual or automated tiering mechanisms that work within their proprietary environments. At the other end of the spectrum, file volume manager and storage virtualization solutions running on the host or in a dedicated appliance allow IT managers to organize existing disk array devices of different types and vendors and sort them into tiers. This is typically a process that requires a reasonable amount of planning and often some disruption, but it can yield tremendous benefits once deployed.

© 2010-2014 TekiNerd™ All Rights Reserved