Auto-tiering is intended to help systems migrate workloads across different tiers of disks, making more efficient use of limited resources. The premise is that the most active data is moved to the highest-performance media, and the least active data is moved to the lowest-cost, highest-capacity drives.
Auto-tiering of file systems used to be limited to moving whole volumes. This was not a granular approach, and it generally required a lot of expensive high-performance infrastructure to accommodate these larger blocks of data.
Today an increasing number of products support a more granular form of auto-tiering, ranging from sub-LUN tiering to auto-tiering for files based on size, date or access frequency. The logic is the same: older files and files that are infrequently accessed are moved to inexpensive disks in second-tier storage, while very active files are retained on high-performance media to ensure accessibility and performance. However, because the system’s IO analytics and management tools are focused on individual files or blocks of data rather than volumes, the effect is an even more efficient use of resources and a more fully optimized system.
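As an illustration of that logic, here is a minimal sketch of a file-level tiering decision based on age and access frequency. The names (`FileStats`, `choose_tier`) and the thresholds are assumptions made for this example and are not taken from any particular product; a size criterion could be added in the same way.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class FileStats:
    """Hypothetical per-file statistics gathered by the IO analytics."""
    last_access_epoch: float   # time of last access, in seconds since the epoch
    reads_last_30_days: int    # how often the file was read recently


def choose_tier(stats: FileStats, now_epoch: float,
                max_idle_days: int = 90, min_reads: int = 5) -> str:
    """Return "tier1" (fast media) or "tier2" (low-cost, high-capacity disks).

    The thresholds are illustrative defaults, not recommended values.
    """
    idle_days = (now_epoch - stats.last_access_epoch) / SECONDS_PER_DAY
    if idle_days > max_idle_days:
        return "tier2"   # old file: move to inexpensive disks
    if stats.reads_last_30_days < min_reads:
        return "tier2"   # rarely accessed file: move to inexpensive disks
    return "tier1"       # active file: keep on high-performance media
```

Because the decision is made per file rather than per volume, only the files that are actually active occupy the expensive tier.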
This file-based optimization is destined to become increasingly important as accelerating growth of unstructured data puts ever greater pressure on data centers to optimize and cut costs. Unstructured data can include very large files or objects, such as video and images, as well as an enormous number of smaller files, such as email and logs. Scality RING is a cloud storage infrastructure solution that is particularly well adapted for unstructured data. By introducing object-based auto-tiering, it goes a step further than block-level or file-based auto-tiering.
Object-based auto-tiering is optimized for large unstructured data files. Scality RING automatically splits large objects into smaller parts, without requiring any prior data sharding by the customer or the application. It writes each ‘part object’ to servers in the first tier of the storage system. In addition, it can create a specified number of replicas for data resilience.
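The splitting and replication steps can be pictured with a short sketch. This is not Scality's implementation: the function names, the fixed part size and the round-robin placement are all assumptions chosen to keep the example small.

```python
def split_into_parts(data: bytes, part_size: int) -> list[bytes]:
    """Cut an object into consecutive parts of at most `part_size` bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def place_with_replicas(part_count: int, servers: list[str],
                        replicas: int) -> dict[int, list[str]]:
    """Assign each part to `replicas` distinct first-tier servers.

    Round-robin placement is an assumption made for illustration only.
    """
    if not 1 <= replicas <= len(servers):
        raise ValueError("replicas must be between 1 and the number of servers")
    return {
        index: [servers[(index + r) % len(servers)] for r in range(replicas)]
        for index in range(part_count)
    }
```

For example, a 10-byte object with a part size of 4 yields three parts, and with two replicas across three servers every part ends up on two different servers.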
The data is then moved to the second tier according to customizable criteria. Thanks to Scality RING’s parallel processing, the different parts of an object are uploaded simultaneously, avoiding the long latency generally seen with large files and serial access.
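The benefit of moving parts in parallel rather than one after another can be sketched with a thread pool. The `upload_part` function below is a hypothetical stand-in for a network write (it simply stores the part in a dictionary), and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_part(index: int, part: bytes, store: dict) -> int:
    """Hypothetical stand-in for a network write: saves the part in `store`."""
    store[index] = part
    return len(part)


def upload_parts_in_parallel(parts: list[bytes], store: dict,
                             max_workers: int = 4) -> int:
    """Upload all parts concurrently and return the total bytes written."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(upload_part, i, part, store)
                   for i, part in enumerate(parts)]
        # result() waits for each upload and re-raises any error it hit.
        return sum(future.result() for future in futures)
```

With real network writes, the total transfer time approaches that of the slowest part instead of the sum of all parts, which is the effect described above.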
In addition, Scality RING avoids common bottlenecks and dependencies. One of the common disadvantages of other auto-tiering systems is that objects are generally tracked in a centralized system, which creates a potential single point of failure. Instead, Scality relies on a dispersed key-value system to locate the object files. The keys are not stored; they are determined by calculated hashes that are completely independent of the object content. The core of the Scality solution is a logical layer of intelligence that links storage nodes together and ensures a balanced distribution of all the object files. The absence of centralization removes the potential for some common sources of system disruption.
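One widely used way to locate objects without a central index is consistent hashing, sketched below. This is a generic illustration rather than a description of RING's internals: hashing the object's name with SHA-1 and the `HashRing` class are assumptions made for this example.

```python
import bisect
import hashlib


def key_for(name: str) -> int:
    """Derive a key from the object's name, not from its content (assumption)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Minimal consistent-hash ring: every client computes the same location."""

    def __init__(self, nodes: list[str]):
        if not nodes:
            raise ValueError("at least one node is required")
        self._points = sorted((key_for(node), node) for node in nodes)
        self._keys = [point for point, _ in self._points]

    def locate(self, name: str) -> str:
        """Return the first node at or after the object's key, wrapping around."""
        index = bisect.bisect_right(self._keys, key_for(name)) % len(self._keys)
        return self._points[index][1]
```

Because the location is recomputed from the key on every request, no lookup table has to be stored or kept consistent, and no single server has to answer every "where is this object?" query.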
Besides its optimization for large unstructured files, the logical architecture of Scality RING is also well suited for huge numbers of smaller files. Offloading these files to low-cost second-tier storage requires only the addition of inexpensive scale-out hardware, even as the number of files grows to previously unimaginable proportions.
Scality RING delivers on the promise of unlimited scalability and can keep honoring a growing number of data requests. RING’s implementation of auto-tiering ensures that the costs of scale-out operations remain marginal, and that data resilience and availability are uncompromised at scale.
Monique Shefer is a technology analyst and strategic consultant to the software and software-as-a-service industries. Shefer is currently working for Scality, a storage system pioneer and the developer of RING Organic Storage.