Until recently, choosing a storage system meant making a complex trade-off between cost, scalability, and performance. Meanwhile, the volume of data collected by enterprises continues to grow: according to IDC, between 2018 and 2023 the average annual growth rate will exceed 25%, reaching a global datasphere of approximately 102.6 zettabytes. This relentless increase is placing ever greater demands on storage systems as companies endeavor to extract value from their data.
Thanks to the accessibility of computing resources (distributed computing clusters such as Hadoop, and GPUs) and the democratization of algorithms (machine learning, deep learning, etc.), companies can finally reap the benefits of Big Data. In this context, Object Storage, until now considered an archiving solution (active archive or long-term archive), must be reconsidered. Analysts such as IDC and GigaOm already recognize that new-generation Object Storage solutions can deliver the high performance required for Big Data use cases.
IDC: Growth in the use of Object Storage in companies that want to take advantage of Big Data
According to IDC, Object Storage is now viable for use cases requiring high performance, with more and more organizations using it alongside their Big Data processing platforms.
In its December 2019 report "IDC Innovators: Open Source Object Storage for High-Performance Workloads, 2019" [PAID ACCESS], IDC cites three "innovators" in this category, all of which develop open source object storage solutions:
"Ceph dominates the market across all industries and customer sizes, with a focus on archiving," says Amita Potnis, research director, Infrastructure Systems, Platforms, and Technologies Group at IDC. "The market now has more open source options to consider as startups such as MinIO, SoftIron, and OpenIO develop object-oriented storage technologies specifically designed for high-performance environments."
OpenIO also appears in the latest "IDC MarketScape: Worldwide Object-Based Storage 2019 Vendor Assessment", as shown in the graph.
In an earlier IDC Technology Spotlight paper, "Consider Object Storage for High-Performance Use Cases" [FREE], IDC also referred to the explosion in the use of object-based storage among companies applying AI, machine learning, and deep learning techniques, and/or operating fleets of connected objects (IoT). "These new uses are at the origin of both the growth of data (in many cases to hundreds of petabytes or even exabytes) and the need for real-time analysis. Cost and performance are immediate concerns for any organization with such projects."
GigaOm: How to distinguish between Object Storage solutions
In his report Key Criteria for Evaluating Enterprise Object Storage [PAID ACCESS], published in November 2019, GigaOm analyst Enrico Signoretti reviewed 12 competing technologies, including:
- Caringo Swarm
- Cloudian HyperStore
- Dell EMC ECS
- Hitachi Vantara HCP
- IBM Cloud Object Store
- NetApp StorageGRID
- Quantum ActiveScale
- Red Hat Ceph
- Scality RING
The GigaOm report highlights the advantages and disadvantages of choosing to implement these vendors’ object storage solutions, which all rely on different erasure coding algorithms to ensure data protection.
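To give a sense of what erasure coding does, here is a minimal, purely illustrative sketch: an object is split into k data chunks plus parity, so the object survives the loss of a chunk. This single-parity XOR scheme is the simplest possible case; the vendors above use far more robust codes (typically Reed-Solomon variants) that tolerate multiple simultaneous losses, and none of this code reflects any particular vendor's implementation.

```python
def encode(data: bytes, k: int):
    """Split data into k equal-size chunks and append one XOR parity chunk."""
    size = -(-len(data) // k)                      # ceiling division
    padded = data.ljust(size * k, b"\0")           # pad so chunks divide evenly
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytearray(size)
    for chunk in chunks:                           # parity = XOR of all chunks
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return chunks + [bytes(parity)]

def recover(chunks):
    """Rebuild the single missing chunk (marked None) by XOR-ing the rest."""
    missing = chunks.index(None)
    size = len(next(c for c in chunks if c is not None))
    rebuilt = bytearray(size)
    for j, chunk in enumerate(chunks):
        if j == missing:
            continue
        for i, b in enumerate(chunk):
            rebuilt[i] ^= b
    out = list(chunks)
    out[missing] = bytes(rebuilt)
    return out

# Example: lose one data chunk, then rebuild it from the survivors.
stored = encode(b"hello object storage", k=4)
damaged = stored[:2] + [None] + stored[3:]
assert recover(damaged) == stored
```

The trade-off the GigaOm report probes is visible even here: stronger protection (more parity chunks) costs extra capacity and compute on every write, which is exactly where the different vendors' algorithm choices diverge.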
OpenIO stands out for our efficiency and is ranked in the high-performance segment.
These qualities were showcased in the #TbpsChallenge, where OpenIO demonstrated our performance by breaking the terabit-per-second write barrier, setting a record of 1.372 Tbps on a production infrastructure provided by Criteo Labs.
These studies all agree that a new generation of Object Storage technologies is emerging. OpenIO and MinIO stand out from the crowd as two players engaged in "object storage speed wars", as Chris Mellor put it on Blocks & Files.
Performance is becoming a criterion of choice for storage systems
While performance has long been a (difficult to verify) marketing promise, it is now becoming the criterion around which the Object Storage market is reshaping itself. And for good reason! Even though legal archiving of certain types of data is still required, solutions such as public cloud storage, first-generation object storage, and even tape storage are low cost and more than adequate for this use. However, as soon as data is used intensively, these systems show their limits, mainly in terms of bandwidth and scalability.