OpenIO Announces the Release of OpenIO Version 19.10
New Product Release Aims for Accelerated Performance to Meet Big Data Use Cases
OpenIO, a leading provider of hyperscalable, hardware-agnostic storage solutions, today announced the release of the latest version of its storage software, which includes performance improvements prototyped and tested in production conditions during the Tbps Challenge. This benchmark, carried out on infrastructure provided by Criteo Labs, enabled OpenIO’s object storage solution to exceed the terabit-per-second mark for writes.
Last September’s #TbpsChallenge enabled OpenIO to demonstrate the hyper-scalability and high performance of the OpenIO object storage solution (1.372 Tbps on a cluster of 350 physical machines). To achieve this record, optimizations were made to various mechanisms inherent to the operation of OpenIO, in particular in terms of load balancing.
OpenIO has prototyped a load balancing mechanism based on intelligent redirections between nodes (HTTP 307 redirections), which sends the load to the most available servers at a given time. If the node contacted by the primary load balancing mechanism is not available, or if another node is able to perform the required task more quickly, the request is redirected. Each of the OpenIO S3 gateways is thus able to redistribute the load among all the machines in a cluster, avoiding a bottleneck at the platform’s endpoint. This redirection system, tested in September, has been industrialized and integrated into release 19.10 to benefit all users. Optimizations related to the large-scale deployment of OpenIO technology have also been implemented.
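As an illustration only, the redirection decision described above can be sketched in Python. This is not OpenIO's implementation: the `Node` fields, the `load` metric, the `route_write` function, and the example URLs are all hypothetical names introduced for this sketch, and it assumes that the "307 redirections" are HTTP 307 Temporary Redirect responses.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    url: str         # base URL of the gateway (hypothetical example values)
    available: bool  # whether the node can currently accept requests
    load: float      # assumed metric: 0.0 = idle, 1.0 = saturated


def route_write(contacted: Node, cluster: List[Node],
                path: str) -> Tuple[int, Optional[str]]:
    """Decide how the contacted node answers a write request.

    Returns (HTTP status, Location header value or None).
    """
    available = [n for n in cluster if n.available]
    if not available:
        # No node can take the request at all.
        return 503, None

    # Pick the least loaded node among those currently available.
    best = min(available, key=lambda n: n.load)

    # Serve locally when the contacted node is up and no peer is less loaded.
    if contacted.available and contacted.load <= best.load:
        return 200, None

    # Otherwise ask the client to repeat the same request on a better node.
    return 307, best.url + path


node_a = Node("http://node-a.example", True, 0.9)
node_b = Node("http://node-b.example", True, 0.2)
node_c = Node("http://node-c.example", False, 0.0)
cluster = [node_a, node_b, node_c]

print(route_write(node_a, cluster, "/bucket/object"))
print(route_write(node_b, cluster, "/bucket/object"))
```

In this sketch a busy or unavailable node answers with a 307 pointing at the least loaded peer, while the least loaded node serves the request itself. A 307 response preserves the HTTP method and body, which is why it suits redirected writes.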
The performance of OpenIO has also been improved in the metadata directories, the 3-level distributed databases that map data on the storage platform, from buckets to chunks. Calls between the “meta1” and “meta2” directories have been cut to the strict minimum to reduce latency when writing and reading data.
Task distribution (data reconstruction after the loss of a disk or server, and data relocation) has been redesigned. These operations, which can now be launched from the web UI, are parallelized at the cluster level so that they complete more quickly by taking advantage of all available computing resources.
A version 2 of the data compression feature is available in this release, improving performance in terms of computation speed and space saving on the platform (depending on the file type).
COMPATIBILITY AND INTEGRATIONS
The #TbpsChallenge also consolidated the integration between OpenIO and Hadoop, the reference framework in the Big Data universe, validating OpenIO’s perfect compatibility with Hadoop version 3.1.1 via the DistCp (distributed copy) command, and making it possible to read from and write to HDFS or S3 with optimal performance. This allows OpenIO to be substituted for HDFS to form a Big Data cluster, decoupling storage from compute to optimize both costs and performance through a better allocation of hardware resources.
OpenIO’s compatibility with Apache Spark version 2.4.4 (distributed computing framework) has also been validated. Similarly, integration with HDF Kita, the connector between HPC applications and HDF5 data sets, has been optimized.
Formalized with the integration of OpenIO into the iRODS consortium, the compatibility of OpenIO with the open source data management application iRODS version 4.2.4 is also part of this release. Popular in the academic research community, iRODS automates data flows within a multi-tier storage environment by creating a unified namespace and a unique metadata catalog. The OpenIO team is continuing its efforts on this integration, to minimize the flows required by iRODS during the file listing phase. The objective is for OpenIO to notify iRODS in real time of events on the storage platform, so that the metadata catalog is updated incrementally.
Finally, OpenIO now supports Python 3. This update anticipates the end of Python 2 support on January 1, 2020, and is also a prerequisite for supporting upcoming versions of the Ubuntu and CentOS distributions.
OpenIO is part of the second wave of hyper-scalable, performance-oriented object storage solutions. In addition to record performance in terms of achievable bandwidth, the software-defined storage technology developed by the OpenIO team offers infinite scalability, with no need to rebalance data between servers when the platform is expanded. This is made possible by a dynamic and intelligent placement of data on the different nodes of the cluster, according to their state at the time of each write. Most object storage solutions, for simplicity, distribute data in a purely algorithmic way, without taking into account either the state of the platform or the nodes that compose it. ConsciousGrid™, the name of this process, constitutes a real technological breakthrough, because rebalancing data on a storage platform results in a significant drop in performance for several days or weeks, or even whole months when the volumes reach tens of petabytes.
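The contrast between purely algorithmic placement and state-aware placement can be illustrated with a minimal Python sketch. It is not the ConsciousGrid™ algorithm, whose internals are not described here. The hash-modulo scheme, the `StateAwarePlacer` class, the free-space metric, and all node names are assumptions chosen to show why one approach forces data to move when a node is added and the other does not.

```python
import hashlib
from typing import Dict, List


def hash_placement(key: str, nodes: List[str]) -> str:
    """Purely algorithmic placement: the node depends only on the key
    and on the number of nodes (simple hash-modulo, for illustration)."""
    digest = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]


def moved_after_growth(keys: List[str], old_nodes: List[str],
                       new_nodes: List[str]) -> int:
    """Count keys whose hash-based location changes when nodes are added."""
    return sum(hash_placement(k, old_nodes) != hash_placement(k, new_nodes)
               for k in keys)


class StateAwarePlacer:
    """Hypothetical state-aware placement: each write goes to the node with
    the most free space at that moment, and the chosen location is recorded
    in a directory, so existing objects never need to move."""

    def __init__(self, free_space: Dict[str, int]) -> None:
        self.free_space = dict(free_space)   # node name -> free units
        self.directory: Dict[str, str] = {}  # object key -> node name

    def add_node(self, name: str, free: int) -> None:
        self.free_space[name] = free

    def put(self, key: str, size: int) -> str:
        node = max(self.free_space, key=self.free_space.get)
        self.free_space[node] -= size
        self.directory[key] = node
        return node

    def locate(self, key: str) -> str:
        return self.directory[key]


keys = [f"object-{i}" for i in range(100)]

# Hash placement: growing from 3 to 4 nodes relocates many existing keys.
print(moved_after_growth(keys, ["n1", "n2", "n3"], ["n1", "n2", "n3", "n4"]))

# State-aware placement: existing keys stay put, new writes favor the new node.
placer = StateAwarePlacer({"n1": 1000, "n2": 1000, "n3": 1000})
for k in keys:
    placer.put(k, 10)
placer.add_node("n4", 5000)
print(placer.put("new-object", 10))
```

The trade-off in this sketch is that the state-aware approach must keep a directory of object locations, which is the role the text assigns to metadata directories, whereas hash placement needs no lookup but ties every object's location to the cluster size.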
In addition, OpenIO is a hardware-agnostic, software-defined solution capable of making the most of heterogeneous commodity servers, which makes it easy to upgrade a storage cluster, whether on the company’s premises or at a hosting provider.
OpenIO’s technology has already attracted more than forty customers worldwide, including Dailymotion, the CEA and the service provider Internet Initiative Japan (IIJ). OpenIO, which received initial support from Georges Lotigier (CEO of Vade Secure), raised $5 million from Elaia, Partech Partners and Nord France Amorçage in October 2017. In addition to its headquarters near Lille, within the Okto Campus it shares with Vade Secure (leader in email security, which has just raised €70 million), the company has a sales office in Paris and a team in Tokyo. In July 2019, the startup was awarded the Pass French Tech 2018-2019, a national program to support hyper-growth companies. On October 10, 2019, at the OVHcloud Summit in Paris, the European leader in the cloud announced that OpenIO had joined its partner ecosystem. The OpenIO technology will soon be available “hosted by OVHcloud” and managed by OpenIO.