Discover just how fast object storage can go!
To demonstrate the performance and scalability of OpenIO Object Storage, we deployed our solution on more than 350 servers lent to us by Criteo, a leading advertising platform for the open internet that handles large volumes of data and uses advanced machine learning technologies. This benchmark allowed us to blast past the 1 terabit per second mark: we achieved a write throughput of 1.372 terabits per second.
We reached this milestone twice, in two distinct scenarios: first a progressive injection, in which the load increased gradually as servers were added to the cluster in batches of 50, and then an all-in injection that used all the provided machines from the start.
We attained this very high performance under production conditions and on standard commodity hardware. The results of our benchmark validate OpenIO’s software design, which is optimized for new data uses, in particular the large-scale use of artificial intelligence algorithms on Big Data/HPC clusters.
We would now like to mark this record by inviting other market players to put their own technology to the test! #TbpsChallenge
Copying a production data lake: a realistic use case
Criteo operates a data lake comprising several thousand Hadoop nodes. A data lake of this type is both a “Big Data” computing platform and a data storage platform, integrated with HDFS. At Criteo, several thousand compute nodes host several dozen petabytes of storage locally.
We transferred the Hadoop data lake (2,500 servers) to an OpenIO cluster deployed within the same infrastructure, with the machines interconnected by a mesh network, by launching copies of production directory trees via DistCp.
This was therefore not a synthetic benchmark with data generated on the fly, but a real use case: backing up a Hadoop data lake with a state-of-the-art data protection mechanism.
USABLE CAPACITY: 38 PB
HARDWARE: 352 HPE DL380 Gen10 servers with the following specifications:
- CPU: 2 × Intel® Xeon® Gold 6140
- Memory: 384 GB RAM
- System disk: 1 × 240 GB SSD
- Data disks: 15 × 8 TB 7200 rpm HDD in RAID 0
- Network interface: Single 10GbE
- Servers connected to a “top of rack” (TOR) switch
- Racks grouped into clusters (pods) of 22 and connected to a spine (switch network) via 160 Gbps (4 × 40 Gbps) links
- Each pod connected to a super-spine (network of higher-level switches) via 800 Gbps (8 × 100 Gbps) links
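As a rough illustration of the scale these specifications imply, the short Python sketch below multiplies out the figures from the list above. It assumes decimal units (1 PB = 1000 TB, 1 Tbps = 1000 Gbps); the function names are hypothetical and exist only for this example. It computes raw totals only and says nothing about how the stated 38 PB capacity was derived, which the source does not detail.

```python
# Figures taken from the hardware list above.
SERVERS = 352
DATA_DISKS_PER_SERVER = 15
DISK_SIZE_TB = 8   # assumed to be decimal terabytes
NIC_GBPS = 10      # single 10GbE interface per server


def raw_capacity_pb(servers=SERVERS, disks=DATA_DISKS_PER_SERVER, disk_tb=DISK_SIZE_TB):
    """Total raw data-disk capacity in decimal petabytes (1 PB = 1000 TB)."""
    return servers * disks * disk_tb / 1000


def aggregate_nic_tbps(servers=SERVERS, nic_gbps=NIC_GBPS):
    """Sum of all server network interfaces, in terabits per second."""
    return servers * nic_gbps / 1000


if __name__ == "__main__":
    print(f"Raw data-disk capacity: {raw_capacity_pb():.2f} PB")
    print(f"Aggregate NIC bandwidth: {aggregate_nic_tbps():.2f} Tbps")
```

The aggregate NIC figure is only the sum of the server interfaces; the bandwidth actually usable between racks and pods depends on the spine and super-spine links described above.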
1.372 Tbps write throughput: what does this number mean?
It means that OpenIO can write and protect about 171 GB of data per second (1.372 Tbps divided by 8 bits per byte). This figure includes data protection using 14 + 4 erasure coding, a scheme that allows up to 4 servers to be lost within the cluster without data loss.
The data protection mechanism also made it possible to temporarily lose 2 machines during the benchmark, without any impact (a total of 352 machines were made available).
To date, no other object storage technology has claimed such a high write throughput under production conditions. OpenIO approached the theoretical limits of the network capacity made available by Criteo. In other words, the network was the limiting factor, and OpenIO technology did not reach its own limits.
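The arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not OpenIO code: it uses decimal units (1 Tbps = 1000 Gbps), the function names are hypothetical, and it makes no claim about whether the 1.372 Tbps figure counts parity data, which the source does not specify.

```python
# Figures quoted in the text above.
WRITE_THROUGHPUT_TBPS = 1.372
DATA_CHUNKS = 14    # the "14" in 14 + 4 erasure coding
PARITY_CHUNKS = 4   # the "4" in 14 + 4 erasure coding


def tbps_to_gb_per_s(tbps):
    """Convert terabits per second to gigabytes per second (8 bits per byte)."""
    return tbps * 1000 / 8


def ec_storage_overhead(data_chunks, parity_chunks):
    """Bytes stored per byte of user data under k + m erasure coding."""
    return (data_chunks + parity_chunks) / data_chunks


def ec_tolerated_losses(parity_chunks):
    """Chunks of one object that can be lost while it stays readable."""
    return parity_chunks


if __name__ == "__main__":
    print(f"Write throughput: {tbps_to_gb_per_s(WRITE_THROUGHPUT_TBPS):.1f} GB/s")
    print(f"Storage overhead: {ec_storage_overhead(DATA_CHUNKS, PARITY_CHUNKS):.2f}x")
    print(f"Tolerated chunk losses: {ec_tolerated_losses(PARITY_CHUNKS)}")
```

With the values above, the conversion gives roughly 171.5 GB/s, and 14 + 4 erasure coding stores about 1.29 bytes for every byte of user data, compared with 3 bytes for triple replication.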
What does the benchmark demonstrate about OpenIO?
- Robust large-scale deployment tools
- Runs on standard hardware
- Optimal use of available resources
- Instant scaling: the platform takes immediate advantage of added resources because no data rebalancing is required
- Terabit-per-second performance, including data protection
Download the full benchmark report
Cluster deployment, load balancing, optimizations and detailed metrics in more than 30 pages.
- Jérôme Loyet, R&D DevOps Engineer
- Florence Vennetier, R&D Engineer
- Yannick Bussy, Director of Engineering
- Jean-François Smigielski, CTO & Co-Founder
- Maxime Brugidou, Engineering Director, SRE
- Stuart Pook, Senior Site Reliability Engineer