
#TbpsChallenge

Discover just how fast object storage can go!

To demonstrate the performance and scalability of OpenIO Object Storage, we deployed our solution on more than 350 servers lent to us by Criteo, a leading advertising platform for the open internet that handles very large volumes of data and relies on advanced machine learning. This benchmark allowed us to blast past the 1 Terabit per second mark: we reached 1.372 Terabits of written data per second.

[Figure: object storage network usage during the Terabit Challenge]

We reached this milestone twice, in two distinct scenarios: a progressive injection, ramping up gradually by adding servers to the cluster in batches of 50, and an all-in injection using all the provided machines from the start.

We attained this very high performance under production conditions and on standard commodity hardware. The results of our benchmark confirm the soundness of OpenIO’s software design, which is optimized for new data uses, in particular the large-scale use of artificial intelligence algorithms on Big Data/HPC clusters.

We would now like to celebrate this record by inviting other market players to challenge it with their own technology! #TbpsChallenge

Copying a production data lake: a realistic use case

Criteo operates a data lake comprising several thousand Hadoop nodes. A data lake of this type is both a “Big Data” computing platform and a data storage platform, integrated with HDFS. At Criteo, several thousand compute nodes host several dozen petabytes of storage locally.

We transferred data from the Hadoop data lake (2,500 servers) to an OpenIO cluster deployed within the same infrastructure (the machines being interconnected by a mesh network), launching copies of production directory trees with DistCp.

It is therefore not a benchmark with data generated on the fly, but a real use case: the backup of a Hadoop data lake, using a state-of-the-art data protection mechanism.
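To make the copy path concrete, here is a minimal sketch of a DistCp launch of this kind, assuming the OpenIO cluster is reached through an S3-compatible endpoint via Hadoop’s S3A connector. The hostnames, bucket name, credentials, and map count are hypothetical placeholders, and the exact invocations used during the benchmark may have differed.

```python
# Minimal sketch: copy a production directory tree from HDFS to an
# S3-compatible object store with DistCp. All names below are placeholders.
import subprocess

cmd = [
    "hadoop", "distcp",
    # S3A connector settings for the target object store (hypothetical values)
    "-Dfs.s3a.endpoint=http://openio-gateway.example:6007",
    "-Dfs.s3a.access.key=ACCESS_KEY",
    "-Dfs.s3a.secret.key=SECRET_KEY",
    "-Dfs.s3a.path.style.access=true",
    "-m", "200",                                        # parallel copy tasks
    "hdfs://namenode:8020/datalake/production/tree",    # source directory tree
    "s3a://backup-bucket/production/tree",              # destination
]
subprocess.run(cmd, check=True)
```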

USEFUL CAPACITY: 38 PB

HARDWARE: 352 HPE DL380 Gen10 servers with the following specifications:

  • CPU: 2 × Intel® Xeon® Gold 6140
  • Memory: 384 GB RAM
  • System disk: 1 × 240 GB SSD
  • Data disks: 15 × 8 TB 7200 rpm HDD in RAID 0
  • Network interface: Single 10GbE

NETWORK

  • Servers connected to a “top of rack” (TOR) switch
  • Racks grouped into pods of 22 and connected to a spine (switch layer) via 160 Gbps (4 × 40 Gbps) links
  • Each pod is connected to a super-spine (higher-level switch layer) via 800 Gbps (8 × 100 Gbps) links

1.372 Tbps write throughput: what does this number mean?

It means that OpenIO is able to write and protect 171 GB of data per second, since this performance includes data protection using 14 + 4 erasure coding (a scheme that tolerates the loss of up to 4 servers within the cluster without losing data).
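As a quick sanity check, the headline figures can be related to each other with a few lines of arithmetic. This is only an illustrative calculation; it assumes the 1.372 Tbps figure counts client payload, with the erasure-coding parity written on top of it.

```python
# Relating the headline numbers (assumption: 1.372 Tbps is client payload,
# and the 14 + 4 erasure-coding parity is written on top of it).

TBPS = 1.372                               # measured write throughput, Tbit/s
SERVERS = 352                              # machines made available
DATA_FRAGMENTS, PARITY_FRAGMENTS = 14, 4   # 14 + 4 erasure coding

payload_gb_per_s = TBPS * 1e12 / 8 / 1e9   # bits/s -> bytes/s -> GB/s
print(f"payload: {payload_gb_per_s:.0f} GB/s")                        # ~171 GB/s, as stated

# Volume actually stored once the 4 parity fragments are added.
ec_expansion = (DATA_FRAGMENTS + PARITY_FRAGMENTS) / DATA_FRAGMENTS   # 18/14
print(f"data + parity: {payload_gb_per_s * ec_expansion:.0f} GB/s")   # ~220 GB/s

# Average payload share per server, against its single 10 GbE interface.
print(f"per server: {TBPS * 1000 / SERVERS:.1f} Gbps")                # ~3.9 Gbps
```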

The data protection mechanism also made it possible to temporarily lose 2 machines during the benchmark without any impact (out of the 352 machines made available).

No other object storage technology to date has claimed such a high write throughput under production conditions. OpenIO approached the theoretical limits of the network capacity made available by Criteo; in other words, OpenIO’s technology has not reached its own limits.

What does the benchmark demonstrate about OpenIO?

  • Robust large-scale deployment tools
  • Runs on standard hardware
  • Optimal use of available resources
  • Instant scaling of the platform: added resources are used immediately, because no data rebalancing is required (see the sketch after this list)
  • Performance at terabit-per-second, including data protection
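
The point about scaling without data rebalancing can be made concrete with a toy comparison. The sketch below is illustrative only and is not OpenIO’s actual placement code: it contrasts hash-derived placement, where adding a node changes where existing objects “should” live and forces a rebalance, with placement chosen from node scores at write time and recorded in a directory, where a new node simply starts absorbing new writes.

```python
# Toy comparison only (not OpenIO's actual placement logic): why adding nodes
# forces rebalancing with hash-derived placement, but not with placement that
# is decided at write time and recorded in a directory.
import hashlib
import random

def hash_placement(object_id: str, nodes: list[str]) -> str:
    """Location is a pure function of the node set: growing the set moves old objects."""
    digest = int(hashlib.md5(object_id.encode()).hexdigest(), 16)
    return nodes[digest % len(nodes)]

class DirectoryPlacement:
    """Location is chosen from node scores at write time and remembered."""

    def __init__(self, nodes: list[str]) -> None:
        self.scores = {n: random.random() for n in nodes}  # stand-in for real load metrics
        self.locations: dict[str, str] = {}                # object id -> node

    def add_node(self, node: str) -> None:
        self.scores[node] = 1.0   # an empty, idle node gets the best score
        # existing entries in self.locations are untouched: no rebalancing needed

    def write(self, object_id: str) -> str:
        best = max(self.scores, key=self.scores.get)       # pick the best-scoring node
        self.locations[object_id] = best
        return best
```

In the progressive-injection scenario described above, this is why each new batch of 50 servers can contribute to the write throughput right away rather than first triggering a migration of existing data.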

Download the full benchmark report

Cluster deployment, load balancing, optimizations and detailed metrics in more than 30 pages.


OpenIO Terabit Object Store
Benchmark Report 1.372 Tbps: All about the record OpenIO set on Criteo’s infrastructure

The team

  • Jérôme Loyet, R&D DevOps Engineer
  • Florent Vennetier, R&D Engineer
  • Yannick Bussy, Director of Engineering
  • Jean-François Smigielski, CTO & Co-Founder
  • Maxime Brugidou, Engineering Director, SRE
  • Stuart Pook, Senior Site Reliability Engineer

Discover the technical background of the benchmark

Cluster deployment, load balancing, optimizations and injection scenarios
#TbpsChallenge - Challenge us if you can!

Meet with us

  • JRES 2019
    A land of innovation and sharing
    Dijon
  • Tech.Rocks Summit 2019 Paris