Block, file or object storage: a brief history of the evolution of computer storage systems
Block Storage, File System and Object Storage are three ways to store and access data. They are often presented as three competing models, each with its own advantages and disadvantages. This approach is not inaccurate, but it won’t help you to fully understand the fundamental differences between these three approaches and, if necessary, to choose the most appropriate one according to your organization’s needs.
In this blog post, I’ll take a brief journey through the history of computing and of storage systems, which have been strongly influenced in recent decades by three phenomena: the development of office automation, the emergence of the web and the advent of distributed applications.
A deep dive into the foundations of computer systems
The history of storage media, from punched cards to recent attempts to store information in DNA strands, is well documented. Floppy disks, ZIP disks, CD-ROMs, DVDs, flash memory, SATA, SSD or NVMe disks… almost all of us have held one or more of these generations of storage in our hands. This progression hasn’t yet resulted in a complete transfer of data to the “cloud”, as shown by the persistence of archiving on magnetic tapes. The death of such solutions was announced prematurely: in fact, volumes are still increasing today!
The parallel history of storage modes, from block storage to object storage, is less well known, probably because its roots lie in the lower layers of IT. So many layers of abstraction have been added over the years between hardware and software that uncovering their history can be like an exercise in caving or archeology. Let’s go on this journey of discovery together!
In the beginning was the block, the smallest storage unit within a computer system, with a typical size of 4 KB (kilobytes), that is 4,096 bytes, each byte being made up of eight bits equal to 0 or 1, through which information can be stored in digital form. Initially, the storage could not be dissociated from the server. A hard disk contained the information read and written locally by the machine, via the motherboard from a physical point of view, and through the operating system from a logical point of view. Basic. Except that the size of the files handled by computers kept on growing. When the first microcomputers were developed in the 1970s, the 8-inch floppy disk was considered a “high-speed” storage device and could contain an entire operating system!
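To make the notion of a block more concrete, here is a minimal sketch in Python. It is an illustration only: an ordinary temporary file stands in for the disk, and the names `BLOCK_SIZE`, `write_block` and `read_block` are hypothetical helpers, not part of any real driver or library. The point is that block storage knows nothing about files or folders; it only reads and writes fixed-size slots identified by a number.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # bytes per block, the 4 KB figure used in the text


def write_block(path, index, data):
    """Overwrite block number `index` with `data`, zero-padded to a full block."""
    if len(data) > BLOCK_SIZE:
        raise ValueError("data does not fit in one block")
    with open(path, "r+b") as device:
        device.seek(index * BLOCK_SIZE)
        device.write(data.ljust(BLOCK_SIZE, b"\x00"))


def read_block(path, index):
    """Return the raw bytes stored in block number `index`."""
    with open(path, "rb") as device:
        device.seek(index * BLOCK_SIZE)
        return device.read(BLOCK_SIZE)


# A regular file plays the role of the disk: 8 blocks, all zeroed.
handle, disk_path = tempfile.mkstemp()
os.close(handle)
with open(disk_path, "wb") as device:
    device.write(bytes(8 * BLOCK_SIZE))

# Write into block 3, then read the whole block back.
write_block(disk_path, 3, b"hello")
block = read_block(disk_path, 3)
```

Everything above this level (file names, folders, permissions) is bookkeeping that a file system adds on top of these numbered blocks.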
Phase 1: Dissociate the storage system from the server
Most personal computers have always had a single hard disk. Servers, on the other hand, were quickly designed to accommodate multiple disks. At first this was done to increase storage capacity, then for data protection (with the advent of RAID, the concept of which was first defined in the late 1980s) and eventually to tackle performance issues. However, this vertical scaling (scale-up) sooner or later reaches its limit: that of the server chassis.
The first technological breakthrough involved dissociating the storage system from the server, and allowing the server to consume disks other than those placed inside its own chassis. This led to the appearance of Direct Attached Storage (DAS), which is the ability for a computer to access a disk connected to the machine as a device. Then came the Storage Area Network (SAN), a dedicated network of disk arrays that allows a machine to access storage space via the Fibre Channel protocol in client/server mode. It then became possible to share a storage space between several servers, but not yet to read or write simultaneously from several machines, due to the difficulty of managing concurrent writes.
However, the dissociation of storage from servers was a significant step forward: it made it possible to design the first “complex” architectures to ensure high application availability. At the heart of such an architecture, the database is hosted on storage shared between a master machine and a slave machine, the latter ready to take over if the former becomes unavailable. This active/passive mode was quickly optimized into a crossed active/passive setup: the two machines work simultaneously and can take over from each other in the event of a problem, temporarily absorbing the other’s load (provided they have the capacity, since the load is then doubled). This avoids keeping a machine idle, one which at best would only be used for a few minutes a year. And running two servers at 50% of their capacity makes it easier to handle any peak loads.
The slow evolution of hard disks
The main progress of magnetic platter hard disks (HDDs), born more than 60 years ago, has largely been a growth in their capacity, through an increase in their density. It was not until the arrival of Flash technology, recently democratized with SSD and NVMe disks, that a real breakthrough was achieved in terms of latency (data access time) and throughput, resulting in more responsive machines and a much higher number of input/output operations per second (IOPS).
Phase 2: Generalization of file systems
Driven by the development of office automation practices, and the need to collaborate by sharing and editing documents and folders simultaneously, file systems inevitably became more widely used. Storage in file mode (or “file-based”) is probably the easiest to understand. Its principle is exactly the one you can imagine when faced with a file explorer (or Finder for Mac OS enthusiasts). The data is stored in folders and sub-folders, forming a tree structure overall. The data is then accessed via a longer or shorter path, depending on the depth of the tree structure. This “hierarchical” storage method is still the most common for direct and Network Attached Storage (NAS) systems.
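The tree structure and the notion of a path can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how a real file system is implemented: folders are nested dictionaries, files are byte strings, and `resolve` is a hypothetical helper that walks the tree one folder at a time.

```python
def resolve(tree, path):
    """Walk `tree` one folder at a time and return what sits at `path`."""
    node = tree
    for part in path.strip("/").split("/"):
        # Each path component must exist inside the current folder.
        if not isinstance(node, dict) or part not in node:
            raise FileNotFoundError(path)
        node = node[part]
    return node


# Folders are dictionaries, files are byte strings (illustrative data only).
tree = {
    "home": {
        "alice": {
            "reports": {"q1.txt": b"Q1 figures"},
            "notes.txt": b"todo",
        }
    }
}

content = resolve(tree, "/home/alice/reports/q1.txt")
```

The deeper a file sits in the tree, the more steps the lookup takes, which is what "a longer or shorter path" means in practice.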
Following the growth of file systems, new protocols emerged to organize communications between servers and shared storage spaces. Network File System (NFS) was developed by Sun Microsystems in 1984. Server Message Block (SMB) was initially created in 1985 by IBM before being popularized by Microsoft, which integrated it as the default file sharing system under Windows. Microsoft renamed it Common Internet File System (CIFS) in 1996 before returning to the original initials SMB in 2006. Samba, the open source implementation of the SMB protocol, was first released in 1992 and is the best-known and most widely used implementation in enterprises.
For emerging web uses, in particular file uploading, File Transfer Protocol (FTP) became the preferred method, coupled with the use of NAS. It perfectly met the need to have a back-end storage solution shared between several servers to create n-tier web applications. This is what drove the success of solutions such as those developed by NetApp, an American company that grew rapidly in the 2000s to become the second largest company in the data storage sector, between the industry giants Dell and HP.
Long live block storage!
The file system, which creates a virtual tree structure, is an abstraction layer that is superimposed on the “block device” (management of block writing at the kernel level). This is an obvious improvement, for the reasons mentioned above, but the file system has not killed block storage. And for good reason: the addition of an abstraction layer leads to a decrease in IO performance. In part this is because of the calculations required to maintain and present the tree structure. Performance is also impacted by the need to manage concurrent writes, which is necessary if you’re going to allow multiple workstations or servers to access the data at once.
As a result, block storage continues to exist, especially for the storage of large databases accessed and modified intensively, or for high-performance virtual machine file systems.
Over time, storage solutions using block storage and the file system have improved from a technological point of view (performance gains for both storage media and controllers). Most importantly, providers have added services such as data protection. These range from different types of RAIDs, which optimized both redundancy and space consumption, to advanced functions such as synchronous or asynchronous data copying between two bays, or snapshots of entire volumes.
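The trade-off between redundancy and space consumption can be illustrated with the XOR parity used by RAID levels such as RAID 5. The sketch below is a simplification under stated assumptions: three tiny "disks" hold one block each, a fourth holds the parity, and `xor_blocks` is a hypothetical helper. A real controller also rotates the parity across disks and works on full stripes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks, one block each (illustrative content).
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block costs one extra disk, not a full copy of the data.
parity = xor_blocks(data_blocks)

# Simulate the loss of the second disk: XOR-ing the survivors with the
# parity block gives the missing block back.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

With three data disks and one parity disk, 75% of the raw capacity remains usable while any single disk failure can be survived, compared with 50% for simple mirroring.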
The addition of these value-added services has helped to maintain high costs: economies of scale, such as those related to increasing disk capacity, were largely absorbed by the growing sophistication of storage solutions. This was the era when proprietary solutions ruled the storage market: specific hardware, driven by proprietary software that was truly a black box for users. The high cost of storage bays was compounded by unavoidable maintenance contracts. In fact, when problems occurred it was almost impossible for organizations to resolve them without the help of the manufacturer. Such storage solutions were also lacking in flexibility: data tiering, in particular, was not possible within the same storage system.
But perhaps it was also a simpler time: the different systems had similar capabilities and if you had the budget, your CIO just needed to pick a preferred vendor. In all honesty, the choice relied less on the technical characteristics of the solutions than it did on the talents of the vendor’s sales representatives!
Phase 3: Object Storage – limitless scaling and software-based intelligence
While the volume of data to be stored has grown continuously (the curve is now exponential), the limits of the file system have gradually appeared. The file system, or more precisely the Distributed Lock Manager service, is unable to manage simultaneous connections from thousands of machines. And while the volume of data can reach petabyte levels, the number of files is not unlimited. Operating a file system requires caching the tree structure as the user explores it. When a file system contains a large number of files, this caching is RAM-intensive and can significantly affect performance. Ultimately, file system-based storage systems reach their limits before their capacity is fully exploited (performance drops significantly once an 85% fill rate is reached).
The only solution is to increase the number of storage systems, in other words to create silos. This means that data migrations must be carried out regularly, as soon as a silo fills up. These are risky operations, involving entire teams. In addition, most of the recently generated data is so-called unstructured data. To put it simply, this is all the information that is not organized into databases: office automation files, email histories, images, videos, logs…
The technological breakthrough that resulted from these new challenges was the transfer of intelligence from the hardware (which had become increasingly sophisticated and expensive) to the software. Software Defined Storage is, in a way, the logical continuation of the movement that revolutionized compute (with virtual machines) and network (with the software-defined network approach), before disrupting the world of storage. “Software is eating the world”, Marc Andreessen’s famous prophecy in 2011, meant that all companies, regardless of their field of activity, had to become software publishers, or else risk being “Uberised”. In the infrastructure business, this adage had been true for several years already.
The benefits of distributed storage
The idea of object storage is to use standard servers (x86 or ARM) to create a flat structure in which files are fragmented and distributed across all nodes of the cluster, according to various placement strategies. Most of the time, a sharding algorithm handles this, distributing the data in a relatively random way, but other methods are possible, as shown by the intelligent placement system implemented in our OpenIO solution. Each object has a unique identifier and metadata. The system uses this identifier and metadata to retrieve files and reassemble the fragments distributed between several machines.
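The fragment-and-distribute principle can be sketched as follows. This is a deliberately naive illustration, not the placement logic of OpenIO or of any other product: the node names, the tiny `CHUNK_SIZE` and the helpers `split`, `place`, `put` and `get` are all assumptions made for the example, and the hash-modulo placement stands in for the "relatively random" sharding described above.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster
CHUNK_SIZE = 4  # bytes; unrealistically small, for readability


def split(data, size=CHUNK_SIZE):
    """Cut an object into fixed-size fragments (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def place(object_id, chunk_index, nodes=NODES):
    """Pick a node from a hash of the object identifier and fragment number."""
    digest = hashlib.sha256(f"{object_id}/{chunk_index}".encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def put(cluster, object_id, data):
    """Store every fragment on its node; return the number of fragments."""
    chunks = split(data)
    for index, chunk in enumerate(chunks):
        cluster[place(object_id, index)][(object_id, index)] = chunk
    return len(chunks)


def get(cluster, object_id, chunk_count):
    """Recompute each fragment's location and reassemble the object."""
    return b"".join(
        cluster[place(object_id, index)][(object_id, index)]
        for index in range(chunk_count)
    )


cluster = {node: {} for node in NODES}
count = put(cluster, "photo-42", b"distributed storage")
restored = get(cluster, "photo-42", count)
```

No central directory is needed to find a fragment: its location is recomputed from the identifier. The weakness of this naive modulo scheme is that adding or removing a node changes most placements, which is one reason real systems rely on more elaborate strategies.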
This principle of distributed storage offers many advantages: infinite scalability in theory; more economical data protection (erasure coding software replacing the work of the RAID controller); the ability (theoretically, once again) to operate standard servers and heterogeneous hardware; better load balancing; and the ability to make the most of resource capacity.
Accessing data via an API also has the advantage of simplifying the development of applications that will use the data. Basically, everything is possible with three commands: GET, PUT and DELETE. And some of the data processing can be done directly on the storage system, thanks to metadata analysis. For example, in the case of online photo storage, the web application can offer the user dynamic collections (by date, by location, by type of camera…), without managing these sorting and grouping operations itself. All it needs to do is make a call via the API to list the photos with the relevant attribute in the metadata. It then presents to the application user all the photos that the Object Storage system returns in response to this call. For uses such as Big Data, this ability to dynamically generate thematic data collections based on what you want to study is a valuable asset.
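The photo example can be sketched with a small in-memory store. This is a toy model, not the API of any real object storage service: the `ObjectStore` class, its method names and the metadata keys (`city`, `camera`) are assumptions made for the illustration, and the `find` method stands for the metadata-based listing call mentioned above.

```python
import uuid


class ObjectStore:
    """Toy in-memory object store: a flat namespace of id -> (data, metadata)."""

    def __init__(self):
        self._objects = {}

    def put(self, data, **metadata):
        """Store an object with its metadata and return its unique identifier."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, metadata)
        return object_id

    def get(self, object_id):
        """Return the data stored under `object_id`."""
        return self._objects[object_id][0]

    def delete(self, object_id):
        """Remove the object stored under `object_id`."""
        del self._objects[object_id]

    def find(self, **criteria):
        """List the identifiers whose metadata match every criterion."""
        return [
            object_id
            for object_id, (_, metadata) in self._objects.items()
            if all(metadata.get(key) == value for key, value in criteria.items())
        ]


store = ObjectStore()
paris = store.put(b"<jpeg bytes>", city="Paris", camera="X100")
lille = store.put(b"<jpeg bytes>", city="Lille", camera="X100")

# A "dynamic collection": the application only asks, the store does the sorting.
paris_photos = store.find(city="Paris")
```

The application never builds or maintains the collections itself; it simply passes the attribute it is interested in and displays whatever identifiers come back.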
Distributed systems: an idea from the world of research
The first research work behind the concept of Object Storage dates back to 1996. But the idea of running cheap servers and aggregating their capacities with a software brick, responsible for distributing tasks across all nodes of the grid, is even older. NASA, like many research centers around the world, has been equipped with supercomputers since the 1960s – large multi-ton, multi-million-dollar machines. The best known of these were the Cray supercomputers, which dominated the market between the 1970s and 1990s.
In 1994, two NASA engineers, Thomas Sterling and Donald Becker, revolutionized the world of high-performance computing by changing paradigms. Rather than relying on ever more efficient and expensive hardware, the two engineers had the idea of creating a computing grid composed of many standard computers, running on free operating systems (GNU/Linux in general). What is the name of this invention? The Beowulf Cluster, which is now commonly used in the research world.
In another field, Google made public in 2003 the principle of a distributed file system developed for their own use: Google File System. This architecture, designed to deliver scalability for storing search indexes, was a first step towards Object Storage. The only key things missing were the method of accessing data by a unique identification key (what was called Content Addressed Storage), and more sophisticated sharding mechanisms to intelligently break down the data on the cluster.
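Access by a unique identification key can be illustrated with a minimal content-addressed store, in which the key is simply a hash of the data. This is a sketch under stated assumptions, not a description of Google File System or of any specific product: a plain dictionary plays the role of the storage back end and `cas_put` is a hypothetical helper.

```python
import hashlib


def cas_put(store, data):
    """Store `data` under a key derived from its own content."""
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key


store = {}
key_1 = cas_put(store, b"search index shard")
key_2 = cas_put(store, b"search index shard")  # same content, same key
```

Because the key depends only on the content, no path or folder is involved, and storing the same data twice does not consume additional space.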
What’s next in the evolution of data storage?
Is there any chance of a new paradigm that will revolutionize the way data is stored and consumed once again? Nothing is less certain: the three existing models – block storage, file system and object storage – cover all current market needs. Object Storage, which is more recent, is probably the one that has the brightest future in terms of data volume. But there is still work to be done to keep all the promises of the concept and improve usability, if Object Storage is to become truly widespread. Block storage and file system, on the other hand, will have to keep pace with the ever-increasing demand for performance.
In recent years, we have seen a step backwards, which consists of re-associating storage and compute, with the so-called “hyper-converged” offers that bring the disks as close as possible to the hypervisors (virtual machine monitors) to reduce latency. But, once again, we are now heading back to disaggregation, with the NVMe over Fabrics (NVMe-oF) protocol, a model that Gartner calls “shared accelerated storage”.
The evolution of data storage is far from over. To be continued!