OpenIO Object Storage is a highly flexible, software-defined storage solution that can be installed on all types of hardware. Its ConsciousGrid technology enables the software to mix nodes of different types, sizes, or generations without affecting the efficiency of the cluster. OpenIO takes advantage of all available resources no matter where they are in the cluster.
Some months ago we wrote a sizing guide for x86, but people still ask me for examples of typical configurations. In this article I'll show you three common ones. And, since it is common for our customers to begin with recycled hardware and then move to new, more powerful nodes over time, I’ll give you a couple of tips to help you make the most out of the hardware you have available.
I'm going to discuss simple configurations (with the front-end/access layer and backend installed on the same node), but, depending on the use cases, they could be installed on separate nodes.
Here are a few important rules to follow so you don’t have any performance issues:
- Flash memory: SSDs are not mandatory, but metadata access performance can benefit greatly from them. Each node used for active workloads should have flash memory equal to 0.3% of its overall storage capacity (meaning 3 GB for each TB) for this purpose.
- CPU: 4 CPU cores are the minimum for each node, with 8 recommended for the most intensive workloads.
- RAM: OpenIO runs in very small environments, but 8GB is usually the bare minimum supported in production. The more RAM you have, the better the performance, especially because RAM is usually used as a cache for data and metadata.
- NICs: Redundant 10Gbit/s network interfaces are needed for better node availability (and performance).
These basic rules do not include CPU and RAM requirements for GridForApps, our serverless framework, but this could be a topic for another article.
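To make these rules easier to apply, here is a minimal Python sketch that checks a node against them. The function and constant names are mine and are not part of any OpenIO tool, the 8TB drive size in the example is an assumption, and the check only covers flash, CPU, and RAM.

```python
from typing import List

# Baseline rules from the list above (illustrative names, not an OpenIO API).
FLASH_GB_PER_TB = 3  # the 0.3% rule: 3 GB of flash for each TB of capacity
MIN_CORES = 4        # minimum CPU cores per node
MIN_RAM_GB = 8       # bare minimum RAM supported in production


def flash_needed_gb(capacity_tb):
    """Flash recommended for metadata on a node with the given capacity."""
    return capacity_tb * FLASH_GB_PER_TB


def check_node(cores, ram_gb, capacity_tb, flash_gb) -> List[str]:
    """Return a list of warnings; an empty list means the baseline is met."""
    warnings = []
    if cores < MIN_CORES:
        warnings.append(f"{cores} cores, minimum is {MIN_CORES}")
    if ram_gb < MIN_RAM_GB:
        warnings.append(f"{ram_gb}GB of RAM, minimum is {MIN_RAM_GB}GB")
    needed = flash_needed_gb(capacity_tb)
    if flash_gb < needed:
        warnings.append(f"{flash_gb}GB of flash, rule suggests {needed}GB")
    return warnings


# Example: a node with 12 drives of 8TB each (drive size is an assumption).
capacity_tb = 12 * 8
print(flash_needed_gb(capacity_tb), "GB of flash suggested")
print(check_node(cores=8, ram_gb=32, capacity_tb=capacity_tb, flash_gb=300))
```

For the 96TB node in the example, the rule suggests 288GB of flash, so a single 300GB SSD would be enough.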
1: Cold archiving
This type of application is characterized by long-term retention, with data written once and probably never accessed again. $/GB and durability are the most important factors to take into account, and performance expectations are generally quite low.
Depending on the size of the files and the capacity under management, the customer has several options. But most of our customers choose the simple route and opt for dynamic data protection. In this case, the system selects the most appropriate data protection scheme according to a pre-determined policy.
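To give an idea of what such a policy does, here is a minimal Python sketch of a size-based selection. The 256KB threshold and the two schemes (three copies for small objects, a 6+3 erasure code for everything else) are assumptions I picked for illustration, not OpenIO defaults, and the code is not OpenIO's policy syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    name: str
    data_chunks: int
    extra_chunks: int  # additional copies or parity chunks

    @property
    def overhead(self) -> float:
        """Raw capacity consumed for each unit of user data."""
        return (self.data_chunks + self.extra_chunks) / self.data_chunks


# Hypothetical schemes, chosen only for this example.
THREE_COPIES = Scheme("3 copies", data_chunks=1, extra_chunks=2)
EC_6_3 = Scheme("EC 6+3", data_chunks=6, extra_chunks=3)

# Hypothetical policy: (upper size limit in bytes, scheme), smallest first.
POLICY = [(256 * 1024, THREE_COPIES)]
DEFAULT_SCHEME = EC_6_3


def select_scheme(object_size: int) -> Scheme:
    """Pick the first scheme whose size limit is above the object size."""
    for limit, scheme in POLICY:
        if object_size < limit:
            return scheme
    return DEFAULT_SCHEME


for size in (4 * 1024, 10 * 1024 * 1024):
    scheme = select_scheme(size)
    print(size, "bytes ->", scheme.name, "overhead", scheme.overhead)
```

The point of a policy like this is the $/GB trade-off: in this sketch the erasure-coded scheme consumes 1.5 times the user data, while three copies consume 3 times, which is acceptable only when the objects are small.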
Nodes for this type of scenario are very inexpensive:
- Flash: it can be avoided in this case because the application won't require many operations on metadata.
- CPU: 4 CPU cores are more than enough in the vast majority of cases.
- RAM: 8GB of RAM.
- NICs: a single 1Gbit/s port would be enough, but it is highly likely that 10Gbit/s won't add much to the cost, and 1Gbit/s is becoming very hard to find on servers.
- Storage Capacity: to limit the failure domain, nodes usually have between 12 and 24 SATA drives, but, in extreme situations, larger server configurations can be used to lower power consumption.
2: General purpose object store
Many enterprises are adopting object storage to store data coming from several applications, and as their primary storage. In many cases they don't even know what they will store in the future because the number of applications and front-end appliances changes continuously. This is similar to what happens to ISPs: both of these use cases need a solution that can do a little bit of everything, especially at the beginning.
Nodes for this type of workload are a bit more performance oriented, making the cluster more responsive:
- Flash: the 0.3% rule applies, with 30GB of flash memory for every 10TB of HDDs.
- CPU: a modern 8-core CPU with support for ISA-L for hardware-assisted erasure coding.
- RAM: 32 or 64GB of RAM.
- NICs: Dual 10Gbit/s interfaces for performance and redundancy.
- Storage Capacity: Again, 12/24 SATA drives is a configuration that makes it possible to have a balanced system and minimize the failure domain while keeping costs at a reasonable level when it is time to add a new node.
3: High throughput / media streaming
This use case is becoming increasingly common among our customers. Not only media and streaming companies, but also enterprises and other types of organizations are now looking at object storage for their infrastructures when it comes to HPC, big data, and large backup solutions. And they want the best possible performance.
Nodes for this type of configuration could be quite resource hungry:
- Flash: the 0.3% rule is still okay, but some customers are adding more flash memory to their systems to get faster writes, moving data later with automated tiering functions. Also, all-flash systems are becoming popular when they manage billions of very small files or offload compute tasks to the object store through GridForApps.
- CPU: 16 or more cores are not unusual on this kind of system, and this is without taking GridForApps (G4A) into account.
- RAM: 128GB of RAM or more, to speed up performance and have some room to run G4A functions.
- NICs: Dual 10/40 Gbit/s interfaces for performance and redundancy. Many customers bond multiple NICs together.
- Storage Capacity: very large nodes are usually used in this type of configuration, with 48/60 or more disks per chassis. As far as OpenIO is concerned, limits often come from the number of IOPS available from the disks, and this is also why some customers prefer hybrid or even all-flash configurations at times.
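Since many customers start with recycled hardware, it can help to see which of the three configurations a given server can serve. The Python sketch below encodes the CPU, RAM, NIC, and flash figures from this article as data. The class and function names are hypothetical, the example nodes are invented, and drive counts and IOPS are deliberately left out.

```python
from dataclasses import dataclass
from typing import List

FLASH_GB_PER_TB = 3  # the 0.3% rule


@dataclass(frozen=True)
class Profile:
    name: str
    min_cores: int
    min_ram_gb: int
    min_nic_gbit: int
    needs_flash: bool


# Minimums taken from the three examples in this article.
PROFILES = [
    Profile("cold archiving", 4, 8, 1, needs_flash=False),
    Profile("general purpose", 8, 32, 10, needs_flash=True),
    Profile("high throughput", 16, 128, 10, needs_flash=True),
]


def suitable_profiles(cores, ram_gb, nic_gbit, capacity_tb, flash_gb) -> List[str]:
    """Return the names of the profiles whose minimums this node meets."""
    flash_ok = flash_gb >= capacity_tb * FLASH_GB_PER_TB
    return [
        p.name
        for p in PROFILES
        if cores >= p.min_cores
        and ram_gb >= p.min_ram_gb
        and nic_gbit >= p.min_nic_gbit
        and (flash_ok or not p.needs_flash)
    ]


# Two invented nodes: a recycled server and a new, larger one.
print(suitable_profiles(cores=4, ram_gb=8, nic_gbit=1, capacity_tb=48, flash_gb=0))
print(suitable_profiles(cores=16, ram_gb=128, nic_gbit=10, capacity_tb=480, flash_gb=1500))
```

In this sketch the recycled server only qualifies for cold archiving, while the newer node meets the minimums of all three profiles.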
OpenIO is very flexible and configurable, giving our customers plenty of freedom of choice depending on their needs. It is also capable of evolving quickly as new needs arise with mixed hardware configurations, thanks to ConsciousGrid technology.
This article presents just a few examples. If you want to know more about how to configure OpenIO you can download our x86 sizing guide or visit docs.openio.io to get more information.