Can instant scaling remove the need to rebalance data?
You might have heard the term ‘instant scaling’, but what does this really mean? Is ‘instant’ even possible, and if yes, can instant scaling remove the need to rebalance data?
Before getting into the nitty-gritty of instant scaling, we need to start with the concept of data rebalancing. If you’re in the business of digital data, you’ll recognize the need for a robust and scalable storage system. Depending on how much data you manipulate and store, at some point the servers in that system will reach their limit and you’ll need to add capacity. You increase capacity by adding new nodes to the platform, but this poses a temporary problem: in most cases, these new nodes are not immediately usable.
Most object storage systems require you to rebalance data, spreading it evenly across the nodes in the platform, whenever you add new servers to extend it. Rebalancing can be done in two ways, and unfortunately, neither is ideal. One option is to do it quickly, but this reduces performance. Unacceptable, if your business depends on a reliable user experience. Unimaginable, if you’re processing big data workloads or AI algorithms.
The other option is to rebalance data slowly, to maintain acceptable performance. And by slowly, we’re talking about several weeks to several months. But this method is equally far from ideal, as rebalancing operations can conflict with rebuild operations. Moving large amounts of data between servers is a risky process that can lead to failures and other complications.
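To see why adding a node forces so much data movement in the first place, here is a toy sketch (not any vendor’s actual code, and the key names and node counts are made up for illustration): with a static hash-modulo placement rule, growing the cluster changes the target node for most objects, and those objects are exactly what a rebalance has to move.

```python
# Toy illustration of static hash placement. Not production code:
# real systems use more sophisticated schemes, but the core problem
# (adding a node remaps most keys) is the same.
import hashlib

def placement(key: str, num_nodes: int) -> int:
    """Map an object key to a node index with a static hash rule."""
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % num_nodes

# Hypothetical object keys on a 4-node platform.
keys = [f"object-{i}" for i in range(10_000)]
before = {k: placement(k, 4) for k in keys}

# One new node is added: the same rule now maps keys differently.
after = {k: placement(k, 5) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of objects must move")  # typically around 80%
```

Every object whose mapping changed must be copied to its new home before the layout is consistent again, which is why the choice boils down to moving it fast (hurting performance) or slowly (taking weeks or months).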
Unused capacity, wasted investment
Companies that know the pitfalls of rebalancing data tend to take precautions to avoid going through the process too often. They estimate their storage needs several years into the future, then purchase enough capacity to last that long. The problem is, they pay for the storage and servers in year X, but only use part of that capacity immediately. A large part of their investment sits in the racks, unused, for years.
By the time the extra storage is actually needed, the servers may no longer be the best-performing models available on the market. Of course, there’s little incentive to swap them out for newer models, as payment has already been made, and may even be ongoing.
Needless to say, rebalancing is a painful and expensive, yet mostly necessary, part of scaling an object storage platform. Or is it? Companies that have gone through the process understand why not having to rebalance data is such a big advantage.
Why start out with a disadvantage, when you don’t have to?
There is a better way to scale, and it doesn’t have to involve trade-offs. OpenIO is an object storage solution that can scale instantly and infinitely, without rebalancing data. You don’t have to sacrifice performance or risk failures when you scale. How do we do it? Two elements are key to scaling without rebalancing data: our unique architecture, which is designed as a Grid of Nodes, and our ConsciousGrid™ technology.
The Grid of Nodes architecture is more efficient than a ring architecture because you can add servers and storage capacity one by one, or in small or large batches. You add servers as your storage needs evolve, which in most cases is gradual, not months or years in advance!
Plus, you can add any type of server that you choose. These could be the latest models with the highest performance, or commodity machines that you already have in stock. OpenIO is hardware agnostic so you can create a heterogeneous platform, made up of the best servers for your specific needs, regardless of brand, model or generation.
The ConsciousGrid™ technology works on top of the Grid of Nodes. It uses an intelligent method to dynamically place data on the most appropriate node within the platform, at the best time. This is why we can add one or many servers to the platform and then use them immediately. Data is always stored on the most appropriate node. There is never the need to rebalance data while scaling, and performance remains consistent.
Other data placement technologies fill up servers based on static algorithms. With that approach, new nodes are not recognized until they receive data from the pre-existing servers through rebalancing operations.
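The contrast above can be sketched in a few lines. This is only a conceptual illustration of dynamic placement, not the actual ConsciousGrid™ implementation (whose internals are not described here); the node names, fill levels, and the "least-filled node wins" rule are all assumptions made for the example. The point it shows: when placement decisions are made per write, a freshly added empty node starts absorbing data immediately, with nothing to rebalance.

```python
# Conceptual sketch of dynamic data placement (NOT the real
# ConsciousGrid implementation): each write goes to the node with
# the most free capacity at that moment.

def pick_node(nodes: dict) -> str:
    """Return the name of the least-filled node."""
    return min(nodes, key=nodes.get)

# Three existing nodes, each already holding some data (in GB).
nodes = {"node-1": 700, "node-2": 650, "node-3": 720}

# A new, empty server joins the grid.
nodes["node-4"] = 0

# The very next writes land on the new node; no rebalancing needed.
for size_gb in (10, 10, 10):
    target = pick_node(nodes)
    nodes[target] += size_gb

print(nodes["node-4"])  # 30: the new node absorbed all three writes
```

A real placement engine weighs far more than fill level (load, hardware generation, failure domains), but even this simplified rule shows why a static, hash-based layout needs rebalancing while a dynamic one does not.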
Want to learn more?
I’m just scratching the surface here on how the ConsciousGrid™ actually works. We’d love to get into a more detailed conversation or fill in any gaps, so feel free to get in touch. We can also show you a demo of the ConsciousGrid™ in action. I do hope you understand that the key benefits go way beyond the cool tech, though! Instant scaling frees you from mandatory data rebalancing. This translates into consistently high performance and no wasted investment. My only question is: why would anyone ever rebalance data again, now that they know it’s unnecessary?