IoT and the Data Center. Storage Ready for the Post-Cloud Era
A few weeks ago, a presentation from Peter Levine (a partner at the Andreessen Horowitz VC firm) anticipated the next wave of compute “after the cloud”: a decentralized world in which powerful networks of smart devices collect, process and store data locally, acting as micro datacenters… in your car, in your house, in your office or wherever.
Your IoT-based data center
No matter how your car or house is connected to the internet, these kinds of systems will need to process more and more information locally. The amount of data generated will be astounding. To give an example: a connected car will generate 25GB of data per hour, and it’s easy to envision the problems we are likely to face in the near future when millions of these cars are on the road, all trying to upload this data to the cloud immediately for processing. And what happens if you lose that connection? Even just for a few minutes?
Some data will be needed for real-time operations, other data for diagnostics and analytics. Everything will be stored for a certain period of time, and some of it for longer… But you don’t need (or want) to send everything to the cloud immediately: it’s unnecessary, inefficient and risky.
Your IoT-based micro datacenter will have enough power to do what is necessary to reduce the data footprint (by eliminating redundancies, normalizing data, discarding unnecessary information and so on) and to upload only what is needed, asynchronously, when a connection is available.
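To make the idea concrete, here is a minimal Python sketch of that local "reduce, then upload later" step. Everything in it is an assumption made for illustration: the field names, the rounding used as normalization, and the `UploadQueue` class are hypothetical and not part of any specific product.

```python
import json
from collections import deque


def reduce_readings(readings, keep_fields=("sensor", "ts", "value")):
    """Trim, normalize and de-duplicate raw readings before upload."""
    seen = set()
    reduced = []
    for reading in readings:
        # Discard fields that are not needed outside the device.
        slim = {k: reading[k] for k in keep_fields if k in reading}
        # Normalize: round floats so near-identical samples compare equal.
        if isinstance(slim.get("value"), float):
            slim["value"] = round(slim["value"], 1)
        key = json.dumps(slim, sort_keys=True)
        if key in seen:
            continue  # eliminate redundant samples
        seen.add(key)
        reduced.append(slim)
    return reduced


class UploadQueue:
    """Keeps reduced batches on the device until a connection is available."""

    def __init__(self):
        self.pending = deque()

    def enqueue(self, batch):
        self.pending.append(batch)

    def flush(self, send, online):
        """Send pending batches with `send` for as long as `online()` is true."""
        sent = 0
        while self.pending and online():
            send(self.pending[0])
            self.pending.popleft()  # drop a batch only after it was sent
            sent += 1
        return sent
```

If the connection is down, `flush` simply returns 0 and the batches stay on the device, which is the behavior you want when a car drives through a tunnel.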
From the cloud to IoT
Last month, at AWS re:Invent, Amazon announced an interesting tool set: AWS Greengrass. The idea is to bring some of the programming constructs you already use on its cloud to IoT devices… including AWS Lambda and the ability to store data locally.
At the same time, IoT devices are obviously becoming ever more sophisticated and powerful, generating larger amounts of data with each new generation, but it’s becoming evident that they lack proper storage subsystems. What’s more, data is not shared locally but only through an upload operation to the cloud which, again, poses many challenges in terms of accessibility, security and reliability of the whole infrastructure.
It’s not about the media: flash memory exists at any price and in any size. It’s more about the absence of proper data protection, resiliency and availability and, above all, the fact that the storage is not shared. We are just at the beginning of the IoT era and in the vast majority of cases, especially in the consumer space, these limitations haven’t had an impact yet. But if you look at the future and at complex industrial systems, the absence of an adequate distributed storage layer will become a major limitation, reducing the efficiency and effectiveness of the whole system. At the end of the day, is it possible to think about a datacenter without a storage infrastructure…? This is why your IoT-based micro datacenter needs one.
Shared storage for your IoT network
Shared storage as we know it doesn’t work for IoT devices. NAS and SAN are just too complex and, even though most IoT devices are based on Linux, far too many additional components (drivers, file systems, etc.) would be needed to make them work, and security could become an issue. Object storage is the way to go: it can be accessed directly via native APIs or HTTP, which makes it easier to use from the application.
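As a rough illustration of why HTTP access is simpler for a small device, the Python sketch below builds an object upload using only the standard library, with no drivers or mounted file systems involved. The host name, port and `/container/object` URL layout are assumptions made for the example, not the API of any particular object store.

```python
import urllib.request


def build_put(base_url, container, name, data,
              content_type="application/octet-stream"):
    """Build (but do not send) an HTTP PUT that stores `data` as an object."""
    url = f"{base_url}/{container}/{name}"  # hypothetical URL layout
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": content_type},
    )


# Hypothetical endpoint on the local network.
request = build_put("http://storage.local:8080", "telemetry",
                    "car42/0001.json", b'{"speed": 87}',
                    content_type="application/json")
# urllib.request.urlopen(request) would send it once a real endpoint exists.
```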
IoT storage must be distributed. You can’t think in terms of a single storage device; on the contrary, a multitude of devices, each with a small amount of storage, can easily be part of a large distributed storage system. Think about 1000 Raspberry Pis, for example, each one of them with 300GB available. That would be 300TB of raw capacity (100TB usable with three-way replication)!
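The arithmetic behind those figures is simple enough to check in a few lines of Python. The helper name is made up for this example, and decimal units (1TB = 1000GB) are assumed.

```python
def capacity_tb(nodes, gb_per_node, replicas):
    """Return (raw, usable) capacity in TB for a replicated cluster."""
    raw_tb = nodes * gb_per_node / 1000  # decimal units: 1TB = 1000GB
    usable_tb = raw_tb / replicas        # every object is stored `replicas` times
    return raw_tb, usable_tb


print(capacity_tb(1000, 300, 3))  # (300.0, 100.0)
```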
It’s a compelling idea, but this approach has its challenges. Thousands of nodes for just hundreds of TB of storage? That means massive scalability, a lot of rebalancing whenever a node disappears, and complex node discovery and management that could impact performance: all problems that could make the system unusable pretty quickly. And while these issues are most evident in this particular (and futuristic) scenario, it is also true that small ARM-based servers are of growing interest to hyperscalers and large organizations, driving up the number of servers and posing identical challenges for storage infrastructures.
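To get a feel for the rebalancing problem, the Python sketch below uses a deliberately naive placement rule, hash of the object name modulo the node count, and measures how many objects change node when the cluster shrinks from 1000 to 999 nodes. This is a simplified model chosen for illustration, not how any specific product places data; real systems use techniques such as consistent hashing precisely to limit this kind of movement.

```python
import hashlib


def node_for(key, node_count):
    """Naive placement: hash the object name, take it modulo the node count."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def moved_fraction(keys, nodes_before, nodes_after):
    """Fraction of objects whose node changes when the cluster size changes."""
    moved = sum(
        1 for k in keys
        if node_for(k, nodes_before) != node_for(k, nodes_after)
    )
    return moved / len(keys)


keys = [f"object-{i}" for i in range(10_000)]
print(moved_fraction(keys, 1000, 999))
```

With this naive rule almost every object is assigned to a different node after a single node leaves, which is exactly the kind of mass rebalancing a cluster of small devices cannot afford.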
Challenging, but not for OpenIO
OpenIO SDS is a scale-out object storage platform that runs on nano-nodes and… does all the magic. It doesn’t require a lot of resources (a Raspberry Pi can easily run SDS, and our nano-nodes have even fewer resources in terms of CPU cores!). In fact, it was designed from day one to be lightweight, matching the kind of resources you find in an IoT device (does that ring a bell?). What’s more, SDS doesn’t work like other object storage systems: it isn’t based on distributed hash tables, and load balancing is dynamic, designed around what we call Conscience technology, a set of advanced algorithms that measure cluster resources and make them available in real time. Adding and removing nodes happens very quickly, and failures are handled in a matter of seconds.
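The general idea of placing data according to live measurements rather than a fixed hash can be sketched in a few lines of Python. To be clear, this is not the Conscience algorithm or any OpenIO API: the `NodeStatus` fields, the scoring formula and the function names are all assumptions invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    """Latest measurements reported by a node (hypothetical fields)."""
    name: str
    free_space: float  # fraction of capacity still free, 0..1
    idle_cpu: float    # fraction of CPU currently idle, 0..1
    alive: bool = True


def score(node):
    """Higher is better; a node that stopped reporting scores zero."""
    if not node.alive:
        return 0.0
    return 100 * node.free_space * node.idle_cpu


def pick_nodes(nodes, copies):
    """Choose the `copies` best-scoring nodes for a new object."""
    candidates = [n for n in nodes if score(n) > 0]
    if len(candidates) < copies:
        raise RuntimeError("not enough healthy nodes")
    ranked = sorted(candidates, key=score, reverse=True)
    return [n.name for n in ranked[:copies]]
```

Because placement depends on current scores and not on the total number of nodes, a node that joins or fails only affects where new objects go. In this sketch, nothing already stored has to move just because the cluster changed size.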
Conscience technology enables another feature: Grid for Apps. By knowing how many resources are available and where they are, SDS can run applications directly on the storage nodes, triggered by events and without any additional orchestration tool. This is a compelling characteristic of our system: it keeps compute and storage close to each other, so that applications run where the data resides! For example, think about running image recognition software on one of the devices with an ARM GPU available on the network every time a new image is taken, adding metadata to the object and then making it accessible to other applications in the micro datacenter.
OpenIO SDS, thanks to Conscience technology, a lightweight design and other unique characteristics, can be installed on any type of hardware infrastructure, including containers, the smallest of devices or larger x86 servers, and can create a resilient storage layer for a large number of use cases… And with Grid for Apps you can leverage unused networked resources to run applications where they are needed, where the data resides!
This idea of a micro datacenter based on IoT devices is really intriguing, but we aren’t quite there yet. It is exciting, though, that we have a technology that is ready for what could be the next wave of compute (if it ever happens) and that can already give our customers the best freedom of choice today.
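The event-driven pattern described here can be illustrated with a toy in-memory store in Python. This is not the Grid for Apps API: `TinyObjectStore`, `on_put` and the fake tagger are hypothetical names invented for the sketch, and the "image recognition" is a stand-in that only records the object size.

```python
class TinyObjectStore:
    """Toy in-memory object store that runs handlers when objects arrive."""

    def __init__(self):
        self.objects = {}   # name -> (data, metadata dict)
        self.handlers = []  # list of (predicate, handler) pairs

    def on_put(self, predicate, handler):
        """Register `handler` for every new object whose name matches."""
        self.handlers.append((predicate, handler))

    def put(self, name, data, **metadata):
        self.objects[name] = (data, dict(metadata))
        for predicate, handler in self.handlers:
            if predicate(name):
                # The handler returns extra metadata to attach to the object.
                self.objects[name][1].update(handler(name, data))

    def metadata(self, name):
        return self.objects[name][1]


def fake_image_tagger(name, data):
    """Stand-in for image recognition: it only records the object size."""
    return {"size_bytes": len(data), "tagged": True}


store = TinyObjectStore()
store.on_put(lambda name: name.endswith(".jpg"), fake_image_tagger)
store.put("camera/0001.jpg", b"\xff\xd8\xff", camera="front-door")
print(store.metadata("camera/0001.jpg"))
```

Other applications can then read the enriched metadata from the store without knowing which device produced it or which handler ran.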