When Steve Jobs launched the iPhone in 2007, the device was cool, but it was just a nice piece of hardware with a fancy OS. What really changed the market happened later with the introduction of standard API sets enabling developers to build applications.
iOS and Android allow developers to build applications without thinking too much about the device underneath. In most cases, apps take advantage of services and APIs available without having to deal directly with hardware, connectivity, or anything else.
Even though IoT is a totally different field, a standard platform to enable developers to concentrate on their applications without thinking about hardware, connectivity, and code written for specific devices would be a huge leap forward for the realization of complex projects. This would reduce complexity, simplify development, and make the ecosystem less expensive.
Most of today’s efforts are focused on building better networks and optimizing their use. This is not wrong per se, but most people think of IoT as depending on always-connected applications. The reality, as we are discovering, is a bit more nuanced than that.
The always-connected approach is perfect for some applications, but others generate boatloads of data, and not all of it is needed at the core. More importantly, data often has to be stored locally, optimized, and sent only when the network is available, prioritized according to application or user needs.
I agree that we will need better connectivity, but at the same time we need mechanisms that are able to better understand data, organize it, and send it to a central repository for consolidation. Traditional data replication, compression, and other forms of optimization are not efficient enough.
Just the other day I was discussing a use case with a potential customer: they have sensor data saved on machinery deployed in the middle of nowhere that can connect to a central hub only once per day; this data is validated and saved locally first, then optimized and sent during a short time window at the end of each workday.
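The flow in this use case (validate, save locally, then optimize and send in priority order during a short window) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the customer's implementation: the `Reading` and `StoreAndForwardBuffer` names are hypothetical, an in-memory heap stands in for real local storage, gzip stands in for "optimization", and a byte budget stands in for the short connectivity window.

```python
import gzip
import heapq
import json
from dataclasses import dataclass, field


@dataclass(order=True)
class Reading:
    """One sensor reading; ordering uses priority first, then timestamp."""
    priority: int                        # lower value = sent first (assumption)
    timestamp: float
    payload: dict = field(compare=False)


class StoreAndForwardBuffer:
    """Hypothetical local buffer: in-memory stand-in for on-device storage."""

    def __init__(self):
        self._heap = []

    def save(self, reading):
        # Validate before saving locally, as in the use case above.
        if "sensor" not in reading.payload or "value" not in reading.payload:
            raise ValueError("payload must contain 'sensor' and 'value'")
        heapq.heappush(self._heap, reading)

    def pending(self):
        return len(self._heap)

    def flush(self, send, byte_budget):
        """Compress and send readings in priority order until the budget runs out.

        `send` is any callable taking bytes; `byte_budget` approximates how
        much data fits into the daily connectivity window.
        """
        sent = 0
        used = 0
        while self._heap:
            reading = self._heap[0]      # peek at the most urgent reading
            blob = gzip.compress(
                json.dumps(reading.payload, sort_keys=True).encode("utf-8")
            )
            if used + len(blob) > byte_budget:
                break                    # keep the rest for the next window
            send(blob)
            heapq.heappop(self._heap)    # drop it only after it was sent
            used += len(blob)
            sent += 1
        return sent
```

Whatever does not fit in the window simply stays in the buffer until the next one, so the most important data always travels first.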
Storage is a fundamental element of this use case. You can't simply save data on a local file system, for example; an abstraction layer is necessary to make everything transparent to the application and to make it as portable as possible, without hardware dependencies.
At the same time, you can't implement storage in the same way you do for traditional applications. From the hardware perspective, reliability, when necessary, can't be attained with component redundancy because of the cost and the complexity of the resulting device. You should have data replicated locally on other devices to minimize risks and improve overall system availability.
From the software point of view, a scale-out object store is the only way to go. It's easy to set up, scalable, secure, and accessible via APIs. In fact, it is much easier to implement an object store on a small device and add additional nodes to the same network for redundancy than it is to implement a larger, more powerful device: if you need more storage that is faster and more durable, you just add small devices in a scale-out fashion.
Again, this is not applicable to all use cases, but by making it possible to build unattended infrastructures at the edge, it opens the door to a large number of use cases.
Doing compute at the edge is not like doing it at the core or in the cloud. You can't have the same stack of hypervisors, OSes, orchestrators, etc. Applications must be simple, as stateless as possible, abstracted from the hardware, and portable. A serverless computing framework is perfect for this purpose: simple functions written to do simple tasks and triggered by events (or time). And the stateless form of the function is the perfect companion for object storage's persistence. Once again, reliability and scalability of the platform are not obtained with larger and more expensive devices, but by adding more small components to the same network.
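To make the pairing of stateless functions and object storage concrete, here is a minimal Python sketch. It does not use the API of any real framework or object store: `InMemoryObjectStore`, `summarize`, `HANDLERS`, and `dispatch` are hypothetical names, and the event format is an assumption made for illustration.

```python
from statistics import mean


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store reached through an API."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


def summarize(event, store):
    """Stateless function: all input arrives in the event, all output
    is persisted in the object store, nothing is kept in the function."""
    values = event["values"]
    summary = {
        "sensor": event["sensor"],
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }
    key = f"summaries/{event['sensor']}/{event['batch_id']}"
    store.put(key, summary)
    return key


# Event type -> function to trigger (assumed event name, for illustration).
HANDLERS = {"batch_ready": summarize}


def dispatch(event_type, event, store):
    """Run the function registered for this event type."""
    return HANDLERS[event_type](event, store)
```

Because the function keeps no state of its own, any node on the network that can reach the store can run it, which is what makes adding small devices a practical way to scale.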
AWS Greengrass is a concrete example; it brings functions to the edge and IoT devices. It doesn't include a storage option yet, but I'm sure that AWS customers are already asking for it.
The technology I described above is already available. OpenIO, for example, has deployed it with large x86 servers for multi-petabyte cloud installations, on nano-node based appliances, and on devices as small as Raspberry Pis. And, again, we are not alone in thinking about a similar approach. Most cloud providers, such as Amazon with Greengrass, are building similar stacks.
The cool part of this story is that SDS and Grid for Apps have been showing their capabilities at the core for some time, and we are sure they can do the same at the edge.
At OpenIO, we are helping our customers build end-to-end infrastructures that can face data storage and processing challenges at any scale in the cloud, and now we are working on delivering these capabilities at the edge too, seamlessly, with the same technology and tools.