Companies are doing a lot of interesting things in the IoT and industrial IoT space, but it often seems to me that while there is plenty of exciting, flashy work happening at the edge, not much is being done with the collected data afterward.
For example, we are being inundated with smart retail, smart manufacturing, self-driving cars, robotics and other amazing use cases. These things are incredible, but we rarely see trend analysis afterward or real-time reaction to the collected data. We should be seeing far more insight coming out of the edge. We see the basics, but the advanced analysis we have come to expect from machine learning and predictive analytics is missing, and so is the real-time data analysis and reaction that is entirely possible today.
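To make that last point concrete, here is a minimal sketch in Python of the kind of real-time reaction that is possible as readings arrive: a rolling-window anomaly check that reacts on the spot instead of waiting for a batch job. The window size, threshold and alert action are all illustrative assumptions, not taken from any particular product.

```python
from collections import deque
from statistics import mean, pstdev

# Illustrative parameters -- a real deployment would tune these.
WINDOW = 60        # how many recent readings form the baseline
THRESHOLD = 3.0    # flag readings more than 3 standard deviations out

window = deque(maxlen=WINDOW)

def trigger_alert(value: float, baseline: float) -> None:
    # Stand-in for a real reaction: shut a valve, page an operator, etc.
    print(f"anomaly: {value:.2f} (baseline {baseline:.2f})")

def on_reading(value: float) -> None:
    """Evaluate each reading the moment it arrives."""
    if len(window) >= 10:  # wait for a minimal baseline first
        mu, sigma = mean(window), pstdev(window)
        if sigma > 0 and abs(value - mu) > THRESHOLD * sigma:
            trigger_alert(value, mu)
    window.append(value)

# Simulated stream: twenty steady readings, then a spike.
for v in [20.1, 20.3, 19.9, 20.2] * 5 + [35.0]:
    on_reading(v)
```

Nothing here is sophisticated, and that is the point: even reactions this simple are rarely wired up, because the surrounding data plumbing is where all the effort goes.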
Until recently, this was just my general impression of the industry, based on many conversations I've had with developers, customers and partners. But the article "Internet of things data proving to be a hard thing to use" by Sharon Shea takes a much deeper look and confirms my suspicion. It reports, based on research by Forrester, that only 1% of the IoT data collected is ever used. That is staggering, but it should not be surprising.
Why are companies collecting so much data yet using so little? I believe the answer is straightforward: this has traditionally been a very hard problem to solve. It is compounded by the fact that most of the people tackling it, while incredibly smart, are first and foremost electrical engineers and do not have strong backgrounds in big data.
Most engineers implementing IoT projects are, with good reason, heavily focused on hardware challenges like battery life, device size and selecting the right manufacturer. They are also understandably focused on solving their primary use cases and getting to market quickly, not on value-added analytics, which normally comes later.
They do not have time to design and build the traditional big data architectures required to consume IoT data, transform it into valuable insights, identify patterns and make it actionable. They lack the time, the skill set and the incentive to fully optimize their data value chain.
How could they? The technologies available on the market are either very basic, very complex or very expensive. Often a data value chain optimized for trend analysis and analytics comprises five to seven different products: a caching mechanism on the device, an IoT gateway, a NoSQL database, a SQL database, a map-reduce technology, and middleware to tie it all together. This requires a large investment in licenses, infrastructure and technical resources, on top of the capital already spent on IoT hardware.
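To illustrate why this stitching is such an investment, here is a minimal sketch, using only Python's standard library, of just the first link in such a chain: the on-device cache, buffering readings in SQLite and flushing them to a gateway over HTTP. The gateway URL, table layout and payload shape are all assumptions made for the sake of the example.

```python
import json
import sqlite3
import time
import urllib.request

GATEWAY_URL = "http://gateway.local:8080/ingest"  # hypothetical endpoint

# On-device cache: a single SQLite table buffering unsent readings.
db = sqlite3.connect("edge_cache.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS readings ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " sensor TEXT, value REAL, ts REAL)"
)

def cache_reading(sensor: str, value: float) -> None:
    """Buffer a reading locally so nothing is lost while the uplink is down."""
    db.execute(
        "INSERT INTO readings (sensor, value, ts) VALUES (?, ?, ?)",
        (sensor, value, time.time()),
    )
    db.commit()

def flush_to_gateway(batch_size: int = 100) -> None:
    """Ship buffered readings to the gateway; delete them only on success."""
    rows = db.execute(
        "SELECT id, sensor, value, ts FROM readings ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    if not rows:
        return
    payload = json.dumps(
        [{"sensor": s, "value": v, "ts": t} for _, s, v, t in rows]
    ).encode()
    req = urllib.request.Request(
        GATEWAY_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)  # raises on failure, so rows stay cached
    db.execute("DELETE FROM readings WHERE id <= ?", (rows[-1][0],))
    db.commit()

cache_reading("temp-01", 21.7)
try:
    flush_to_gateway()
except OSError:
    pass  # gateway unreachable in this demo; readings remain cached
```

And this is only the first hop. The gateway still has to fan the data out into the NoSQL store, the SQL store and the map-reduce layer, each with its own schema, failure modes and operational overhead, which is exactly the work most teams end up deferring.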
As a result, they often turn to technologies that provide the most basic functions and let them move forward with the least resistance, planning that, after reaching the market, they will circle back to optimizing their data value chain and monetizing the other 99% of the data they are collecting. As we know, this rarely, if ever, happens.
As I mentioned above, solving big data problems is genuinely complicated. Database vendors have taken the path of offloading that complexity onto the developer instead of internalizing it and presenting simple solutions to complex problems. Furthermore, the people innovating within the technology landscape are very different from those of 15 years ago, yet the most popular database technologies, like MongoDB and SQLite, date back to 2009 and 2000 respectively.
The people driving innovation today are electrical engineers, data scientists, artificial intelligence researchers and quantum physicists. They are all incredibly smart, but they are not first and foremost computer programmers. Furthermore, many new developers do not have four-year computer science degrees but instead graduated from code school. As a result, the data management industry needs to focus on providing solutions to massive big data problems that empower these innovators to build incredible things rather than wrestle with overly complex and expensive big data architectures that leave 99% of data unused.
As programming becomes a more cross-disciplinary skill, it is unreasonable to expect database end users to have 10, 15 or 20 years of experience implementing data management systems. These technologies need to be easy to install, easy to maintain and easy to use. This is especially important in IoT. In other verticals and workloads, the recent answer has been database as a service (DBaaS). That does not work in IoT, where the data value chain begins far outside the cloud.
Adoption of IoT, in turn, is going to continue to drive adoption of hybrid cloud. As a result, the solution to these problems cannot simply be a managed service wrapped around an incredibly complex product. The product itself needs to be simple and easy to use, yet highly scalable, so that it can be deployed and managed directly on the edge and in the cloud with the same level of effort and the same skill set. To continue to drive and support innovation, the data management industry will need to transform to support a new generation of innovators.