Do you remember 1993? Sure, you say. But really think back. We were on the cusp of a great digital revolution that reverberated throughout the world in ways that have changed business, innovation and humanity forever. But it was still only 1993. Few of us had cell phones or even laptops. The Internet existed, but most people hadn’t even heard of it. There were no search engines. No Facebook. No online businesses.
Then along came Marc Andreessen and Eric Bina with Mosaic, the first web browser that could display inline images. Netscape (founded by Andreessen along with Jim Clark) and Microsoft's Internet Explorer soon followed, and two years later everyone was merging onto the information highway. Businesses were including their websites in ads. Amazon was the leading online retailer of… books.
But, as exciting as the web was, it was clearly getting stuck. Around this time, Newsweek published a hype alert simply entitled “The Internet? Bah!” In it, Clifford Stoll lamented that “cyberspace isn’t, and will never be, nirvana.” One of his chief complaints? The “Internet has become a wasteland of unfiltered data.” Sound familiar?
Twenty-five years later, big data is in a similar predicament. There’s still great promise, and there is plenty of innovation at work. But much of it is behind the scenes, and much of that is still surprisingly limited. In a way, big data is trying to break out of the “for developers only” mold the same way ARPANET, the precursor to the Internet, had to evolve to appeal to more than just computer scientists. Of course, big data is farther along than that. If you’re trying to pinpoint exactly where we are on big data’s evolutionary journey, I’d say this is our 1999. So get ready to party, because just as the web broke out of its dial-up shackles thanks to the emergence of broadband, big data is about to turn onto the autobahn with the help of automation tools that make it far easier to use.
Remember, when Hadoop started to gain mainstream attention, it was supposed to democratize big data for everyone. After all, Hadoop helped reinvigorate Yahoo’s search platform, and the same multi-machine processing model powered other web giants like Google and Facebook. But as other organizations began chasing the big data dream, they discovered a big problem: they weren’t Yahoo!, Google or Facebook. They didn’t have the technical expertise, capital or resources to make big data work. These woes continue into the present day; Cloudera, for instance, recently revealed it was spending too much money and effort on getting its customers deployed.
This is a formidable problem not just for enterprises depending on Hadoop but for big data in general. Projects that do reach production take months and are extremely inefficient, requiring talent that remains scarce (the latest Harvey Nash / KPMG CIO Survey showed big data and analytics remain “the number one skill in short supply” for the fourth year in a row). And while cloud-based big data platforms like Microsoft Azure, Google Cloud Platform and AWS reduce some of big data’s complexity, they still require too much engineering expertise.
But here comes our version of broadband: automation tools that hide the underlying complexity of big data, making it accessible to more organizations. A new crop of big data innovators is emerging, using sophisticated algorithms to automate much of the manual coding and configuration required to make big data work, replacing labor- and time-intensive processes with automated ones. That means development, processing and analysis in a fraction of the time, with higher scalability, greater operational efficiency, and faster, more cutting-edge innovation. Even more importantly, these automation tools will bring a new level of simplicity that removes the complexity barrier for the vast majority of organizations either struggling with modern data architectures or shut out of them entirely.
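To make that abstraction gap concrete, here is a minimal sketch, not drawn from the article or from any of the vendors above, of what “hiding the underlying complexity” can look like in practice: a distributed word count written against PySpark’s high-level API. The equivalent hand-coded Hadoop MapReduce job would run to dozens of lines of Java plus cluster configuration; the paths and application name below are placeholders.

```python
# Illustrative sketch only: a distributed word count using PySpark's
# high-level API. Paths and the app name are placeholders, not details
# taken from the article.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("word-count-sketch").getOrCreate()

# Read every text file in the (hypothetical) input directory as an RDD of lines.
lines = spark.read.text("hdfs:///data/logs/*.txt").rdd.map(lambda row: row[0])

# The classic map and reduce steps, expressed in a few lines instead of a
# hand-written Mapper, Reducer, driver class and job configuration.
counts = (
    lines.flatMap(lambda line: line.split())
         .map(lambda word: (word, 1))
         .reduceByKey(lambda a, b: a + b)
)

counts.saveAsTextFile("hdfs:///data/output/word_counts")
spark.stop()
```

The point is not PySpark itself; it is that each generation of tooling pushes more of the plumbing below the waterline, and the automation tools described above aim to take that same direction considerably further.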
Finally, big data can be handed to the masses.