Many Hadoop installations have been focused on individual teams and their particular data analysis projects. But that’s changing. As Scott Carey points out in “11 Hadoop case studies in the enterprise,” businesses from big banks to airlines and retailers are deploying Hadoop at enterprise scale. He further asserts that “Forrester now says enterprise adoption of Hadoop is ‘mandatory,’ so any business that wants to derive value from its data should, at the very least, be looking at the technology.” Installing Hadoop and enabling a single team to use it has become a simple process, particularly if a cloud-based offering of the platform is acceptable. Whether on Azure, AWS, the Google Cloud Platform or another cloud provider’s infrastructure, provisioning a multi-node Hadoop instance can be done quickly, often with little more than point and click.
Does that mean, however, that the newly minted Hadoop cluster is ready for the enterprise? Is it ready to be used concurrently by teams of data scientists and analysts, and by applications across your company? Is your data lake ready to be stocked with information of every shape, size, and species? Probably not. Being ready for that takes more than having processes running on servers ready to respond to REST service calls. To enable your enterprise to share a common Hadoop cluster and the information it contains really means “eating the elephant.”
A quick Google search will give you plenty of hits about breaking dauntingly large tasks down to size in order to “eat the elephant” of an audacious goal. Mike Martel, however, takes a different tack: he suggests “You hack it up and have a party.” While he is still focused on achieving an overwhelming goal or project, his emphasis is on collaboration and parallel activity.
So what is the path for taking Hadoop from being the pet technology of a data science team here or there to being a platform on which a data-driven business strategy can be built? It takes collaboration and parallel activity.
And that takes investment. Like any significant investment, it deserves a vision to give it a purpose and a roadmap to give it direction. Here are three key benefits that a strategic roadmap for adopting Hadoop can offer your enterprise.
“If you don’t know where you’re going,
any road will take you there” — Lewis Carroll
First, a roadmap identifies a destination and the direction to take to get there. A traditional map illustrates many locations along with the highways, neighborhood streets, dirt paths or hiking trails that connect them, without strongly preferring any particular destination. A strategic roadmap for adopting Hadoop for your enterprise, by contrast, lays out one particular destination: a reusable, well-organized data lake on the Hadoop platform with myriad data subjects, types and formats useful to a varied audience of scientists, analysts, applications and others. By clarifying the destination, the roadmap also clarifies the paths that can be followed and how far each goes toward helping you reach that destination. By keeping you from getting stuck along the way, a roadmap like this helps you maintain momentum on the journey.
“All roads lead to Rome” — proverb
Second, a roadmap can illustrate that numerous paths lead to the same destination. If Hadoop is to be useful for a variety of teams and different kinds of data analysis, prediction and reporting capabilities, it is unlikely that all the teams that contribute to an enterprise-grade Hadoop environment will follow the same path. Give them all the same roadmap, however, and you can help them arrive at the same destination from different directions.
“Are we there yet?” — your kids
“Turn around when possible.” — your GPS
Finally, a roadmap points out milestones by which you measure progress and provides warning signs that you may be off course. Your kids like to know “how much further?” Your spouse likes to know “are you sure this is the right way?” Similarly, there will be people in your company who wonder the same things during your Hadoop journey. “It’s installed! Are we done?” “There’s lots of data in it, how much more work do we have to do?” “Are you sure this is the right way to use the capability of the platform we are building?” A strategic roadmap gives guidance regarding how to answer these questions.
Building this kind of roadmap takes time, but it’s worth the effort. Give some thought to the destination your company is seeking to reach through the capacity and power of the Hadoop platform. If you want help in putting the pieces of a strategic roadmap together to get you there, check out “Adopting Hadoop for the Enterprise.”