Big Data, Cloud & DevOps

The role of the data curator: Make data scientists more productive

Ready to learn Data Science? Browse courses like Data Science Training and Certification developed by industry thought leaders and Experfy in Harvard Innovation Lab. The ability to harness data to solve critical business challenges is an essential skill for every organization today. There are two primary roles responsible for this function—data scientists and data analysts. Unfortunately these

Fake News and the Responsibility of Data Scientists

95% of statistics are made up. Discussions about fact versus truth come up quite a bit these days, especially with the proliferation of “fake news” and the news media’s coverage of certain

5 MINUTES READ Continue Reading »

Getting That Data Science Job

Like a baby bird jumping from the nest. Most people who attempt to get hired as a data scientist fail. This article aims to clarify what is happening and increase the chances

3 MINUTES READ Continue Reading »

  • Does It Make Sense to Do Big Data with Small Nodes?

    In this age of big data and powerful commodity hardware, there’s an ongoing debate about node size. Does it make sense to use a lot of small nodes to handle big data workloads? Or should we instead use only a handful of very big nodes? If we need to process 200TB of data, for example, is it better to do so with 200 nodes with 4 cores and 1 terabyte each, or to use 20 nodes with 40 cores and 10 terabytes each? One reason we hear is that having all this processing power doesn’t really matter because it’s all about the data. If nodes are limited to a single terabyte, increasing processing power doesn’t really help things and only serves to make bottlenecks worse.

    8 MINUTES READ Continue Reading »

    Data is a stakeholder

    Data science is currently very good at coming up with answers. It’s not very good at coming up with questions. I believe that requires data scientists to pay more attention to building non-technical skills, but I think it also requires us to build more tools that facilitate that part of the process. In fact, building the tools will contribute, in large measure, to building the non-technical skills.

    21 MINUTES READ Continue Reading »

    Trying to Persuade with Data?

    Data professionals have to consider the environment around them when creating a data story. It’s not enough to find an issue and then start raising a red flag. Consider your audience and craft your message so that they can hear bad news, check whether others even see the issue as a problem, and then work with them to solve it.

    3 MINUTES READ Continue Reading »

    Blurred Lines: Data Analyst vs Data Science

    In a world of exponential data growth, companies are turning to two jobs to solve some of their biggest problems: Data Analyst and Data Scientist. However, it’s becoming more apparent that the business world is unsure how to define the scope of these roles and differentiate between them. The skills required in both are nearly identical, but there is a key difference that separates them. Businesses need to ensure they do not blur the lines.

    3 MINUTES READ Continue Reading »

    Piketty Revisited: Improving Economics through Data Science

    The data curation step involves discovering, analyzing, cleaning, transforming, combining, and de-duplicating data sources to produce target data sources that meet the requirements for input to the analysis. Every data curation step should be documented as data provenance that is then compared against the controls to determine the extent to which the appropriate data governance was followed and the required data quality was achieved. 

    3 MINUTES READ Continue Reading »

    Doubt and Verify: Data Science Power Tools

    In all fields, new facts and knowledge are constantly being produced based on new data, discoveries, experience, and research, far more than a single individual can absorb, let alone put into practice. So how do professionals, or anyone, understand that they have a bias, its nature and limitations? And how do they re-evaluate their knowledge (world view) in light of new facts (“ground truth”) and conclusions?

    8 MINUTES READ Continue Reading »

    What’s the difference between data science, machine learning, and artificial intelligence?

    When I introduce myself as a data scientist, I often get questions like “What’s the difference between that and machine learning?” or “Does that mean you work on artificial intelligence?” I’ve responded enough

    6 MINUTES READ Continue Reading »

    What would be useful for aspiring data scientists to know?

    Now that I have secured a Data Science (DS) job, some people have come to ask me questions about how I made the transition into DS and into industry in general. I hope

    9 MINUTES READ Continue Reading »

    Sixteen useful Advices for Aspiring Data Scientists

    Why is data science sexy? It has something to do with the many new applications and entire new industries that come into being from the judicious use of copious amounts of data. Examples include

    20 MINUTES READ Continue Reading »