Big Data, Cloud & DevOps

Data Science: Why Retail Will Reap the Biggest Rewards

Organizations are using advanced analytics to do everything from understanding their customers to improving forecasting, driving better, faster results. While the impact of these approaches is being felt across nearly every industry, retail stands to reap the biggest benefits. With more big box retailers announcing layoffs, store closures, and bankruptcies, data science may just be the secret weapon for success.

The Most in Demand Skills for Data Scientists

Data scientists are expected to know a lot — machine learning, computer science, statistics, mathematics, data visualization, communication, and deep learning. Within those areas there are dozens of languages, frameworks, and technologies data scientists could learn. How should data scientists who want to be in demand by employers spend their learning budget? Which skills are most in demand for data scientists? 

7 MINUTES READ Continue Reading »

Three Signs You’re Suffering from Data Mismanagement, and How to Fix it

We have more data at our fingertips than entire generations before us. But due to data mismanagement issues, for many companies that doesn’t mean very much. Data Management is a serious obstacle for companies that want to increase productivity, collaborate more efficiently and make data-driven decisions. The bright side is that executives are recognizing the need for improved Data Intelligence strategies. So, how can enterprises identify and fix their Data Management challenges? There are three major signs that your data isn’t being leveraged fully.

3 MINUTES READ Continue Reading »

  • Are Data Lakes Just Dumping Grounds?

    Data lakes quickly emerged as a technology front-runner in the race to make data more digestible – and to finally get it in one place. Data lakes are flexible and scalable, and they offer an easy way to store data. The article outlines common problems with data lakes, strategies businesses can use to avoid those problems, and how governance enables a data lake to become more than just a data repository, so that data moves beyond raw material to take its rightful place as a valuable business asset.

    3 MINUTES READ Continue Reading »

    Hypothesis Testing In real life

    Hypothesis testing, or statistical hypothesis testing, is a technique used to compare two datasets or a sample from a dataset. It is a statistical inference method, so at the end of the test you draw a conclusion (you infer something) about the characteristics of what you are comparing. Before even thinking about which test to use, you need to define your hypotheses and set the significance level of the test. Only then are you ready to pick the statistical test.
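The order of steps described above (state the hypotheses, fix the significance level, then pick and run a test) can be sketched in a few lines. This is only an illustration under stated assumptions: the two samples are made-up numbers, the 0.05 significance level is a common convention rather than something the article prescribes, and a permutation test is used simply because it needs nothing beyond the Python standard library.

```python
import random
from statistics import mean

def permutation_test(sample_a, sample_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    Null hypothesis: both samples come from the same distribution,
    so shuffling the group labels should not matter.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(sample_a) - mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Step 1: hypotheses. H0: the group means are equal; H1: they differ.
# Step 2: significance level (assumed value, chosen before looking at results).
ALPHA = 0.05

# Step 3: run the chosen test on hypothetical measurements.
group_a = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 12.0]
group_b = [13.0, 12.9, 13.4, 13.1, 12.8, 13.3, 13.2, 13.0]
p_value = permutation_test(group_a, group_b)
reject_null = p_value < ALPHA

print(f"p-value: {p_value:.4f}, reject H0: {reject_null}")
```

Fixing `ALPHA` before computing `p_value` mirrors the point of the teaser: the decision rule is set first, and the test only supplies the evidence.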

    8 MINUTES READ Continue Reading »

    Bridging the Data Gap Between Business and IT

    There are distinct differences in the way business executives and IT professionals think about their company’s data. The disconnect is a result of perspective and manifests itself in a lack of communication that produces further confusion as IT systems evolve to meet the needs of a rapidly forward-charging business. The result of this misunderstanding is often friction between the business and IT, and the two sides need a decoder ring to help them understand each other. The decoder ring is an application created by the right people with the right experience to confront and address this known problem.

    6 MINUTES READ Continue Reading »

    A Guide to Data Warehouse Automation for Today’s CDOs

    The four “Vs” of data are well known – volume, velocity, variety and veracity. However, Data Warehousing infrastructure in many organizations is no longer equipped to handle these. The fifth “V” – value – is even more elusive. Meeting these challenges at the scale of data that modern organizations have requires a new approach – and automation is the bedrock. Creating a successful Data Warehouse, then, is critical for CDOs to succeed in monetizing data within their organization.

    5 MINUTES READ Continue Reading »

    It’s All in the Preparation: Four Strategies to Monetize Your Data

    As organizations continue to evolve their information strategy and find innovation opportunities, many seek to generate value through data monetization. Self-Service Data Prep technology allows the data-to-information process to take place in the line of business, where most of the knowledge about the data, its context and its meaning resides. This enables organizations to turn data into monetizable information assets – rapidly and seamlessly. While data monetization at its core is tied to a tangible financial value, it does not always translate to a commercial product. Companies monetize their data in one of four ways.

    3 MINUTES READ Continue Reading »

    Top Five Mistakes of Greenhorn Data Scientists

    Young Data Scientists provide tremendous value to companies. They’re fresh off taking online courses and can provide immediate help. They’re often self-taught, as few universities offer Data Science degrees, and thus show tremendous commitment and curiosity. They’re enthusiastic about the field they’ve chosen and are eager to learn more. This article examines five common mistakes of early-career Data Scientists. Avoiding these pitfalls will help you succeed in your first Data Science job and better prepare you for work in real life.

    4 MINUTES READ Continue Reading »

    Linear regression in real life

    Linear regression is a linear approach to modelling the relationship between a dependent variable and one or more explanatory variables. In simple linear regression, a single independent variable is used to predict the value of a dependent variable. In multiple linear regression, two or more independent variables are used. The only difference between the two is the number of independent variables. When you need to estimate a quantity from a number of factors, and the relationship can be described by a straight line, you can use a linear regression model.
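To make the simple (single-variable) case concrete, here is a minimal sketch of an ordinary least squares fit in plain Python. The data points are invented for illustration and lie exactly on a line, and the function names are hypothetical. A real project would more likely use a statistics or machine learning library.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Estimate the dependent variable for a new value of x."""
    return slope * x + intercept

# Hypothetical data: one independent variable, one dependent variable.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_simple_linear_regression(xs, ys)
estimate = predict(6, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, estimate at x=6: {estimate:.2f}")
```

Multiple linear regression follows the same idea with one coefficient per independent variable, which is usually solved with matrix methods rather than the closed-form expressions used here.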

    5 MINUTES READ Continue Reading »

    Five Reasons “Logistic Regression” should be the first thing you learn when becoming a Data Scientist

    The most important thing to learn to become a Data Scientist is the pipeline, i.e., the process of getting and processing data, understanding the data, building the model, evaluating the results (both of the model and of the data processing phase) and deployment. Learn Logistic Regression first to become familiar with the pipeline without being overwhelmed by fancy algorithms. Here are five reasons to start with Logistic Regression on the way to becoming a Data Scientist.
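As a rough illustration of that pipeline, the sketch below walks through getting data, processing it, building a logistic regression model and evaluating it, using only the standard library. Everything specific here is an assumption made for the example: the toy dataset (hours studied versus pass or fail), the learning rate, the number of epochs and the function names. Deployment is left out, and the model is evaluated on its own training data only to keep the sketch short.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def standardize(values):
    """Processing step: rescale a feature to zero mean and unit variance."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values]

def train(xs, ys, learning_rate=0.5, epochs=2000):
    """Model step: fit weight and bias by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def evaluate(xs, ys, w, b):
    """Evaluation step: share of examples classified correctly at a 0.5 threshold."""
    predictions = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]
    return sum(p == y for p, y in zip(predictions, ys)) / len(ys)

# Getting the data: hypothetical hours studied and pass (1) / fail (0) outcomes.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

features = standardize(hours)
w, b = train(features, passed)
accuracy = evaluate(features, passed, w, b)
print(f"weight={w:.3f}, bias={b:.3f}, training accuracy={accuracy:.2f}")
```

Each function corresponds to one stage of the pipeline, so swapping in a more sophisticated model later only changes `train`, not the surrounding steps.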

    6 MINUTES READ Continue Reading »

    K-Means in Real Life: Clustering Workout Sessions

    K-means clustering is a very popular unsupervised learning algorithm. It takes your data and learns how it can be grouped. Through a series of iterations, the algorithm creates groups of data points, referred to as clusters, that have similar variance and that minimize a specific cost function. By using the within-cluster sum of squares as the cost function, data points in the same cluster will be similar to each other, whereas data points in different clusters will have a lower level of similarity.
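The iterations described above can be sketched as follows: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until the assignments stop changing. This is a simplified illustration, not a production implementation. The points are made up, and the first `k` points are used as starting centroids only to keep the example deterministic (practical implementations usually use randomized or smarter initialization).

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, max_iterations=100):
    """Basic k-means; returns labels, centroids and the within-cluster sum of squares."""
    # Simplification: start from the first k points instead of a random choice.
    centroids = [points[i] for i in range(k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    # Cost function: within-cluster sum of squares.
    wcss = sum(squared_distance(p, centroids[label]) for p, label in zip(points, labels))
    return labels, centroids, wcss

# Hypothetical two-dimensional data with two visibly separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
labels, centroids, wcss = k_means(points, k=2)
print(labels, centroids, round(wcss, 4))
```

A lower `wcss` means points sit closer to their own centroid, which is the sense in which points within a cluster are similar to each other.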

    5 MINUTES READ Continue Reading »