Dynamic Programming for Data Scientists

Rahul Agarwal Rahul Agarwal
January 31, 2020 Big Data, Cloud & DevOps

Algorithms and data structures are an integral part of data science. While most of us data scientists don’t take a proper algorithms course while studying, they are important all the same.

Many companies ask data structures and algorithms as part of their interview process for hiring data scientists.

Now the question that many people ask here is what is the use of asking a data scientist such questions. The way I like to describe it is that a data structure question may be thought of as a coding aptitude test.

We all have given aptitude tests at various stages of our life, and while they are not a perfect proxy to judge someone, almost nothing ever really is. So, why not a standard algorithm test to judge people’s coding ability.

But let’s not kid ourselves, they will require the same zeal to crack as your Data Science interviews, and thus, you might want to give some time for the study of algorithms and Data structure and algorithms questions.

This post is about fast-tracking this study and explaining Dynamic Programming concepts for the data scientists in an easy to understand way.


How Dynamic Programming Works?

Let’s say that we need to find the nth Fibonacci Number.

Fibonacci series is a series of numbers in which each number ( Fibonacci number ) is the sum of the two preceding numbers. The simplest is the series 1, 1, 2, 3, 5, 8, etc. The answer is:

def fib(n):
if n<=1:
return 1
return fib(n-1) + fib(n-2)

This problem relates well to a recursive approach. But can you spot the problem here?

If you try to calculate fib(n=7) it runs fib(5) twice, fib(4) thrice, fib(3) five times. As n becomes larger, a lot of calls are made for the same number, and our recursive function calculates it again and again.

[Source](https://www.rubyguides.com/2015/08/ruby-recursion-and-memoization/)

Source

Now, Recursion is essentially a top-down approach. As in when calculating Fibonacci number n we start from n and then do recursive calls for n-2 and n-1 and so on.

In Dynamic programming, we take a bottom-up approach. It is essentially a way to write recursion iteratively. We start by calculating fib(0) and fib(1) and then use previous results to generate new results.

def fib_dp(n):
dp_sols = {0:1,1:1}
for i in range(2,n+1):
dp_sols[i] = dp_sols[i-1] + dp_sols[i-2]
return dp_sols[n]

Why Dynamic Programming is Hard?

Recursion is a mathematical concept and it comes naturally to us. We try to find a solution to a bigger problem by breaking it into smaller ones.

Now Dynamic Programming entails exactly the same idea but in the case of Dynamic programming, we precompute all the subproblems that might need to be calculated in a bottom-up manner.

We human beings are essentially hard-wired to work in a top-down manner. Be it our learning where most people try to go into the breadth of things before going in-depth. Or be it the way we think.

So how does one start thinking in a bottom-up way?

I found out that solving the below problem gives a lot of intuition in how DP works. I myself got highly comfortable with DP once I was able to solve this one and hope it helps you too.

Basically the idea is if you can derive/solve a bigger subproblem if you know the solution to a smaller one?


Maximum Path Sum

Given a m x n grid filled with gold, find a path from top left to bottom right which maximizes the sum of gold along its path. We can only move down or right starting from (0,0)

Now there can be decidedly many paths. We can go all the way to the right and then the bottom. Or we can take a zigzag path?

But only one/few paths are going to make you rich.

So how do you even start thinking about such a problem?

When we think of Dynamic Programming questions, we take a bottom-up approach. So we start by thinking about the simplest of problems. In our case, the simplest of problems to solve is the base case. What is the maximum value of Gold we can acquire if we just had to reach cell (0,0)?

And the answer to that is pretty simple — It is the cell value itself.

So we move on to a little harder problem.

What about cell (0,1) and cell (1,0)?

These are also pretty simple. We can reach (0,1)and (1,0) through only (0,0) and hence the maximum gold we can obtain is the value in cell (0,1)/(1,0) plus the maximum gold we can have when we reach cell(0,0)

What about cell(0,2)? Again only one path. So if we know the solution to (0,1) we can just add the value of cell (0,2) to get the solution for (0,2)

Let’s now try to do the same for an arbitrary cell. We want to derive a relation here.

So in the case of an arbitrary cell, we can reach it from the top or from the left.If we know the solutions to the top and left of the cell, we can definitely compute the solution to the arbitrary current target cell.


Coding

Once we have the intuition the coding exercise is pretty straightforward. We start by calculating the solutions for the first row and first column. And then we continue to calculate the other values in the grid using the relation we got previously.

def maxPathSum(grid):
m = len(grid)
n = len(grid[0])
# sol keeps the solutions for each point in the grid.
sol = list(grid)
# we start by calculating solutions for the first row
for i in range(1,n):
sol[0][i] += sol[0][i-1]
# we then calculate solutions for the first column
for i in range(1,m):
sol[i][0] += sol[i-1][0]
# we then calculate all the solutions in the grid
for i in range(1,m):
for j in range(1,n):
sol[i][j] += max(sol[i-1][j],sol[i][j-1])
# return the last element
return sol[-1][-1]

Conclusion

In this post, I talked about how I think about Dynamic Programming questions.

I start by asking myself the simplest problem I could solve and if I can solve the bigger problem by using the solutions to the simpler problem.

Dynamic Programming forms the basis of some of the most asked questions in Data Science/Machine Learning job interviews, and a good understanding of these might help you land your dream job.

So go out there and do some problems with Leetcode/HackerRank. The problems are surely interesting.

  • Experfy Insights

    Top articles, research, podcasts, webinars and more delivered to you monthly.

  • Rahul Agarwal

    Tags
    Data Science
    Leave a Comment
    Next Post
    API Economy: Is It The Next Big Thing?

    API Economy: Is It The Next Big Thing?

    Leave a Reply Cancel reply

    Your email address will not be published. Required fields are marked *

    More in Big Data, Cloud & DevOps
    Big Data, Cloud & DevOps
    Cognitive Load Of Being On Call: 6 Tips To Address It

    If you’ve ever been on call, you’ve probably experienced the pain of being woken up at 4 a.m., unactionable alerts, alerts going to the wrong team, and other unfortunate events. But, there’s an aspect of being on call that is less talked about, but even more ubiquitous – the cognitive load. “Cognitive load” has perhaps

    5 MINUTES READ Continue Reading »
    Big Data, Cloud & DevOps
    How To Refine 360 Customer View With Next Generation Data Matching

    Knowing your customer in the digital age Want to know more about your customers? About their demographics, personal choices, and preferable buying journey? Who do you think is the best source for such insights? You’re right. The customer. But, in a fast-paced world, it is almost impossible to extract all relevant information about a customer

    4 MINUTES READ Continue Reading »
    Big Data, Cloud & DevOps
    3 Ways Businesses Can Use Cloud Computing To The Fullest

    Cloud computing is the anytime, anywhere delivery of IT services like compute, storage, networking, and application software over the internet to end-users. The underlying physical resources, as well as processes, are masked to the end-user, who accesses only the files and apps they want. Companies (usually) pay for only the cloud computing services they use,

    7 MINUTES READ Continue Reading »

    About Us

    Incubated in Harvard Innovation Lab, Experfy specializes in pipelining and deploying the world's best AI and engineering talent at breakneck speed, with exceptional focus on quality and compliance. Enterprises and governments also leverage our award-winning SaaS platform to build their own customized future of work solutions such as talent clouds.

    Join Us At

    Contact Us

    1700 West Park Drive, Suite 190
    Westborough, MA 01581

    Email: support@experfy.com

    Toll Free: (844) EXPERFY or
    (844) 397-3739

    © 2023, Experfy Inc. All rights reserved.