An HOV lane for engineers and engineering students to learn data science.
My background is in engineering, and I taught myself data science. Ten years ago, when I started learning data science, machine learning, and artificial intelligence, I read many articles on how to start and in what order to take courses. Honestly, I found that most recipes for learning data science are not efficient for engineering students.
In this article, I will give my fellow engineers what I believe is the best way to learn data science.
What is different about engineers?
My suggested fast road to learning data analytics is for engineers because most engineers (electrical, mechanical, aerospace, petroleum, and others) are already familiar with most of the statistical and algebraic basics of data science and AI. Many data science learning roadmaps focus on the theory and math behind the field. For many engineers, those courses and materials repeat what they already know, which discourages them right at the beginning.
Instead, I believe engineers can focus more on tools, methods, and computer skills. I realized that by learning tools and skills, you see the mathematics and logic behind the algorithms anyway, and I am sure they will look very familiar to you. This is a recipe for engineers, and I selected the order of courses to maximize your learning rate without sacrificing the concepts.
How fast is this HOV lane?
Normally, it takes 14–20 months to go from zero to a semi-professional level in data science and AI. My HOV lane reduces that time to 9 months, assuming you spend 10 hours per week on learning and practicing.
Without wasting more time, let’s get started …
Step 0, Learn Python
Stop searching “What language should I learn for data science?” on the internet. I am telling you: Python. End of story. If you don’t know Python (as I didn’t when I started), start with this course.
For data science and AI, you need to complete the first two courses: Getting Started with Python and Python Data Structures. Yes, that’s all you need. Don’t waste time on other modules that you won’t normally use.
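To give a flavor of the level those two courses aim at, here is a minimal sketch of the core Python data structures (lists, tuples, and dictionaries) at work. The well names and pressure readings are made up for illustration and do not come from the course.

```python
# Toy data: made-up pressure readings (psi) from three hypothetical wells.
readings = [
    ("well_a", 2150.0),
    ("well_b", 1980.5),
    ("well_a", 2200.0),
    ("well_c", 2050.0),
]

# Group the readings by well name using a dictionary of lists.
by_well = {}
for name, pressure in readings:
    by_well.setdefault(name, []).append(pressure)

# Average pressure per well, built with a dictionary comprehension.
averages = {name: sum(values) / len(values) for name, values in by_well.items()}

# Well names sorted by average pressure, highest first.
ranked = sorted(averages, key=averages.get, reverse=True)

print(averages)
print(ranked)
```

If you can read and write code like this comfortably, you are ready for the next step.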
Step 1, Learn Pandas
Now, you need to learn a Python library that lets you load and manipulate data. So, learn Pandas. I suggest watching this video series from Data School. The series has about 20–25 videos that teach you the most important Pandas skills.
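As a rough idea of what "load and manipulate data" means in practice, here is a small sketch with Pandas. The table is made up for illustration (the column names and values are my assumptions, not from the video series); with real data you would typically start from a file, for example with pd.read_csv.

```python
import pandas as pd

# Made-up sample data; with a real dataset you would usually load a file,
# e.g. pd.read_csv("your_file.csv") (hypothetical file name).
df = pd.DataFrame({
    "well": ["A", "A", "B", "B", "C"],
    "depth_m": [1200, 1250, 900, 950, 1500],
    "porosity": [0.18, 0.21, 0.12, None, 0.25],
})

# Fill the missing porosity value with the column mean.
df["porosity"] = df["porosity"].fillna(df["porosity"].mean())

# Keep only the rows deeper than 1000 m.
deep = df[df["depth_m"] > 1000]

# Average porosity per well.
mean_porosity = df.groupby("well")["porosity"].mean()

print(deep)
print(mean_porosity)
```

Filling missing values, filtering rows, and grouping are the kind of operations you will repeat in almost every project.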
Step 2, Learn Machine Learning
After finishing this video series, I recommend taking this Coursera Specialization: Applied Data Science with Python. The five courses in this University of Michigan specialization introduce learners to data science through the Python programming language.
Personally, I recommend taking only courses 1, 2, and 3 from this specialization. Again, I am trying to save you time and encourage you to dedicate your time and energy to the most important concepts.
Step 3, STOP and only exercise for a few weeks
By the time you reach this step, you are probably exhausted from learning. For a few weeks, STOP learning and start participating in a few competitions on Kaggle (a competition platform for machine learning and AI). I cannot tell you how much better you learn data science and machine learning by practicing rather than reading and listening.
Probably the best competition to start with is the Titanic competition (a classification problem), Titanic: Machine Learning from Disaster. You predict survival on the Titanic and get familiar with ML basics.
I also recommend participating in a machine learning regression competition. My suggestion is the House Prices competition, House Prices: Advanced Regression Techniques. You predict sale prices and practice feature engineering, random forests (RFs), and gradient boosting.
One more piece of advice on Kaggle: please look at other people’s kernels. Many people share their code (kernels) on Kaggle, and they are good examples of how to approach a problem and write code. I learned a lot from running other people’s code and trying to understand it.
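Both kinds of competition follow the same split, fit, predict, and score pattern. Below is a minimal sketch using scikit-learn with random forests. Note my assumptions: the data is synthetic (generated with make_classification and make_regression) and only stands in for the real Kaggle files, and the model settings are arbitrary starting points, not tuned values.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.metrics import accuracy_score, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a Titanic-style problem: predict a 0/1 label.
X_clf, y_clf = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X_clf, y_clf, test_size=0.25, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = accuracy_score(y_test, clf.predict(X_test))

# Synthetic stand-in for a House Prices-style problem: predict a number.
X_reg, y_reg = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    X_reg, y_reg, test_size=0.25, random_state=0
)
reg = RandomForestRegressor(n_estimators=100, random_state=0)
reg.fit(Xr_train, yr_train)
# Root mean squared error on the held-out rows.
rmse = mean_squared_error(yr_test, reg.predict(Xr_test)) ** 0.5

print(f"classification accuracy: {accuracy:.2f}")
print(f"regression RMSE: {rmse:.1f}")
```

In a real competition, most of your effort goes into the step this sketch skips: cleaning the data and engineering features before the model ever sees them.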
Also, I think you need to read a few books that give you some business ideas about data science applications in the real world. My number one recommendation is this wonderful book by Provost and Fawcett.
Step 4, Go advanced and learn Deep Learning
So far, you have learned data analytics and machine learning (ML). Those are interesting topics in artificial intelligence (AI), but not as exciting as deep learning (DL). To learn DL, I highly recommend taking this Deep Learning Specialization by Andrew Ng (one of the best instructors in the field of DL).
In my opinion, this Coursera Specialization should be followed by this wonderful book.
In this book, François Chollet shows you how to use Keras (the easiest deep learning platform, in my opinion) to build DL networks. In recent years, I have seen a growing number of people learn PyTorch instead of Keras and TensorFlow for deep learning. I don’t have much experience with that library, and I would be happy to get your suggestions on the best way to learn it.
As with machine learning, I recommend participating in a few Kaggle competitions related to deep learning (especially computer vision or convolutional DL) and learning from professional Kagglers.
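To show why I call Keras easy, here is a minimal sketch of a small network for binary classification. This is my own toy example, not one from the book: the data is randomly generated, the layer sizes and number of epochs are arbitrary, and it assumes the keras package (with a backend such as TensorFlow) is installed.

```python
import numpy as np
import keras
from keras import layers

# Made-up data: 200 samples with 4 features.
# The label is 1 when the features sum to more than zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

# A small fully connected network with one hidden layer.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    layers.Dense(8, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Train briefly; verbose=0 keeps the output quiet.
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

# Predicted probabilities, one per sample.
probabilities = model.predict(X, verbose=0)
print(probabilities.shape)
```

Define the layers, compile, fit, predict: that is the whole workflow, and larger networks follow the same four steps.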
Step 5, Super advanced topics
If you want to go further, the next topic is Reinforcement Learning (RL). Many data scientists and AI professionals don’t go further than DL. But if you want to do super exciting things like Google DeepMind does, the next stop is RL. There are not many good resources for this topic, but two good references are the following books.
Some other suggestions
There are a few other suggestions that I would like to make to my engineer friends.
First, look for a hackathon in your industry or area of expertise. Hackathons are small, short competitions in which people get together, form teams, and solve a problem. They expose you to problems that are important in your field.
Second, try to test what you learned on data related to your expertise. I suggest this because you learn the methods and get familiar with your specific data challenges at the same time. However, please, please, please do not focus only on your own expertise; sometimes let yourself try something in other fields. For example, my field of expertise is petroleum and geosciences, but I learned very interesting ideas from genomics.
Third, at some point, even with your strong math background, you will need to review some algebra or calculus to refresh your memory. By then, you will have a better idea of what you should review. For example, while learning DL, I felt I needed to refresh my memory on some aspects of the gradient descent method. I picked up a book and read that specific topic in less than an hour. If I had started deep learning with a heavy mathematics introduction, I would have lost my motivation.
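For reference, the core of gradient descent fits in a few lines. This is a minimal sketch on a toy one-dimensional function of my own choosing, f(x) = (x - 3)^2; the learning rate and number of steps are arbitrary assumptions, and real DL libraries apply the same idea to millions of parameters.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


def grad_f(x):
    """Derivative of the toy objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)


# Start far from the minimum and let the updates pull x toward 3.
x_min = gradient_descent(grad_f, x0=0.0)
print(x_min)
```

With this learning rate, each step shrinks the distance to the minimum at x = 3 by a factor of 0.8, so the result lands very close to 3 after 100 steps.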
Your HOV lane experience?
This is my HOV lane and is based on my experience. Different people might have their own secret HOV lanes for learning data science. Please tell me about your experience and help me to expand this post.