Ultimate Q&A for Aspiring Data Scientists with Serious Guides
So…you want to become a data scientist? You’re a self-motivated person who is very passionate about data science and about bringing value to companies by solving complex problems. But you have ZERO experience in data science and no clue how to get started in this field. That’s why this post is dedicated to you, the enthusiastic and aspiring data scientist: to answer the most common questions and challenges that most people face.
If data is ‘the new oil’, then the data scientist functions much like an oil refinery, converting data into insights that can both save money and generate capital — Eva Short
None of the questions below were created by me; they came from you, the vibrant data science community. Please note that the questions are in no particular order, so feel free to skip to whichever part you find useful.
I hope that sharing my experience in this post will shed light on how to pursue a data science career and give you some general guidance to make your learning journey more enjoyable. Let’s get started!
What Is The Current Trend in Data Science Skills Gap?
The International Data Corporation (IDC) predicts that worldwide revenues for big data and business analytics will reach more than $210bn in 2020.
According to the LinkedIn Workforce Report for the United States in August 2018, there was a national surplus of people with data science skills in 2015. Three years later, the trend had reversed: more companies are facing shortages of people with data science skills as big data is increasingly used to generate insights and make decisions.
Economically speaking, it is all about SUPPLY and DEMAND.
The good news is: the “tables” have now turned. The bad news is: even with rising job opportunities in data science, a lot of aspiring data scientists still struggle to get their foot in the door because of the gap between their data science skills and the requirements of the current job market.
In the coming section, you’ll see how to improve your data science skills to close the “gap”, stand out among a pool of other candidates and eventually increase your chances of landing your dream job.
Questions & Answers
1. What are the skill sets required and how to cover them?
I’ll be very honest with you. To learn ALL the skill sets in data science is next to impossible as the scope is way too wide. There’ll always be some skills (technical/non-technical) that data scientists don’t know or haven’t learned, as different businesses require different skill sets.
In general — in my opinion based on my experience and learning from other data scientists — there are some CORE skill sets that must be learned to become a data scientist.
Technical skills. Math and statistics, programming, and business knowledge. However proficient we are in programming, whatever the language, we as data scientists should be able to explain our model results to stakeholders in the language of the business, supported by math and statistics.
To learn math and statistics (or for more comprehensive data science resources), check out this website created by Randy Lao. Randy has been helping aspiring data scientists and his repository on the website is truly a gold mine!
I still remember when I first started out in data science I read this textbook — An Introduction to Statistical Learning — with Applications in R. I highly recommend this textbook for beginners as the book focuses on the fundamental concepts of statistical modelling and machine learning with detailed and intuitive explanations. If you are a mathematically hardcore person, perhaps you would prefer this book: The Elements of Statistical Learning.
To learn programming skills, especially for beginners without prior experience, I’d suggest focusing on one language (personally I prefer Python, since it is easier to learn and its concepts carry over to other languages if needed). Whether to use Python or R has long been a subject of debate in data science. Personally, I think the focus should be on how you can help businesses solve problems, regardless of the language used.
Finally, I can’t stress enough that the understanding of business knowledge is extremely crucial.
Soft skills. In fact, soft skills are more important than hard skills. Surprised? I hope not.
LinkedIn surveyed 2,000 business leaders and the soft skills that they’d most like to see their employees have in 2018 are: Leadership, Communication, Collaboration and Time Management. I truly believe these soft skills play an essential part in data scientists’ day-to-day work. In particular, I learned the importance of communication skills the hard way, which you can read about here.
2. How to choose the right bootcamps and online courses when there are plenty of them out there?
With the hype surrounding AI and data science and many people jumping on the bandwagon, MOOCs, bootcamps, online courses and workshops (free and paid) are mushrooming, all hoping not to “miss the boat”.
There are many resources out there. Be resourceful.
So the question is: How to choose the learning materials that are suitable for you?
My approach to filtering and selecting the right online courses/workshops for me:
- Understand that there’s no single best course that can cover all the materials you need. Some courses overlap in some areas, and it’s not worth the bucks to purchase several courses that repeat most of the same teaching materials.
- Know what you need to learn in the very first place. NEVER dive into a course simply because of a fancy and catchy title. Remember the technical skills mentioned earlier? Check out job descriptions of data scientists online and you’ll notice there are some common skills required by companies out there. Now you know the skills needed and the skills that you’re lacking. Fantastic. Go search for courses that can help you improve your knowledge (theoretical and hands-on).
- Research online on the best courses offered by different platforms. Once you’ve shortlisted a few courses that suit your needs, check out their respective reviews (very important!) by others before you pull out your wallet and enroll. On the other hand, there are also many FREE courses available on many platforms and on YouTube.
- TIP: Some platforms might offer financial aid to subsidize your course fees (Coursera etc.). Give it a try!
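If you like to be systematic about the second point, the comparison between what job descriptions ask for and what you already know can be roughed out in a few lines of Python. This is only a minimal sketch: the job description snippets, the skill list and the `my_skills` set are all made-up placeholders, so swap in text you have collected yourself and your own honest self-assessment.

```python
import re
from collections import Counter

# Hypothetical job-description snippets; replace with postings you collect yourself.
job_descriptions = [
    "Strong Python and SQL skills; experience with statistics and machine learning.",
    "Proficiency in R or Python, SQL, and data visualization; good communication.",
    "Machine learning, Python, statistics, and communication skills required.",
]

# Hypothetical skill vocabulary and self-assessment (assumptions for illustration).
skills = ["python", "r", "sql", "statistics", "machine learning",
          "data visualization", "communication"]
my_skills = {"python", "statistics"}


def count_skill_mentions(descriptions, skills):
    """Count in how many descriptions each skill appears at least once."""
    counts = Counter()
    for text in descriptions:
        lowered = text.lower()
        for skill in skills:
            # Word boundaries stop a short skill like "r" matching inside other words.
            if re.search(r"\b" + re.escape(skill) + r"\b", lowered):
                counts[skill] += 1
    return counts


def skill_gap(counts, my_skills):
    """Return (skill, mentions) pairs for skills you lack, most requested first."""
    return [(skill, n) for skill, n in counts.most_common() if skill not in my_skills]


counts = count_skill_mentions(job_descriptions, skills)
gap = skill_gap(counts, my_skills)

for skill, n in gap:
    print(f"{skill}: mentioned in {n} of {len(job_descriptions)} postings")
```

The skills at the top of the printed list are the ones to look for when shortlisting courses. Simple keyword matching will miss synonyms ("ML", "stats"), so treat the counts as a rough guide rather than a precise measurement.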
3. Is learning from open source sufficient to become a data scientist?
I’d say that learning from open-source resources is sufficient to get you started in data science. Anything beyond that is about developing your career further as a data scientist, which again depends on business needs.
4. Should a beginner (from a totally different background) start with reading materials to understand the basics? What book would you suggest?
There’s no fixed path in learning, as all roads lead to Rome. Reading is definitely a great way to start understanding the fundamentals, and it is how I started as well!
Just be careful not to try to read and memorize the nitty-gritty of the maths and algorithms. Chances are you’ll forget everything unless you apply the concepts to real problems in code.
Just know and understand enough to get yourself started and move on to the next step. Be practical. Don’t try to be perfect in knowing everything: perfectionism is, without our realizing it, one of the best excuses for procrastinating and not moving forward.
Below are some of the books that I’d suggest to understand the basics of Python, machine learning and deep learning (Hope it helps!):
- Learning Python
- Python for Data Analysis
- An Introduction to Statistical Learning
- Machine Learning for Absolute Beginners
- Python Machine Learning
- Python Data Science Handbook
- Introduction to Machine Learning with Python
- Deep Learning with Python
- Deep Learning with Keras
5. How to balance between understanding business problems (formulating solutions) and developing technical skills (coding, core math knowledge etc.)?
I started off by developing my technical skills before going into understanding business problems and formulating solutions.
Business problems give you the WHAT and WHY. To solve a business problem, one first has to know how to solve it, and the HOW comes from technical skills. Again, the approach depends on the situation, and my suggestion is mainly based on personal experience.
6. How can we overcome the challenges of starting a career as a data scientist?
One of the major challenges faced by many aspiring data scientists (including me) is that data science is an ocean of information. We could easily lose our focus by getting overwhelmed with all the advice and resources (online courses, workshops, webinars, meetups, you name it…) that come from different directions. Stay focused. Know what you have and what you need, and go ALL IN.
Throughout my data science journey the challenges have been countless, but they are also what shaped who I am today. I’ll try my best to explain the main challenges I faced and how I overcame them:
- I was confused by so many resources when I first started out. I filtered out the noise the hard way: listening to podcasts and watching webinars given by data scientists, reading plenty of data science articles on how to pursue a career in this field, experimenting with various online courses, and engaging with and learning from the data science community on LinkedIn. Ultimately, I focused only on the helpful resources that I’ve shared in this post.
- There was a point when I almost gave up. The thought of giving up crossed my mind when the learning curve was too steep and I started doubting myself. Am I really capable of doing this? Am I really pursuing the right path? Passion and patience brought me back and kept me on my path. Keep grinding and keep hustling day in, day out.
Your work is going to fill a large part of your life, and the only way to be truly satisfied is to do what you believe is great work. And the only way to do great work is to love what you do. — Steve Jobs
- Getting a job as a data scientist (or a similar job scope under a different title). I wish I had read these articles by Favio Vázquez earlier: How to get a job as a Data Scientist? and The two sides of Getting a Job as a Data Scientist. Getting a job was no easy task for me because the job market is so competitive. I submitted tons of resumes but to no avail. After thinking deeply about it, I realized something must be wrong with my approach. I revamped it and started networking: attending meetups and seminars, sharing my learning experience online, approaching prospective employers at career fairs and sharing sessions in a more systematic way, following up after submitting my resume, and so on. Things started to change and opportunities started to knock on my door.
7. How to put my work experience in my resume so that I will be hired and my experience will be counted?
I believe there is a misconception here: you will not be hired solely based on the experience in your resume. Rather, your resume is one of the ways to get the entrance ticket to the next stage of your application: the interview.
Therefore, learning how to present your work experience in your resume is truly important for getting that entrance ticket. Studies have shown that the average recruiter scans a resume for six seconds before deciding if the applicant is a good fit for the role. In other words, to pass the resume test, your resume only has six seconds to make the right impression on a prospective employer. Personally, I referred to the following resources to polish my resume:
- Vault
- TopResume
- Optimize Guide (Personal preference!)
- A Resume Expert Gives Career Advice
- How to Pass the 6-Second Resume Test
- How to tailor your Academic CV for Data Science roles
- What do Hiring Managers Look For in a Data Scientist’s CV?
- The 14 Things You Need On Your Resume To Land Your Dream Job
8. What kind of portfolio can help us to get a first job in data science or machine learning?
This is where building a portfolio becomes important. A well-polished resume alone is not enough to get you an interview without a good portfolio.
After the first glance at your resume, prospective employers want to understand more about your background, and this is where your portfolio comes in. If you’re wondering how to build a portfolio from scratch, start by documenting your learning journey. Share your learning experience, mistakes and takeaways (technical or non-technical) through social media platforms (LinkedIn, Medium, Facebook, Instagram, a personal blog; it doesn’t matter which).
Interested in talking in front of a camera? Then start by making videos (interviews with other aspiring or well-established data scientists) and share them on YouTube. Good at writing? Then start writing about the topics that you’re passionate about on different platforms. If you are not into visuals or writing, then reach out to others and record podcasts with them.
My point here is: the Internet offers seriously abundant opportunities to build your portfolio and gain traction, and potentially the attention of your prospective employers.
Try to engage with the data science community on LinkedIn and document what you learn. I learned the most on LinkedIn, from the close-knit data science community in such a conducive sharing-learning environment. Gradually, I learned (still learning!) how to build my portfolio on LinkedIn with my experiences from different sources.
— More Resources —
I’ll just list some of the useful resources:
- Towards Data Science, Quora, DZone, KDnuggets, Analytics Vidhya, DataTau, fast.ai
- Webinars — Data Science Office Hours, Data Science Connect, Humans of Data Science (HoDS)
- Storytelling with Data: A Data Visualization Guide for Business Professionals
- A Badass’s Guide to Breaking Into Data
- 10 Must Have Data Science Skills
- My Data Science & Machine Learning, Beginner’s Learning Path
- Machine Learning Mastery
- 24 Ultimate Data Science Projects To Boost Your Knowledge and Skills
Follow Inspiring Data Scientists and Professionals
The data science community on LinkedIn is awesome and I strongly encourage you to follow the inspiring data scientists and professionals mentioned below:
- Randy Lao
- Kyle McKiou
- Favio Vázquez
- Vin Vashishta
- Eric Weber
- Sarah Nooravi
- Kate Strachnyi
- Tarry Singh
- Karthikeyan P.T.R.
- Megan Silvey
- Imaad Mohamed Khan
- Andreas Kretz
- Andriy Burkov
- Carla Gentry
- Nic Ryan
- Beau Walker
Final Thoughts
Don’t let anyone rush you with their timelines
It’s time to take massive action towards achieving your goals as an aspiring data scientist. No action is too small to make a difference. Just move forward one step at a time. When you’re on the verge of giving up, PERSISTENCE is key.