You’ve arrived here because your goal is to get your first job as a data scientist. Currently, there are more data science jobs than there are people to fill them, so data scientists are in high demand.
Becoming a data scientist is not going to happen overnight, but there are some core skills and education that you will need to land that first data scientist job. Here are my thoughts on what you can do to land your first role as a data scientist or data analyst.
Core skills
It is vital that a data scientist have solid problem-solving skills. The types of problems you are going to solve can be complex, so you need a way to organize how you solve them.
Consider learning about Microsoft’s Team Data Science Process (TDSP), which is a data science methodology that can help teams deliver data science projects efficiently.
Data scientists often follow a process known as CRISP-DM, or the Cross-Industry Standard Process for Data Mining. It is a vendor-neutral process that helps guide a data science project. You should become familiar with the steps of these methodologies. For example, pre-processing and cleansing the data is imperative before you develop your data mining models and apply algorithms to your data.
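To make the data cleansing step concrete, here is a minimal sketch in plain Python. The student records, the field layout, and the choice to fill missing values with the column median are all assumptions made for illustration; a real project would pick a cleansing strategy that fits its own data.

```python
from statistics import median

# Hypothetical raw records: (student_id, gpa, absences).
# None marks a missing value.
raw_records = [
    (1, 3.2, 4),
    (2, None, 10),
    (2, None, 10),   # exact duplicate of the previous row
    (3, 2.1, None),
    (4, 3.8, 1),
]

def clean(records):
    """Drop exact duplicates, then fill missing values with the column median."""
    # dict.fromkeys keeps the first occurrence of each row, in original order.
    unique = list(dict.fromkeys(records))
    gpa_median = median(r[1] for r in unique if r[1] is not None)
    absences_median = median(r[2] for r in unique if r[2] is not None)
    return [
        (
            student_id,
            gpa if gpa is not None else gpa_median,
            absences if absences is not None else absences_median,
        )
        for student_id, gpa, absences in unique
    ]

cleaned = clean(raw_records)
for row in cleaned:
    print(row)
```

Median imputation is only one option; depending on the problem, you might instead drop incomplete rows or flag them for review.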
Data mining techniques are usually placed into categories like association rules, clustering, classification, regression, sentiment analysis, and so on. It’s very important to understand which data mining strategies and techniques can be used with which types of problems.
For example, if the overall goal is to determine whether a student is at risk of dropping out of school, I may decide that this is a classification problem, where I want to classify the student as either at-risk or not.
This sounds like a binary classification problem, where the outcome variable “at-risk” is either 1 or 0, true or false. Once you determine that the problem is a classification problem, you can then think about which algorithm(s) you will want to use, such as linear classifiers, support vector machines (SVM), decision trees, or random forests.
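As a rough illustration of the at-risk example, the sketch below fits a decision stump, which is a decision tree with a single split, in plain Python. The training data, the single "absences" feature, and the assumption that more absences means higher risk are all hypothetical; in practice you would more likely use a library implementation and many more features.

```python
# Hypothetical training data: (absences, at_risk), where at_risk is 1 or 0.
training = [(1, 0), (2, 0), (3, 0), (4, 0), (8, 1), (9, 1), (10, 1), (12, 1)]

def fit_stump(data):
    """Find the threshold t with the fewest errors for the rule: predict 1 if x > t."""
    values = sorted({x for x, _ in data})
    # Candidate thresholds are midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        # Count rows where the rule's prediction disagrees with the label.
        return sum((x > t) != bool(y) for x, y in data)

    return min(candidates, key=errors)

def predict(threshold, absences):
    """Classify a student as at-risk (1) or not (0)."""
    return 1 if absences > threshold else 0

threshold = fit_stump(training)
print("Learned threshold:", threshold)
print("Student with 11 absences:", predict(threshold, 11))
print("Student with 2 absences:", predict(threshold, 2))
```

A random forest builds on the same idea by combining many deeper trees, each trained on a different sample of the data.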
You should understand and be able to use statistical techniques, both descriptive and inferential, such as ANOVA, linear and multiple regression and their variants (e.g., logistic regression, hierarchical regression, etc.), and even factor analysis and principal component analysis. Data scientists are proficient in either Python or R, and sometimes both. Other tools, such as SAS and SPSS, may also be useful.
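To give a feel for one of the techniques listed above, here is a minimal sketch of simple linear regression fitted by ordinary least squares in plain Python. The data points and variable names are made up for illustration; in day-to-day work you would normally reach for a statistics library rather than code this by hand.

```python
# Hypothetical data: hours studied (x) and quiz score (y).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the sum of cross-deviations over the sum of squared x-deviations.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(hours, scores)
print(f"score = {intercept:.2f} + {slope:.2f} * hours")
```

Working through the arithmetic by hand once is a useful check that you understand what the library call is doing for you.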
Networking and Competitions
I highly recommend getting involved in a competition. The Kaggle.com community is an excellent place to network, examine data sets, contribute kernels (which are essentially R or Python code that analyzes a data set), and participate in competitions. Build a portfolio of projects that you have worked on in your own time (outside of work, most likely), so that you can demonstrate and talk about them during your interviews. In fact, keep track of them on LinkedIn so your network can also see the projects you’ve worked on. Projects you complete for online classes can also be used to demonstrate competency in the specific techniques you used. Experfy’s consulting marketplace also has multiple data science projects that could help build your portfolio.
Consider networking with other data scientists who are already working in the field, perhaps in the area that you are interested in. For example, there are LinkedIn groups that focus on specific areas, such as HR analytics (also known as People Analytics). Find a data science group or two to join. Many data scientists are willing to share their knowledge and experiences with others who are interested in this profession. You may want to consider finding a mentor who can help you focus on skill development and steer you in the right direction. Through networking with others, you can also work on your communication skills, especially when working with other teams on competitions. There are Slack groups, like Data Discourse by Experfy, for data science chat. Check out the Meetup.com website, which has an entire section of technology-related meetups, many of them related to big data, analytics, data science, and so on. Attend some meetups and conferences and use some of that time to network and talk with others!
Education and training
A data scientist never stops learning. Whether it is a new language, tool, algorithm, or approach, you should always be learning and exploring new techniques.
Typically, a bachelor’s degree is necessary for most entry-level jobs, and at least a Master’s, or even a Ph.D., may be necessary for many high-level data science jobs. Many data scientists have degrees in areas such as statistics, mathematics, computer science, IT, or something closely related. You do not have to have an advanced degree in one of these areas to get a data science job, but it does help. There are also low-cost online training courses available, such as Coursera’s Data Science certificate. Coursera’s Data Science courses, offered by Johns Hopkins University, are designed to help you build strong skills in R. You can find great video courses right here on Experfy in the areas of data science and machine learning. Udemy also has some great courses on machine learning, so check those out to help you develop your skills.
Practice, practice, practice. I cannot stress this one enough. Do not let your skills collect dust or go to waste since you will be spending a lot of time building those skills. You can also learn a lot from looking at other people’s code and learning from them.
Consider specializing in a specific area of data science and analytics. For example, I’ve focused a lot of my work on educational data mining (K-12 and higher education) as well as HR analytics. Data science will continue to grow, and there will be more demand for people who specialize in one area or another.
Trawl job boards and other data science resources
Keep searching for entry-level data scientist jobs in your area.
There are several “hot” areas in the country for data scientist jobs (in no particular order): New York, NY; Palo Alto, CA; Mountain View, CA; Redmond, WA; San Francisco, CA; Seattle, WA; Washington, DC; and Boston and Cambridge, MA, just to name a few.
Read the job descriptions of data scientist and analyst roles to get a better idea of the experience and skills needed. Try to look at the more entry-level data science and data analyst roles to give you a reasonable understanding of what to expect. Keep in mind that not all data scientist jobs are created equal or are looking for the same skills, technologies, or experience.
One of the more senior-level jobs I found while searching the job boards was for a data scientist in Boston, with a salary of $130k-$175k/year. The role required multiple years of industry data science experience (machine learning, deep learning, neural networks, NLP, etc.), in-depth knowledge of specific algorithms such as kNN, SVM, Naïve Bayes, and random forests, and an advanced understanding of statistics and probability. This particular role required a Master’s or Ph.D. in statistics, computer science, or physics, all heavily mathematics-based majors. Take a look at some of the more entry-level roles as well.
Keep in mind that you probably won’t make $130k-$175k fresh out of school; you would need years of experience for that. After 5 years of data science experience, you could conceivably be bringing in a six-figure salary.
Lastly, I highly recommend bookmarking the KDNuggets.com website, operated by Dr. Gregory Piatetsky-Shapiro, who is a recognized expert in data science. Read the KDNuggets site frequently for jobs, news, and more on data science and data mining.
I wish you the very best of success!