AutoML is not a threat to Data Scientists
In recent years, many automated machine learning tools have been introduced. They automate some of the tasks that a Data Scientist usually has to perform manually, and they have reached a remarkable level of sophistication and effectiveness. Are they a threat to the Data Scientist's job, or are they an opportunity?
What is AutoML?
AutoML is a generic term for software that performs Machine Learning tasks automatically. These tools usually automate the entire pipeline: for example, data cleaning, encoding, feature selection, model selection, and hyperparameter tuning. They can be Python libraries like Auto-Sklearn or standalone products like DataRobot.
Is AutoML useful to Data Scientists?
Yes, I think it's very useful, because it automates the boring tasks that usually require a lot of code and are therefore error-prone. Without AutoML, a Data Scientist must build their own ML pipeline from scratch. Every ML model has its own requirements (e.g. neural networks need scaled features), so the complete set of pipelines to test can become quite complex and time-consuming to write. An AutoML tool lets a Data Scientist build a good ML model easily, without worrying too much about the code. Remember: a Data Scientist is not a software engineer, so they should write as little code as possible in order to focus on data and information.
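To give an idea of the manual work involved, here is a minimal sketch of comparing just two models by hand with scikit-learn. The synthetic dataset, the two candidate models and their settings are assumptions chosen for illustration, not a recommended setup. Note that the neural network needs a scaling step while the random forest does not, so each model already needs its own pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# One hand-written pipeline per model, each with the preprocessing it needs.
candidates = {
    # Neural networks are sensitive to feature scale, so scale first.
    "neural_network": Pipeline([
        ("scale", StandardScaler()),
        ("model", MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                random_state=0)),
    ]),
    # Tree ensembles do not need scaling, so the pipeline is just the model.
    "random_forest": Pipeline([
        ("model", RandomForestClassifier(n_estimators=50, random_state=0)),
    ]),
}

# Score every candidate with 3-fold cross-validation (mean accuracy).
scores = {
    name: cross_val_score(pipeline, X, y, cv=3).mean()
    for name, pipeline in candidates.items()
}
best_name = max(scores, key=scores.get)
print(scores, "best:", best_name)
```

This covers only two models, with no data cleaning, encoding, feature selection or hyperparameter tuning. Each of those steps multiplies the number of pipelines to write and test, and that is the part an AutoML tool takes over.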
I think that Data Scientists must keep up with change and innovation, so AutoML can become a very useful ally if they learn to use it properly. By automating the boring tasks, they will have more time to spend analyzing information, which is the real goal of a Data Scientist.