Automated Machine Learning Is Coming… And It Won’t Matter

Tommy Blanchard
December 11, 2020 AI & Machine Learning

Recently, I’ve been seeing a lot of services and products advertising automation of machine learning. DataRobot and H2O.ai offer platforms that let you build machine learning models through point-and-click interfaces. They’ll even do the feature engineering for you! This functionality, or something like it, is slowly being built into various tools and programs. They promise to automate the whole machine learning pipeline, from feature transformations and hyperparameter tuning to model selection. There are open-source tools that do much the same thing (like TPOT, a cool module I love the idea of but can never get to actually work on a data set that isn’t trivially small).

Right now, these tools mostly aren’t great and/or are absurdly expensive (for the cost of a subscription to DataRobot, you can employ a full-time data scientist). But I have no doubt that tools will soon exist that completely take care of the model/hyperparameter/feature-transformation process.

I’ve had people ask me if I’m worried about my job security as a data scientist. No, I am not. I can’t wait until these tools exist and are open source, so I can type “import machinelearn”, let it do the stupid hyperparameter optimization, and get on with the hard part of the job.

When I get data to the point where it could conceivably be ingested by one of these tools, the problem is basically done. At that point, all I need is a bit of code to run a grid search, find a reasonably decent model, and tune its hyperparameters. Hell, if I just ran XGBoost with the default parameters at this point, it would usually be almost as good as I am ever going to get it anyway. Doing the extra work of tuning things a bit more is only worth it because it’s relatively easy, and you very quickly hit diminishing returns (unless you’re in a Kaggle competition, where even diminished returns might take you from 10th place to 1st, so you milk every tiny incremental increase in accuracy you can).
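To make the "bit of code" concrete, here is a minimal sketch of a grid search with cross-validation, compared against a default setting. It is only an illustration under stated assumptions: the data is synthetic, the model is a toy k-nearest-neighbours classifier standing in for something like XGBoost, and every name in it (`make_data`, `cv_accuracy`, `grid_search`) is hypothetical rather than taken from any library.

```python
import itertools
import random
import statistics


def make_data(n=200, seed=0):
    """Synthetic two-class data: two noisy 2-D blobs (purely illustrative)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        center = 1.0 if label else -1.0
        rows.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    rng.shuffle(rows)
    return rows


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn_predict(train, x, k, weighted):
    """Classify x by a vote of its k nearest neighbours (optionally distance-weighted)."""
    nearest = sorted(train, key=lambda row: distance(row[0], x))[:k]
    votes = {0: 0.0, 1: 0.0}
    for features, label in nearest:
        votes[label] += (1.0 / (distance(features, x) + 1e-9)) if weighted else 1.0
    return max(votes, key=votes.get)


def cv_accuracy(rows, k, weighted, folds=5):
    """Mean accuracy over `folds` cross-validation splits."""
    scores = []
    for f in range(folds):
        test = rows[f::folds]
        train = [r for i, r in enumerate(rows) if i % folds != f]
        correct = sum(knn_predict(train, x, k, weighted) == y for x, y in test)
        scores.append(correct / len(test))
    return statistics.mean(scores)


def grid_search(rows, grid):
    """Score every hyperparameter combination and return the best one."""
    results = {}
    for k, weighted in itertools.product(grid["k"], grid["weighted"]):
        results[(k, weighted)] = cv_accuracy(rows, k, weighted)
    best = max(results, key=results.get)
    return best, results


rows = make_data()
default_score = cv_accuracy(rows, k=5, weighted=False)  # the "just use defaults" baseline
best_params, results = grid_search(
    rows, {"k": [1, 3, 5, 9, 15], "weighted": [False, True]}
)
best_score = results[best_params]
print(f"default: {default_score:.3f}  tuned: {best_score:.3f}  params: {best_params}")
```

Because the default setting is itself one point on the grid, the tuned score can never be lower than the default one. How much higher it gets depends on the data, and on a problem this easy the gap tends to be small.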

Once you have your data in a format where you could make a Kaggle competition out of it, you’ve done the hard part. I would love it if at that point I could just run a single function that did a well-optimized search, way more thorough than my typical grid searches, and also explored some different feature transformations. Maybe my models would do marginally better, and I would save myself a few minutes writing the code. It would be nice. But if it would put you out of a job, maybe you should be seriously thinking about what skills you bring to the table.

In most data science positions I’ve heard of, the hard part isn’t building a model once the problem has been framed, the data collected, the samples chosen, and everything put in a neat one-row-per-sample format. The hard part is getting to that point. While I don’t doubt some of these steps will be made simpler as tools evolve, I can’t see the whole process being easily automated anytime in the near future. Translating a business problem into a prediction problem is hard and requires a lot of business knowledge coupled with abstract, quantitative thinking. Figuring out what data to use and how to get it is hard: businesses evolve and the data infrastructure isn’t always clean, so there aren’t ready-made solutions here. Choosing an unbiased sample set for training can be extremely difficult, and there isn’t a cookie-cutter solution for it either. Most often, some structure needs to be imposed on the data based on knowledge of the particulars of the problem.
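As a small illustration of what "imposing structure" can look like, the sketch below turns a raw purchase log into a one-row-per-customer table. Everything here is an assumption made up for the example: the event schema, the cutoff date, the feature names, and the choice of "bought again after the cutoff" as the label. The decisions it encodes (what counts as a sample, which dates the features may see, what the target is) are exactly the ones that depend on knowing the business.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw event log: (customer_id, purchase_date, amount)
events = [
    ("a", date(2020, 1, 5), 20.0),
    ("a", date(2020, 2, 10), 35.0),
    ("a", date(2020, 4, 1), 10.0),
    ("b", date(2020, 1, 20), 50.0),
    ("c", date(2020, 3, 15), 5.0),
    ("c", date(2020, 3, 28), 7.5),
]

# Features may only use events on or before the cutoff; the label uses events
# after it, so no information from the future leaks into the features.
CUTOFF = date(2020, 3, 31)


def build_samples(events, cutoff):
    """Aggregate an event log into one row per customer."""
    history = defaultdict(list)
    future_buyers = set()
    for customer, day, amount in events:
        if day <= cutoff:
            history[customer].append((day, amount))
        else:
            future_buyers.add(customer)

    samples = []
    # Customers with no purchases before the cutoff get no row at all,
    # which is itself a sampling decision someone has to make.
    for customer in sorted(history):
        purchases = history[customer]
        last_day = max(day for day, _ in purchases)
        samples.append({
            "customer_id": customer,
            "n_purchases": len(purchases),
            "total_spent": sum(amount for _, amount in purchases),
            "days_since_last": (cutoff - last_day).days,
            "bought_after_cutoff": customer in future_buyers,  # the label
        })
    return samples


samples = build_samples(events, CUTOFF)
for row in samples:
    print(row)
```

Only once a table like this exists does the problem look like something an automated tool could take over.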

I have no doubt that in the next few years, we’ll have some nice tools for automating the building of a machine learning pipeline. Hopefully once that problem is rendered trivial, fewer aspiring data scientists will try to prove their skills by showing off how accurate their model is on the Iris data set. I don’t see much impact on the field beyond that.
