Programming AutoML in Python with AutoKeras
Automated Machine Learning, commonly abbreviated as AutoML, automates the design of neural network architectures. Through intelligent architecture search, AutoML can not only make deep learning more accessible to everyone but also accelerate deep learning research.
In this article, we’ll go over:
- How to install AutoKeras for neural architecture searches.
- How to use AutoKeras to find the best neural architectures using structured, image, and text data for regression and classification tasks.
- How to evaluate, predict, export to Keras/TensorFlow, and view the architecture of the high-performing models the search produces.
A Glimpse into AutoKeras
- With AutoKeras, a neural architecture search algorithm finds the best architecture for your problem: the number of neurons in a layer, the number of layers, which layers to incorporate, layer-specific parameters like filter size or the percentage of dropped neurons in Dropout, and so on. Once the search is complete, you can use the model as a normal TensorFlow/Keras model.
- By using AutoKeras, you can build a model with complex elements like embeddings and spatial reductions that would otherwise be less accessible to those who are still in the process of learning DL.
- When AutoKeras creates models for you, much of the preprocessing, like vectorizing or cleaning text data, is done and optimized for you.
- It takes two lines to initiate and train a search. AutoKeras boasts a Keras-like interface, so it's easy to remember and use.
Excited yet? Let’s get started!
Installation
Before you install, ensure you have the following prerequisites:
- Python 3 (AutoKeras does not work with Python 2)
- TensorFlow ≥ 2.3.0 (AutoKeras is based on TensorFlow)
In your command line, run the two commands below to install AutoKeras properly. Note that these installations, unfortunately, do not work in Kaggle notebooks, so it's best to run them in a local environment.
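The exact commands from the original post aren't preserved here; assuming a standard pip-based setup, the usual pair is:

```
pip install tensorflow
pip install autokeras
```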
As a test, run `import autokeras` in your coding environment to ensure everything is working.
Structured Data Classification/Regression Tasks
Let's use the famous iris dataset, which we can import from `sklearn`'s several handy toy datasets. The result of the data import (`data`) is a dictionary-like object with two keys of interest, `'data'` and `'target'`.
However, since the target is categorical (three values), we'll need to dummy-encode it. Since the result is a pandas DataFrame and a NumPy array is desired, we call `.values` after `pd.get_dummies` (which one-hot encodes).
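A minimal sketch of this step (the variable names `X` and `y` are my assumption, chosen to match the shapes reported below):

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the iris dataset; the result behaves like a dictionary
data = load_iris()

X = data['data']                            # features, shape (150, 4)
y = pd.get_dummies(data['target']).values   # one-hot targets, shape (150, 3)
```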
Calling `X.shape` yields (150, 4) and calling `y.shape` yields (150, 3). This makes sense: there are 150 rows, 4 columns in `X`, and 3 unique categories in `y` (hence three columns). In order to evaluate our model, we will split the data into training and testing sets.
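Assuming scikit-learn's `train_test_split` with a 30% test split, which reproduces the shapes below:

```python
from sklearn.model_selection import train_test_split

# 70/30 split: 105 training rows and 45 testing rows
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
```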
It's always good practice to check the shape of arrays:
- `X_train`: (105, 4)
- `X_test`: (45, 4)
- `y_train`: (105, 3)
- `y_test`: (45, 3)
Great! Everything adds up. We can begin working with `autokeras`.
Importing `autokeras` should be no problem. A `StructuredDataClassifier` is a search object that works on 'structured data', or standard two-dimensional data with columns and a label. The `max_trials` parameter indicates the maximum number of models to test; this can be set higher in the case of the iris dataset since it is so small. Often the search will end before `max_trials` is met (it serves as an upper bound).
For regression problems, use `StructuredDataRegressor`.
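A sketch of creating the search object (15 trials matches the run described later; the count is otherwise up to you):

```python
import autokeras as ak

# Search over architectures for tabular data; test at most 15 models
search = ak.StructuredDataClassifier(max_trials=15)
```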
We can initiate the search process by calling `.fit()`. `verbose` is a parameter that can be set to 0 or 1, depending on whether you would like the model to output information about training, like the architecture of potential networks and epoch-by-epoch progress. However, there are hundreds of epochs for each potential architecture, so the output may take up space and slow down training.
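For example (`verbose` is passed through to the underlying Keras training loop):

```python
# Launch the neural architecture search on the training data
search.fit(X_train, y_train, verbose=1)
```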
It's always worth setting `verbose=1` at least once to peek into what is going on behind the scenes, since the entire search takes at least several minutes to complete.
If you're up to it, it's fascinating to watch the architectures transform over time and to see which elements end up in the final neural network.
On my local environment, it took ~1 hour in total with no GPU acceleration to run through 15 trials.
Once your model has been fit, we can evaluate its performance or obtain predictions. Generally, this is a quick process, so `verbose` can be set to 0.
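For instance (assuming the classifier's default loss and accuracy metric):

```python
# Evaluate the best model found on the held-out test set
loss, accuracy = search.evaluate(X_test, y_test, verbose=0)

# Generate predictions for unseen data
predictions = search.predict(X_test)
```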
To view more details about the model structure and to save it, we need to export it using `model = search.export_model()`. Here, `model` is a standard TensorFlow/Keras model, not an AutoKeras object.
Calling `model.summary()` will print out the best architecture obtained by the neural architecture search.
To save the weights, call `model.save('filepath/weights.h5')`.
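Putting the export steps together (the file path is illustrative):

```python
# Export the best model as a standard TensorFlow/Keras model
model = search.export_model()

# Print the winning architecture
model.summary()

# Save the model to disk
model.save('weights.h5')
```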
Image Classification/Regression
Image classification and regression work much like classification and regression on standard structured data, but the neural network is built using elements of convolutional neural networks, like convolutional layers, max pooling, and flattening. All of these architectural parameters are learned during the search.
For example, let's work on the MNIST dataset to find a convolutional neural network that recognizes digits from 28-by-28 images. Keras datasets provide a handy method to retrieve this data, but the `y`-variables need to be converted into dummy variables.
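A sketch of the loading step (here I use `to_categorical` for the one-hot encoding; `pd.get_dummies` works equally well):

```python
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load MNIST: 60,000 training and 10,000 testing 28x28 grayscale images
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# One-hot encode the ten digit classes
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
```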
Let’s check the shape of the data.
- `X_train`: (60000, 28, 28)
- `X_test`: (10000, 28, 28)
- `y_train`: (60000, 10)
- `y_test`: (10000, 10)
Great! AutoKeras can also handle four-dimensional data (colored images with multiple channels). We can create a search object with `ImageClassifier` (or `ImageRegressor` for regression tasks).
*If you are running multiple searches, add the parameter `overwrite=True` when defining the search object (`ak.object(overwrite=True, max_trials=x)`). Otherwise, there will be an error.
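A sketch of the image search, mirroring the structured-data workflow (the trial count is illustrative):

```python
import autokeras as ak

# Search over convolutional architectures; overwrite=True avoids
# clashing with the state of earlier searches
search = ak.ImageClassifier(overwrite=True, max_trials=3)
search.fit(X_train, y_train, verbose=1)
```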
Evaluation, prediction, and exporting are performed the same as with structured classifiers and regressors. Note that a neural architecture search with images will take a substantially longer period of time.
Text Classification/Regression
AutoKeras doesn't require any text vectorization, which is handy because there are so many ways to do it and each can be lengthy to implement. Instead, vectorization is learned as part of the model.
Let’s use the 20 News Groups dataset, which consists of several news excerpts belonging to 20 different categories. To save time and computing expenses, we’ll only choose to import four categories and have a rather large test-size split (half the dataset).
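A sketch of the data preparation. The four categories below are my assumption (the familiar scikit-learn tutorial subset), chosen because they reproduce the shapes reported next; any four categories would work:

```python
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_20newsgroups
from sklearn.model_selection import train_test_split

# Four of the twenty categories (an assumed, illustrative choice)
categories = ['alt.atheism', 'comp.graphics',
              'sci.med', 'soc.religion.christian']
news = fetch_20newsgroups(subset='train', categories=categories)

X = np.array(news.data)                  # raw text strings
y = pd.get_dummies(news.target).values   # one-hot labels, 4 columns

# Half of the dataset is held out for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)
```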
`X_train` and `X_test` are each arrays of 1128 raw text strings; `y_train` and `y_test` are each arrays of shape (1128, 4).
Note that AutoKeras can handle strings of variable lengths! Each string in the input does not need to be the same length.
Next, let’s build our text classifier/regressor.
*If you are running multiple searches, add the parameter `overwrite=True` when defining the search object (`ak.object(overwrite=True, max_trials=x)`). Otherwise, there will be an error.
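As before, a minimal sketch (the trial count is illustrative):

```python
import autokeras as ak

# Search over text architectures; vectorization is handled internally
search = ak.TextClassifier(overwrite=True, max_trials=2)
search.fit(X_train, y_train, verbose=1)
```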
Evaluation, prediction, and exporting are performed the same as with structured classifiers and regressors.
It's worth taking a look at the proposed networks. The fusion between NLP and deep learning can be extraordinarily advanced, and it's amazing that complex elements like embeddings, spatial reductions, etc. can be implemented without needing to be explicitly called.
With AutoKeras, deep learning becomes more accessible for all!
Source: Towards Data Science.