Explaining supervised learning to a kid (or your boss)


Now that you know what machine learning is, let’s meet the easiest kind. My goal here is to get humans of all stripes and (almost) all ages comfy with its basic jargon: instance, label, feature, model, algorithm, and supervised learning.

Instances

Behold: four instances!

Instances are also called ‘examples’ or ‘observations.’

Data table

What do these examples look like when we put them in a table? Sticking with convention (because good manners are good), each row is an instance.

Isn’t data pretty? But what exactly are we looking at? Let’s start with two special columns: a unique ID and, because we’re lucky this time around, a label for each instance.
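If you'd like to see the shape of such a table in code, here is a minimal Python sketch. Every number in it is invented for illustration (it is not the table from the photos), and the column names and the second label, "not blorkle", are my own assumptions.

```python
# Hypothetical data table: all values below are made up for illustration.
# Each dict is one row, and each row is one instance.
table = [
    {"id": 1, "label": "blorkle",     "pixel_1_red": 210, "pixel_1_green": 180, "pixel_1_blue": 140},
    {"id": 2, "label": "blorkle",     "pixel_1_red": 200, "pixel_1_green": 170, "pixel_1_blue": 150},
    {"id": 3, "label": "not blorkle", "pixel_1_red": 30,  "pixel_1_green": 40,  "pixel_1_blue": 60},
    {"id": 4, "label": "not blorkle", "pixel_1_red": 20,  "pixel_1_green": 50,  "pixel_1_blue": 70},
]

# The two special columns: a unique ID and a label for each instance.
ids = [row["id"] for row in table]
labels = [row["label"] for row in table]

# Everything that is neither the ID nor the label is an input column.
feature_names = [name for name in table[0] if name not in ("id", "label")]

for row in table:
    print(row["id"], row["label"], [row[name] for name in feature_names])
```

Four rows, four instances. A real photo would have many more pixel columns, but the layout stays the same.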

Labels

The label is the right answer. It’s what we’d like the computer to learn to output when we show it a photograph like this one, which is why some people prefer the terms ‘target’, ‘output’, or ‘response’.

Features

What’s in the other columns? Pixel colors. Unlike you, the computer looks at images as numbers, not pretty lights. What you’re seeing are the red-green-blue values for the pixels, starting in the top left corner of the image and working down from there. Don’t believe me? Try entering the values from my data table into this RGB color wheel and see what colors it gives you. Want to know how to get the pixel values from a photo? Look over my shoulder at my code here.
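To make "images as numbers" concrete, here is a small Python sketch that turns a tiny image into one row of numbers. It is not the code linked above. The 2x2 image and the function name `image_to_features` are hypothetical, and I'm assuming the pixels are read left to right, one row at a time, starting from the top left.

```python
# A hypothetical 2x2 image. Each pixel is a (red, green, blue) tuple
# with values from 0 to 255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # top row: a red pixel, then a green one
    [(0, 0, 255), (255, 255, 255)],  # bottom row: a blue pixel, then a white one
]

def image_to_features(image):
    """Flatten an image into one row of numbers, top-left pixel first."""
    features = []
    for pixel_row in image:
        for red, green, blue in pixel_row:
            features.extend([red, green, blue])
    return features

features = image_to_features(image)
print(features)
# [255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255]
```

Four pixels times three colors gives twelve numbers, and those twelve numbers would fill twelve columns of one row in the data table.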

You know what’s pretty cool? Every time you look at a digital photograph, that’s you analyzing data, making sense of something that’s stored as a bunch of numbers. No matter who you are, you’re already a data analyst. You rockstar, you!


These pixel values are inputs that the computer will be learning from. I’m not a huge fan of the machine learning name for them (‘features’) because that word means all kinds of things in all kinds of disciplines. You might see people using other words instead: ‘inputs’, ‘variables’, or ‘predictors’.

Model and algorithm

Our features will form the basis of the model (that’s a fancy word for recipe) that the computer will use to go from pixel colors to labels.

But how? That’s the job of the machine learning algorithm. You can see how it works behind the scenes in my other article, but for now, let’s use an existing and awesome algorithm: your brain!

Supervised learning

I’d like you to be my machine learning system. Glance at the instances again and do some learning! What is this?

Classify this image using what you’ve learned from the examples above.

“Blorkle”? Yup. You’ve got this! What you just did was supervised learning. Awesome! You’ve now experienced the easiest type of learning there is. If you can frame your problem as supervised learning, do it. The others are harder… so let’s go meet one: unsupervised learning.

Summary: You’re dealing with supervised learning if the algorithm has the correct label handy for every instance. Later, it will use the model, or recipe, to label new instances, just like you did.
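For the curious, here is how that summary might look in Python. This is only a sketch under loud assumptions: the pixel values are invented, the second label "not blorkle" is hypothetical, and the algorithm is a simple nearest-average rule I picked because it is short, not the one your brain (or any particular library) uses.

```python
import math

def train(instances, labels):
    """The algorithm: turn labeled instances into a model.

    The model here is just the average feature values for each label.
    """
    sums, counts = {}, {}
    for features, label in zip(instances, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(model, features):
    """Use the model (the recipe) to label a new instance.

    It picks the label whose average is closest to the new features.
    """
    return min(model, key=lambda label: math.dist(model[label], features))

# Made-up training data: every instance comes with its correct label.
instances = [[200, 180, 150], [210, 170, 140], [30, 40, 60], [20, 50, 70]]
labels = ["blorkle", "blorkle", "not blorkle", "not blorkle"]

model = train(instances, labels)

# A new instance with no label attached. The model supplies one.
print(predict(model, [205, 175, 145]))
# blorkle
```

The split between `train` and `predict` mirrors the jargon: the algorithm needs the labels, while the model it produces only needs the features of a new instance.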