Fewer predictions, more insights, please.
Though it’s my job to develop algorithms nowadays, I still hold a healthy prejudice against deep learning and much of machine learning. Despite all its powerful predictions, I have not seen it generate many direct insights valuable to society. What is the point of having learning algorithms if we cannot learn from them ourselves?
Alright, I admit it, sometimes we learn from them through observation. For example, human Chess and Go players are getting better by looking at what the algorithms are doing. And there are surely many more things to be learned. But wouldn’t it be so much more efficient if the algorithms immediately told us why?
The problem with black boxes
As Professor Pentland describes in a recent Edge conversation, the problem is that these algorithms do not use all the physics and causal knowledge we currently have. They are just dumb neurons piecing together infinitely many small approximations. Such a model doesn’t generalise, and therefore it can still easily make mistakes.
When new data comes in that the algorithm cannot make sense of, it doesn’t realize this and just goes bonkers.
On top of that, it doesn’t know about all the possible biases in the training data.
And when it makes mistakes, we cannot explain why to our customers and managers. And certainly not to judges and juries, should you find yourself in court for an algorithmic decision gone wrong.
Professor Pentland proposes using more physical functions in our AI and building on known causal structures, to make insight generators instead of mere predictors. This kind of approach appeals more to me than the current brute-force algorithms.
Why we opt for white boxes
At my work, we have endless discussions about algorithms, between mathematicians, domain specialists (physicists) and business managers. In the end, we want to build smart machines that operate at the very edge of what is physically possible.
The mathematicians and data scientists propose many fancy new algorithms they would like to use. They actually work sometimes, though it’s still hard to find enough good data in a closed industry.
The physicists denounce most of them, because they use no information about the real world. So how can we trust what they output?
The business managers want guarantees of the performance. They want control and accountability. How do we get this?
We end up choosing more white box approaches. Linear regression with expected polynomial or sinusoidal functions. Perhaps simple decision trees. A few applications of Bayes’ rule if needed. And regularizing everything with known physical constraints. With the right effort, the results generally perform virtually as well as any neural network approach, yet with the benefit of knowing why.
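To make the "linear regression with expected functions" idea concrete, here is a minimal sketch, not our production code. It assumes a made-up signal with an offset, a slow drift and one oscillation whose period is known from the physics. The names `design_matrix` and `fit_white_box`, the period of 2.5 and all the numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def design_matrix(t, period):
    """Columns: offset, linear drift, and one sinusoid with a known period."""
    w = 2.0 * np.pi / period
    return np.column_stack([np.ones_like(t), t, np.sin(w * t), np.cos(w * t)])

def fit_white_box(t, y, period):
    """Ordinary least squares on the physically motivated basis functions."""
    coef, *_ = np.linalg.lstsq(design_matrix(t, period), y, rcond=None)
    return coef

# Synthetic data (an assumption for this sketch): offset 1.5, drift 0.3,
# oscillation amplitude 2.0 with period 2.5, plus a little noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)
y_true = 1.5 + 0.3 * t + 2.0 * np.sin(2.0 * np.pi * t / 2.5)
y = y_true + rng.normal(scale=0.1, size=t.size)

offset, drift, a_sin, a_cos = fit_white_box(t, y, period=2.5)
amplitude = float(np.hypot(a_sin, a_cos))  # amplitude of the oscillation
```

Every fitted number has a meaning a physicist can check: an offset, a drift rate and an amplitude. Fixing the period in advance is the simplest form of the physical constraint; the model is only allowed to explain the data in terms we already believe in.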
On top of that we slap algorithms that monitor the incoming data using our knowledge about how it should behave. Because if the data changes in unexpected ways, we can no longer trust our output predictions.
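A monitor like that can be very simple. The sketch below is only an illustration under assumptions: the function name `check_inputs`, the three checks and the example thresholds are hypothetical, and a real monitor would encode whatever is known about the specific sensor or process.

```python
import numpy as np

def check_inputs(x, lower, upper, max_step):
    """Return human-readable reasons why a batch of readings looks untrustworthy."""
    x = np.asarray(x, dtype=float)
    problems = []
    if np.isnan(x).any():
        problems.append("missing values")
    if (x < lower).any() or (x > upper).any():
        problems.append("outside physical range")
    if x.size > 1 and np.nanmax(np.abs(np.diff(x))) > max_step:
        problems.append("jump larger than physically plausible")
    return problems

# Hypothetical temperature readings in degrees Celsius.
good = check_inputs([20.1, 20.3, 20.2], lower=0.0, upper=100.0, max_step=1.0)
bad = check_inputs([20.1, 150.0, 20.2], lower=0.0, upper=100.0, max_step=1.0)
```

An empty list means the batch passed every check. Anything else is a reason, in plain words, not to trust the prediction that follows.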
We then end up with algorithms we can control, and explain when they do go wrong. However, they do not automatically update their physical assumptions. As such they also do not teach us anything about the world we do not already know.
In favor of black boxes
I would still like to try to defend black box algorithms.
1. They work
For one thing, they can work so spectacularly well! The current Deep Neural Networks are great, especially for those who own all the data. There are no better ways to detect images of cats, or paint them in the style of Van Gogh. One good application we always try is to use these black box algorithms as a benchmark to test whether our white boxes are missing some information.
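The benchmark idea can be sketched as follows. This is an illustration under assumptions, not a recipe: a k-nearest-neighbour average stands in for the black box (a neural network would play the same role), the data is synthetic, and all function names are hypothetical.

```python
import numpy as np

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def fit_predict_linear(features, x_train, y_train, x_test):
    """White box: least squares on hand-picked feature functions."""
    A_train = np.column_stack([f(x_train) for f in features])
    A_test = np.column_stack([f(x_test) for f in features])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef

def predict_knn(x_train, y_train, x_test, k=5):
    """Flexible stand-in for a black box: average of the k nearest neighbours."""
    dist = np.abs(x_test[:, None] - x_train[None, :])
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

# Synthetic data with a quadratic effect the first white box does not know about.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 3.0, size=300)
y = 1.0 + 2.0 * x + 0.8 * x**2 + rng.normal(scale=0.1, size=x.size)
x_train, y_train, x_test, y_test = x[:200], y[:200], x[200:], y[200:]

basic = [np.ones_like, lambda v: v]      # offset + linear term
extended = basic + [lambda v: v**2]      # add the missing quadratic term

rmse_black = rmse(y_test, predict_knn(x_train, y_train, x_test))
rmse_basic = rmse(y_test, fit_predict_linear(basic, x_train, y_train, x_test))
rmse_extended = rmse(y_test, fit_predict_linear(extended, x_train, y_train, x_test))
```

If the flexible model beats the white box by a wide margin on held-out data, as it does for `basic` here, the white box is probably missing a term. Once that term is added, the gap should largely close, and the white box still tells you why.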
2. They are old
Another argument in favor of black box algorithms is very simple. We trust them every day already. Despite all progress in psychology and neuroscience, we have little understanding of our own minds, let alone the workings of our social networks.
Our brains are pretty much a black box. However, we trust them, because they often give us verbal statements about their internal reasoning. Yet we also know our minds are full of biases, and often perceive the world differently from how it really is.
3. They can be improved
Perhaps we could use this knowledge by creating algorithms that output some verbose rationale. If we ask such an algorithm why it made a certain prediction, it will give some reasonable explanation of itself. DARPA calls this the third wave of AI. For example, if the HR algorithm rejects a candidate, it will say that it did so on the basis of the candidate’s skin color. After which we can say: “No, no, that is unacceptable, please update your neural network”. It may then do so accordingly, or just come up with better reasons. A downside is that we may then have to deal with the possibility of lying algorithms.
We have one influential colleague at work who likes Bayesian Networks. In those, you piece together a lot of observations and root causes with their base rates and use Bayes’ rule as the connections. When new observations come in, all probabilities get updated, and a likely cause can be identified. Afterwards you know exactly how the inference took place.
It cannot teach you about novel observations, but it can be quite helpful for root cause analysis in complex environments. I would say it’s one of many interesting pathways ahead.
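As a toy version of such a network, the sketch below has a single root-cause node and two observations, which is the simplest structure a Bayesian network can have. The causes, base rates and likelihoods are invented for illustration, and the code assumes the observations are independent given the cause.

```python
# Hypothetical base rates for each root cause (they sum to 1).
PRIOR = {"worn bearing": 0.02, "sensor fault": 0.05, "no fault": 0.93}

# Hypothetical P(observation is present | cause).
LIKELIHOOD = {
    "vibration": {"worn bearing": 0.90, "sensor fault": 0.10, "no fault": 0.02},
    "temperature spike": {"worn bearing": 0.70, "sensor fault": 0.30, "no fault": 0.05},
}

def posterior(observations):
    """Bayes' rule, assuming observations are independent given the cause."""
    scores = {}
    for cause, prior in PRIOR.items():
        p = prior
        for name, present in observations.items():
            p_obs = LIKELIHOOD[name][cause]
            p *= p_obs if present else 1.0 - p_obs
        scores[cause] = p
    total = sum(scores.values())
    return {cause: p / total for cause, p in scores.items()}

result = posterior({"vibration": True, "temperature spike": True})
most_likely = max(result, key=result.get)
```

With both symptoms present, the worn bearing goes from a base rate of 0.02 to a posterior of about 0.84, and every step of that update can be read straight from the two tables.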
What kind of teacher do you want?
Look, you could say AI is already teaching us. You ask search engines questions all day, and get answers in return. If you train them right, algorithms can feed you valuable knowledge. Here on Medium I learn from writers all over the world, fed to me by Medium’s AI.
However, you cannot ask it WHY it gives you those answers and articles. Good teaching is an open two-way process. And as long as the black boxes outperform the white boxes, the people who build money-generating oracles have no incentive to switch their ways. That means it is up to the rest of the world to build our future teachers. I just really hope more people find this worth the effort.
Originally posted at Towards Data Science