Deep Learning is enjoying a massive amount of hype at the moment. People want to use Neural Networks everywhere, but are they always the right choice? The following sections discuss this question, along with why Deep Learning is so popular right now. After reading this post, you will know the main disadvantages of Neural Networks and have a rough guideline for choosing the right type of algorithm for your Machine Learning problem. You will also learn about what I think is one of the major problems Machine Learning is facing right now.
Table of Contents:
- Why Deep Learning is so hyped
- Computational Power
- Neural Networks vs. traditional Algorithms
- Black Box
- Duration of Development
- Amount of Data
- Computationally Expensive
Why Deep Learning is so hyped
Deep Learning enjoys its current hype for four main reasons: data, computational power, the algorithms themselves, and marketing. We will discuss each of them in the following sections.
1. Data
One of the things that increased the popularity of Deep Learning is the massive amount of data available today, gathered over the last years and decades. This enables Neural Networks to really show their potential, since they get better the more data you feed into them.
In comparison, traditional Machine Learning algorithms reach a plateau beyond which more data no longer improves their performance, as the typical performance-versus-data chart illustrates.
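As a rough illustration of this plateau effect, you can inspect a traditional model's learning curve and watch whether its validation score keeps improving as training data grows. This is only a minimal sketch: the dataset, model, and training sizes below are illustrative assumptions, not the data behind the original chart.

```python
# Sketch: checking whether a traditional model's performance plateaus
# as more training data is added. Dataset and model are illustrative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

# Synthetic binary classification data
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)

# Evaluate the model at increasing fractions of the training set
sizes, _, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=[0.1, 0.3, 0.6, 1.0], cv=3,
)
for n, score in zip(sizes, val_scores.mean(axis=1)):
    # If the score stops improving between sizes, more data won't help much
    print(f"{n:5d} samples -> mean validation accuracy {score:.3f}")
```

If the last few rows show essentially the same accuracy, the model has hit the plateau described above, and collecting more data is unlikely to pay off.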
2. Computational Power
Another very important reason is the computational power that is available nowadays, which enables us to process more data. According to Ray Kurzweil, a leading figure in Artificial Intelligence, computational power is multiplied by a constant factor for each unit of time (e.g., doubling every year) rather than just being added to incrementally. This means that computational power is increasing exponentially.
3. Algorithms
The third factor behind the popularity of Deep Learning is the progress that has been made in the algorithms themselves. These recent breakthroughs are mostly about making the algorithms run much faster than before, which makes it possible to use more and more data.
4. Marketing
Marketing was also important. Neural Networks have been around for decades (they were first proposed in the 1940s) and have already been through cycles of hype as well as periods in which hardly anyone wanted to believe in or invest in them. The phrase "Deep Learning" gave them a fancy new name, which made a new hype possible; this is also why many people wrongly think that Deep Learning is a newly created field.
Other things contributed to the marketing of Deep Learning as well, such as the controversial "humanoid" robot Sophia from Hanson Robotics and several breakthroughs in major fields of Machine Learning that made it into the mass media.
Neural Networks vs. traditional Algorithms
Whether you should use Neural Networks or traditional Machine Learning algorithms is a hard question to answer, because it depends heavily on the problem you are trying to solve. This is also due to the "no free lunch" theorem, which roughly states that there is no single Machine Learning algorithm that performs well on every problem. One method may be well suited to a given problem and achieve good results, while another fails badly.
I personally see this as one of the most interesting parts of Machine Learning. It is also the reason why you need to be proficient with several algorithms, and why getting your hands dirty through practice is the only way to become a good Machine Learning Engineer or Data Scientist. Nevertheless, this post will provide you with some guidelines that should help you better understand when to use which type of algorithm.
The main advantage of Neural Networks lies in their ability to outperform nearly every other Machine Learning algorithm, but this comes with disadvantages that we will discuss and focus on throughout this post. As already mentioned, whether you should use Deep Learning depends mostly on the problem you are trying to solve. For example, in cancer detection, high performance is crucial, because the better the performance, the more people can be treated. But there are also Machine Learning problems where a traditional algorithm delivers a more than satisfying result.
1. Black Box
(Image Source: https://www.learnopencv.com/neural-networks-a-30000-feet-view-for-beginners/)
Probably the best-known disadvantage of Neural Networks is their "black box" nature, meaning that you don't know how or why your network came up with a certain output. For example, when you feed an image of a cat into a neural network and it predicts a car, it is very hard to understand what caused that prediction. In comparison, algorithms like Decision Trees are very interpretable: when the features are human-interpretable, it is much easier to understand the cause of a mistake. This matters because in some domains interpretability is essential.
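To make the interpretability contrast concrete, here is a minimal sketch of what "interpretable" means for a Decision Tree: scikit-learn can print the learned model as plain if/else rules, so every prediction can be traced to explicit feature thresholds. The dataset and depth limit are illustrative choices, not from the original article.

```python
# Sketch: why decision trees are considered interpretable.
# The iris dataset and max_depth=2 are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(
    data.data, data.target
)

# export_text renders the learned rules as human-readable thresholds --
# you can follow exactly why any sample was classified the way it was.
rules = export_text(tree, feature_names=list(data.feature_names))
print(rules)
```

A neural network trained on the same data offers no comparable printout; its "reasoning" is spread across thousands of weights, which is exactly the black-box problem described above.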
This is why many banks don't use Neural Networks to predict whether a person is creditworthy: they need to explain to their customers why a loan was denied. Otherwise, customers may feel treated unfairly by the bank because they cannot understand why they didn't get the loan, which could lead them to switch banks. The same is true for sites like Quora. If they decided to delete a user's account based on a Machine Learning algorithm, they would need to explain to that user why they did it. I doubt anyone would be satisfied with an answer such as "that's what the computer said".
Another scenario is important business decisions driven by Machine Learning. Can you imagine the CEO of a big company making a decision about millions of dollars without understanding why it should be made, just because "the computer" says so?
2. Duration of Development
(Image Source: http://slideplayer.com/slide/6816119/)
Although there are libraries like Keras that make the development of Neural Networks fairly simple, you sometimes need more control over the details of the algorithm, for example when you are trying to solve a difficult Machine Learning problem that no one has tackled before.
In that case you would probably use TensorFlow, which gives you many more possibilities but is correspondingly more complicated, so development takes much longer (depending on what you want to build). The question then arises for a company's management whether it is really worth having expensive engineers spend weeks developing something that might be solved much faster with a simpler algorithm.
3. Amount of Data
Neural Networks usually require much more data than traditional Machine Learning algorithms: at least thousands, if not millions, of labeled samples. This isn't an easy problem to deal with, and many Machine Learning problems can be solved well with less data if you use other algorithms.
Although there are some cases where Neural Networks deal well with little data, most of the time they don't. In such cases, a simple algorithm like Naive Bayes, which copes much better with little data, would be the appropriate choice.
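As a small demonstration of the Naive Bayes point, the sketch below trains a Gaussian Naive Bayes classifier on just six labeled samples and still gets a usable decision boundary. The single-feature toy data is an illustrative assumption; it is not from the article.

```python
# Sketch: Naive Bayes can produce a usable model from very few labeled
# samples. The one-feature toy dataset below is purely illustrative.
import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[1.0], [1.2], [0.9], [5.0], [5.5], [4.8]])  # one feature
y = np.array([0, 0, 0, 1, 1, 1])                          # two classes

clf = GaussianNB().fit(X, y)
preds = clf.predict([[1.1], [5.2]])
print(preds)  # -> [0 1]
```

A neural network trained on six samples would almost certainly overfit or fail to converge to anything meaningful; Naive Bayes only has to estimate a mean and variance per class, which is why it tolerates tiny datasets.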
(Image Source: https://abm-website-assets.s3.amazonaws.com/wirelessweek.com/s3fs-public/styles/content_body_image/public/embedded_image/2017/03/gpu%20fig%202.png?itok=T8Q8YSe-)
4. Computationally Expensive
Neural Networks are usually also more computationally expensive than traditional algorithms. State-of-the-art Deep Learning methods, which enable the successful training of really deep Neural Networks, can take several weeks to train from scratch. In comparison, most traditional Machine Learning algorithms take much less time to train, ranging from a few minutes to a few hours or days.
The amount of computational power a Neural Network needs depends heavily on the size of your data, but also on how deep and complex the network is. For example, a Neural Network with one layer of 50 neurons will train much faster than a Random Forest with 1,000 trees, while a Neural Network with 50 layers will be much slower than a Random Forest with only 10 trees.
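The model-size comparison above can be tried directly. The sketch below times a small one-hidden-layer network against a 1,000-tree Random Forest; the dataset size, model configurations, and iteration cap are illustrative assumptions, and the absolute timings will vary heavily with hardware.

```python
# Sketch: rough training-time comparison between a small neural network
# and a large random forest. Sizes and dataset are illustrative; actual
# timings depend entirely on your hardware.
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

for name, model in [
    ("small MLP (1 hidden layer, 50 units)",
     MLPClassifier(hidden_layer_sizes=(50,), max_iter=50, random_state=0)),
    ("random forest (1,000 trees)",
     RandomForestClassifier(n_estimators=1000, random_state=0)),
]:
    start = time.perf_counter()
    model.fit(X, y)  # fit each model and report wall-clock training time
    print(f"{name}: {time.perf_counter() - start:.2f}s")
```

On most machines the tiny network finishes faster here, but flip the proportions (a 50-layer network against a 10-tree forest) and the ordering reverses, which is the point of the comparison in the text.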
Great! Now you know that Neural Networks are great for some tasks but not as great for others. You learned that huge amounts of data, more computational power, better algorithms, and intelligent marketing increased the popularity of Deep Learning and made it one of the hottest fields right now. You have also learned that Neural Networks can beat nearly every other Machine Learning algorithm, along with the disadvantages that come with that: their "black box" nature, longer development time (depending on your problem), the amount of data they require, and their high computational cost.
In my opinion, Deep Learning is a little over-hyped at the moment, and expectations exceed what can really be done with it right now. But that doesn't mean it isn't useful. I think we live in a Machine Learning renaissance, because the field is becoming more and more democratized, which enables more and more people to build useful products with it. There are a lot of problems out there that can be solved with Machine Learning, and I am sure this will happen in the next few years.
One of the major problems is that only a few people understand what can really be done with Machine Learning and know how to build successful Data Science teams that bring real value to a company. On one hand, we have PhD-level engineers who are geniuses regarding the theory behind Machine Learning but lack an understanding of the business side. On the other hand, we have CEOs and managers who have no idea what Deep Learning can really do and think it will solve all of the world's problems in the years to come. In my opinion, we need more people who bridge this gap, which will result in more products that are useful for our society.
This article was initially published at machinelearning-blog.