Will Deep Learning Hit the Wall?

Original article was published by Andrey Semenyuk on Deep Learning on Medium



Better algorithms or more computing power?

If you are interested in deep learning, you may have already heard about a recent paper published by researchers from universities and labs in the USA, South Korea, and Brazil.

The authors are Neil C. Thompson (MIT Computer Science and A.I. Lab and MIT Initiative on the Digital Economy), Kristjan Greenewald (MIT-IBM Watson AI Lab), Keeheon Lee (Underwood International College, Yonsei University, Seoul), and Gabriel F. Manso (UnB FGA, University of Brasilia).


In their study, they analyzed more than 1,000 research papers in the domains of image classification, object detection, question answering, named entity recognition, and machine translation, and found that progress in deep learning performance has depended heavily on increases in computing power.

Generally speaking, progress in any computing domain can be achieved in two main ways:

  • either by delivering more computing power, which means not only faster CPUs or more nodes, but also more memory and storage
  • or by developing new algorithms and methods

So, the researchers found that a significant share of the recent progress in the domains listed above came from increases in computational power rather than from the creation and adoption of new algorithms. To put it simply, many of deep learning's achievements in recent years happened because computers became faster and can now execute the same old algorithms much faster than before.

Is that bad? Not necessarily. An increase in computing power is neutral in itself, neither good nor bad. It is simply a fact the world has to live with: computing power keeps growing over time, and if you look around, you will find that it is responsible for better performance in many (if not all) areas.

And this is very intuitive: anything that runs on a computer can run faster on a faster CPU, or can produce better results because a faster CPU or larger memory lets the task process more data.

Machine learning has always been computationally expensive by design, as the study also notes.

And after all, the study does not claim that computing power was the only factor driving machine learning. But the researchers made two interesting points:

  1. “actual computational burden of deep learning models is scaling more rapidly than (known) lower bounds from theory, suggesting that substantial improvements might be possible.”
  2. “if progress continues along current lines, these computational requirements will rapidly become technically and economically prohibitive.”

We could, of course, find counterarguments to both points. For example, if the computational burden of deep learning models is scaling more rapidly than the lower bounds from theory, we could simply assume that the theory is not yet precise enough.

Likewise, our estimates of the technical and economic limits on computational requirements rest on what we know today: the methods currently used to produce computing resources, the current costs of production and ownership, and projections made from our present understanding.

But maybe instead of arguing with the conclusions of the research, we should consider another point: if we can find new algorithms or improve current ones, we can get a significant boost in the quality of the results, whatever the task (classification, object detection, machine translation, etc.).

When I look at the experiments with GANs and the improvements in their architecture over the past few years, I see a good example of an area where new methods have produced brilliant results.

The researchers estimate that three years of algorithmic improvement is equivalent to a 10-fold increase in computing power.
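To get a feel for that figure, here is a rough back-of-the-envelope sketch in Python. It converts "10 times in three years" into an equivalent yearly factor and a doubling time. The assumption that the gain accumulates as steady exponential growth is mine, made only for illustration, and the function names are hypothetical; the paper states only the three-year equivalence.

```python
import math

# Figure quoted in the article: 3 years of algorithmic progress ~ 10x compute.
YEARS = 3.0
COMPUTE_FACTOR = 10.0


def annual_factor(total_factor: float, years: float) -> float:
    """Equivalent per-year multiplier, ASSUMING steady exponential growth."""
    return total_factor ** (1.0 / years)


def doubling_time_years(total_factor: float, years: float) -> float:
    """Years for the equivalent compute to double, under the same assumption."""
    return years * math.log(2.0) / math.log(total_factor)


per_year = annual_factor(COMPUTE_FACTOR, YEARS)
doubling = doubling_time_years(COMPUTE_FACTOR, YEARS)

print(f"Equivalent compute gain per year: {per_year:.2f}x")   # 2.15x
print(f"Equivalent doubling time: {doubling * 12:.1f} months")  # 10.8 months
```

Under that assumption, algorithmic progress alone would be worth roughly a doubling of compute every 11 months or so, which shows why the question of algorithms versus hardware matters.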

But as the research also mentions, a new and improved algorithm sometimes requires more computing power itself. Some algorithms are simply more resource-hungry than others. That can be a problem in its own right, although a temporary one: we may just need more computing power before we can try new ways of training and then running models.

Growth in computing power is what we have had in past years and hopefully will have in the future, which means that we will surely see improvements in machine learning. But will it be steady improvement driven only by more computing power, or will we see a significant boost from better algorithms?

I personally hope for the latter.

Resources

The Computational Limits of Deep Learning by Neil C. Thompson, Kristjan Greenewald, Keeheon Lee, Gabriel F. Manso