The unreasonable effectiveness of data

by Shane Legg

We recently had a visitor to the Gatsby Unit talk about his work in reinforcement learning, in particular the use of planning and forward models to speed up the learning of difficult tasks. The substance of his talk was good, but that’s not what I want to discuss: it was the motivation he gave in his introduction that bothered me. Basically, he said that humans learn much faster than reinforcement learning algorithms, and thus we should try to figure out how to make our algorithms learn faster.

Really? It takes babies half a year or more to learn to control their limbs in fairly basic ways. How many reinforcement learning algorithms get run for six months in a single learning trial? As adults, if we try to learn some new control task, such as balancing a pole, it can take hours of effort despite having years of prior motor control experience. A reinforcement learning algorithm, on the other hand, can learn to solve some of these problems in seconds with no prior experience at all. In a few minutes, algorithms can even learn the much more difficult double pole balancing problem. This is a problem that would take me months to master, if indeed I could ever get the hang of it. If we think about problems that humans can learn to solve quite quickly, but that machines have not yet mastered, there is usually a massive amount of prior knowledge that people are using, knowledge that may have taken years to acquire.

It appears to me that we now have some very powerful learning algorithms, in the sense that they can learn moderately complex control tasks with very little data. The performance of these algorithms is already significantly superhuman in some contexts. This is true not just for reinforcement learning, but for many kinds of machine learning algorithms. Unfortunately, for the more ambitious goals of artificial intelligence, these highly data-efficient algorithms, run on our moderately sized data sets, haven’t worked very well. One key reason for this, I suspect, is that these are inherently messy and complex problems that can only be solved with truly massive amounts of data. I think that a lot of the human abilities that AI research has struggled with probably fall into this category, e.g. vision, language and common-sense knowledge.

Something similar has been expressed in “The Unreasonable Effectiveness of Data” by Alon Halevy, Peter Norvig and Fernando Pereira. One of the examples they cite is that for years people tried to construct a grammar for the English language by hand. Even at 1,700 pages, however, the grammar was still incomplete! Similar efforts have gone into systems to translate one language into another. To start with, a manual approach looked promising, as relatively few rules cover a significant percentage of the cases. However, as you try to improve the system, the number of rules needed starts to grow rapidly, and eventually explodes. The solution, which is now used by the best translation engines, has been to move to a data-driven approach: you take a learning algorithm that scales well and then feed it massive quantities of data. Given enough data, all sorts of subtleties and complexities of the language become statistically learnable. I think there is a key idea here for wannabe AGI designers.
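
To make the data-driven idea a little more concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions, not anything taken from the paper or from a real translation engine: the function names `train_bigrams` and `prefer` and both tiny corpora are invented for the example. Instead of writing a rule that says English speakers prefer "strong tea" to "powerful tea", the program just counts adjacent word pairs and lets the counts decide.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count adjacent word pairs (bigrams) across a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return counts


def prefer(counts, option_a, option_b):
    """Return whichever two-word phrase was seen more often, or None on a tie."""
    a = counts[tuple(option_a.split())]
    b = counts[tuple(option_b.split())]
    if a == b:
        return None  # no evidence either way
    return option_a if a > b else option_b


# Hypothetical corpora, invented for this sketch.
small_corpus = ["I drink tea"]
large_corpus = small_corpus + [
    "she likes strong tea",
    "strong tea keeps me awake",
    "a powerful engine",
    "he made strong tea",
]

small_model = train_bigrams(small_corpus)
large_model = train_bigrams(large_corpus)

print(prefer(small_model, "strong tea", "powerful tea"))
print(prefer(large_model, "strong tea", "powerful tea"))
```

With the one-sentence corpus the model has no basis for a choice; with a few more sentences the preference falls out of the counts without anyone having written a rule for it. Real systems are of course vastly more sophisticated, but the scaling story is the same: a simple learner whose behaviour improves as the data grows.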