We recently had a visitor to the Gatsby Unit talk about his work in reinforcement learning, in particular the use of planning and forward models to speed up the learning of difficult tasks. The substance of his talk was good, but that’s not what I want to talk about: it was the motivation he gave in his introduction that bothered me. Basically he said that humans learn much faster than reinforcement learning algorithms, and thus we should try to figure out how to make our algorithms learn faster.
Really? It takes babies half a year or more to learn to control their limbs in fairly basic ways. How many reinforcement learning algorithms get run for six months in a single learning trial? As adults, if we try to learn some new control task, such as balancing a pole, it can take hours of effort despite years of prior motor control experience. A reinforcement learning algorithm, on the other hand, can learn to solve some of these problems in seconds with no prior experience at all. In a few minutes algorithms can even learn the much more difficult double pole balancing problem. This is a problem that would take me months to master, if indeed I could ever get the hang of it. If we think about problems that humans can learn to solve quite quickly, but that machines have not yet mastered, there is usually a massive amount of prior knowledge that people are using, knowledge that may have taken years to acquire.
It appears to me that we now have some very powerful learning algorithms, in the sense that they can learn moderately complex control tasks with very little data. The performance of these algorithms is already significantly superhuman in some contexts. This is true not just for reinforcement learning, but for many kinds of machine learning algorithms. Unfortunately, for the more ambitious goals of artificial intelligence, these highly data-efficient algorithms applied to our moderately sized data sets haven’t worked very well. One key reason for this, I suspect, is that these are inherently messy and complex problems that can only be solved with truly massive amounts of data. I think that a lot of the human abilities that AI research has struggled with probably fall into this category, e.g. vision, language and common sense knowledge.
Something similar has been expressed in “The Unreasonable Effectiveness of Data” by Alon Halevy, Peter Norvig and Fernando Pereira. One of the examples they cite is that for years people tried to construct a grammar for the English language by hand. Even at 1,700 pages, however, the grammar was still incomplete! Similar efforts have gone into systems to translate one language into another. To start with, a manual approach looked promising, as relatively few rules cover a significant percentage of the cases. However, as you try to improve the system, the number of rules needed starts to grow rapidly, and eventually explodes. The solution, which is now used by the best translation engines, has been to move to a data-driven approach: you take a learning algorithm that scales well and then feed it massive quantities of data. Given enough data, all sorts of subtleties and complexities of the language become statistically learnable. I think there is a key idea here for wannabe AGI designers.
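To make the data-driven idea concrete, here is a minimal sketch of a bigram model that learns which word tends to follow another purely by counting, with no hand-written grammar rules. This is only an illustration, not the method used by any real translation engine; the function names and the toy corpus are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigram_counts(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair every word with its immediate successor.
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical toy corpus; a real system would use vastly more text.
toy_corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the cat chased the dog",
]
counts = train_bigram_counts(toy_corpus)
```

With only three sentences the model knows almost nothing, but the code does not change as the corpus grows: every extra sentence just refines the counts, which is the property that lets this style of approach benefit from massive data.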
Why don’t you use that method to have AI learn about AI?
Oh! I forgot, this is the very goal of seed AI.
Maybe you are missing some ingredient?
Great post, Shane. Thanks for the link to the article by Norvig, too; that will be helpful for my research.
“So, follow the data. Choose a representation that can use unsupervised learning on unlabeled data, which is so much more plentiful than labeled data. Represent all the data with a nonparametric model rather than trying to summarize it with a parametric model, because with very large data sources, the data holds a lot of detail.”
I think the ideas discussed in this post, while not unknown, are greatly undervalued by the community.
It’s also interesting from an AGI development perspective: even if you had the right hardware and a pretty good algorithm, it might take quite a long time and a huge amount of data before your system developed things like human common sense knowledge or the ability to model human ethical decision making.
Oh yes! This is what I try to explain in my blog and what I am using in my predictor project.
How can humans solve exponential problems? There are unsolvable problems, so why can humans solve these problems?
Because all existing problems are restricted by reality.
Using logic and mathematics you can construct problems which need exponential resources, but these resources do not exist.
I think it is possible to use a huge dataset to construct a kind of “real general prior knowledge” that speeds up algorithms by giving them a general context.
If we start from the point that not all of the 2^1000 possible bit strings of length 1000 exist, then knowing which of these bit strings do exist is not useless information.