Am I the only one who, upon hearing the year 2010, imagines some date far off in the future? I think I felt the same way in the weeks before 2000, so I’m sure it will pass. Anyway, another year has gone, indeed another decade, and it’s time for my annual review of predictions. You can find my last annual post here.
It’s been an interesting year in which I’ve been exposed to far more neuroscience than ever before. What I’ve learnt, plus other news I’ve absorbed during the year, has helped to clarify my thinking on the future of AI. First, let’s begin with computer power. I recently gave a talk at the Gatsby Unit on the singularity in which I used the following graph showing the estimated LINPACK scores of the fastest computers over the last 50 years.
The first two points beyond 2010 are for some supercomputers that are already partly constructed. In the past, performance estimates for these kinds of machines near to their delivery have been reasonably accurate, so I’ve put these on the graph. Rather more speculative is the 2019 data point for the first ExaFLOPS machine. IBM is in discussions about how to put this machine together based on the technology used in the 20 PetaFLOPS machine due in a year and a bit. Based on articles on supercomputer sites like TOP500, it appears to be a fairly mainstream opinion that this target should be achievable. Nevertheless, 9 years is a while away so I’ve marked it in grey.
First observation: just like the people who told me in 1990 that exponential growth in supercomputer power couldn’t continue for another decade, the people who told me this in 2000 were again completely wrong. Ha ha, told you so! So let me make another prediction: for the next decade this pattern will once again roughly hold, taking us to about 10^18 FLOPS by 2020.
Second observation: I’ve always been a bit sceptical of Kurzweil’s claim that computer power growth was double exponential, but having spent some time putting together the data for this graph and attempting to compensate for changes in measurement etc., I’m now thinking that there is some evidence for it. That said, I think it’s unlikely to remain double exponential much longer.
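To make the distinction concrete, here is a small Python sketch of how the two kinds of growth differ. The numbers are synthetic and chosen only for illustration; they are not the data behind the graph. With plain exponential growth the log of the series rises by a constant amount per step, whereas with double exponential growth those steps themselves keep getting bigger.

```python
import math

def log10_steps(values):
    """Differences between successive log10 values, i.e. orders of magnitude gained per step."""
    logs = [math.log10(v) for v in values]
    return [b - a for a, b in zip(logs, logs[1:])]

# Synthetic series (assumed for illustration, not measured data).
exponential = [10 ** (2 * t) for t in range(5)]            # log10 grows linearly in t
double_exponential = [10 ** (1.5 ** t) for t in range(5)]  # log10 itself grows exponentially in t

exp_steps = log10_steps(exponential)         # roughly constant steps
dbl_steps = log10_steps(double_exponential)  # steps that increase over time

print("exponential steps:       ", exp_steps)
print("double exponential steps:", dbl_steps)
```

On a log-scale plot this is the difference between a straight line and a curve that bends upwards.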
Third observation: it looks like we’re heading towards 10^20 FLOPS before 2030, even if things slow down a bit from 2020 onwards. That’s just plain nuts. Let me try to explain just how nuts: 10^20 is about the number of neurons in all human brains combined. It is also about the estimated number of grains of sand on all the beaches in the world. That’s a truly insane number of calculations in 1 second.
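For a rough sense of the growth rates these targets imply, the sketch below computes the doubling time needed to get from one milestone to the next, assuming steady exponential growth in between. The 2011 delivery year for the 20 PetaFLOPS machine is my reading of "a year and a bit" and should be treated as an assumption; the function name is just for illustration.

```python
import math

def doubling_time_years(flops_start, year_start, flops_end, year_end):
    """Doubling time implied by steady exponential growth between two milestones."""
    doublings = math.log2(flops_end / flops_start)
    return (year_end - year_start) / doublings

# 20 PetaFLOPS (assumed 2011) to the speculative ExaFLOPS machine in 2019.
t_2010s = doubling_time_years(2e16, 2011, 1e18, 2019)

# 10^18 FLOPS by 2020 to 10^20 FLOPS by 2030.
t_2020s = doubling_time_years(1e18, 2020, 1e20, 2030)

print(f"implied doubling time, 2011 to 2019: {t_2010s:.2f} years")
print(f"implied doubling time, 2020 to 2030: {t_2020s:.2f} years")
```

Under these assumptions the second figure comes out slightly longer than the first, so 10^20 by 2030 is still reachable with somewhat slower growth than the 2010s targets imply.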
Desktop performance is also continuing this trend. I recently saw that a PC with just two high end graphics cards delivers around 10^13 FLOPS of SGEMM performance. I also read a paper recently showing that less powerful versions of these cards give around 100x performance increases over CPU computation when learning large deep belief networks.
By the way, in case you think the brain is doing weird quantum voodoo: I had a chat to a quantum physicist here at UCL about the recent claims that there is some evidence for this. He’d gone through the papers making these claims with some interest as they touch on topics close to his area of research. His conclusion was that it’s a lot of bull as they make assumptions (not backed up with new evidence) in their analysis that essentially everybody in the field believes to be false, among other problems.
Conclusion: computer power is unlikely to be the issue anymore in terms of AGI being possible. The main question is whether we can find the right algorithms. Of course, with more computer power we have a more powerful tool with which to hunt for the right algorithms and it also allows any algorithms we find to be less efficient. Thus growth in computer power will continue to be an important factor.
Having dealt with computation, now we get to the algorithm side of things. One of the big things influencing me this year has been learning about how much we understand about how the brain works, in particular, how much we know that should be of interest to AGI designers. I won’t get into it all here, but suffice to say that just a brief outline of all this information would be a 20 page journal paper (there is currently a suggestion that I write such a paper next year with some Gatsby Unit neuroscientists, but for the time being I’ve got too many other things to attend to). At a high level what we are seeing in the brain is a fairly sensible looking AGI design. You’ve got hierarchical temporal abstraction formed for perception and action combined with more precise timing motor control, with an underlying system for reinforcement learning. The reinforcement learning system is essentially a type of temporal difference learning, though unfortunately at the moment there is evidence in favour of actor-critic, Q-learning and also Sarsa type mechanisms; this picture should clear up in the next year or so. The system contains a long list of features that you might expect to see in a sophisticated reinforcement learner, such as pseudo rewards for informative cues, inverse reward computations, uncertainty and environmental change modelling, dual model based and model free modes of operation, and things to monitor context. It even seems to have mechanisms that reward the development of conceptual knowledge. When I ask leading experts in the field whether we will understand reinforcement learning in the human brain within ten years, the answer I get back is “yes, in fact we already have a pretty good idea how it works and our knowledge is developing rapidly.”
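For readers who haven't met these algorithms, here is a minimal Python sketch of the textbook tabular forms of the Q-learning and Sarsa temporal difference errors. It says nothing about how the brain implements them; the states, actions, learning rate and discount factor are all made up for illustration. The only difference between the two is which next action they bootstrap from. Actor-critic methods, not shown here, compute a similar error from a state value estimate and use it to adjust a separate policy.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (illustrative value)
GAMMA = 0.9  # discount factor (illustrative value)

def td_error_q_learning(Q, s, a, r, s_next, actions):
    """Off-policy error: bootstrap from the best-valued action in the next state."""
    best_next = max(Q[(s_next, b)] for b in actions)
    return r + GAMMA * best_next - Q[(s, a)]

def td_error_sarsa(Q, s, a, r, s_next, a_next):
    """On-policy error: bootstrap from the action actually taken in the next state."""
    return r + GAMMA * Q[(s_next, a_next)] - Q[(s, a)]

def update(Q, s, a, td_error):
    """Move the estimate for (s, a) a small step in the direction of the error."""
    Q[(s, a)] += ALPHA * td_error

# Hypothetical two-action problem with one informative estimate already learnt.
Q = defaultdict(float)
actions = ["left", "right"]
Q[("s1", "left")] = 1.0

# Same transition (s0, right, reward 0, s1), but the agent then takes "right" in s1.
delta_q = td_error_q_learning(Q, "s0", "right", 0.0, "s1", actions)
delta_sarsa = td_error_sarsa(Q, "s0", "right", 0.0, "s1", "right")

print("Q-learning TD error:", delta_q)
print("Sarsa TD error:     ", delta_sarsa)
```

Here Q-learning sees a positive error because a better action was available in the next state, while Sarsa sees none because the action actually taken has no estimated value. That is the kind of difference experiments would need to pick up in order to tell the mechanisms apart.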
The really tough nut to crack will be how the cortical system works. There is a lot of effort going into this, but based on what I’ve seen, it’s hard to say just how much real progress is being made. From the experimental neuroscience side of things we will soon have much more detailed wiring information, though this information by itself is not all that enlightening. What would be more useful is to be able to observe the cortex in action and at the moment our ability to do this is limited. Moreover, even if we could, we would still most likely have a major challenge ahead of us to try to come up with a useful conceptual understanding of what is going on. Thus I suspect that for the next 5 years, and probably longer, neuroscientists working on understanding cortex aren’t going to be of much use to AGI efforts. My guess is that sometime in the next 10 years developments in deep belief networks, temporal graphical models, liquid computation models, slow feature analysis etc. will produce sufficiently powerful hierarchical temporal generative models to essentially fill the role of cortex within an AGI. I hope to spend most of next year looking at this so in my next yearly update I should have a clearer picture of how things are progressing in this area.
Right, so my prediction for the last 10 years has been for roughly human level AGI in the year 2025 (though I also predict that sceptics will deny that it’s happened when it does!). This year I’ve tried to come up with something a bit more precise. In doing so what I’ve found is that while my mode is about 2025, my expected value is actually a bit higher at 2028. This is not because I’ve become more pessimistic during the year, rather it’s because this time I’ve tried to quantify my beliefs more systematically and found that the probability I assign between 2030 and 2040 drags the expectation up. Perhaps more useful is my 90% credibility region, which from my current belief distribution comes out at 2018 to 2036. If you’d like to see this graphically, David McFadzean put together a graph of my prediction.
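To see how a mode of 2025 and an expected value of 2028 can sit together, here is a small Python sketch using a right-skewed toy distribution. This is not my actual belief distribution: the gamma-like shape, the 2010 origin and the parameters are all hypothetical, picked only so that the mode and mean land near those two years. It is not tuned to reproduce the 2018 to 2036 region, so the interval it prints will differ.

```python
import math

START = 2010         # hypothetical origin of the toy distribution
SHAPE, SCALE = 6, 3  # hypothetical gamma-like parameters: mode at +15 years, mean at +18

# Discrete probabilities over years, proportional to a gamma density.
years = list(range(START + 1, 2101))
weights = [(y - START) ** (SHAPE - 1) * math.exp(-(y - START) / SCALE) for y in years]
total = sum(weights)
probs = [w / total for w in weights]

mode = years[probs.index(max(probs))]            # single most likely year
mean = sum(y * p for y, p in zip(years, probs))  # expected value

def quantile(q):
    """Smallest year whose cumulative probability reaches q."""
    cumulative = 0.0
    for y, p in zip(years, probs):
        cumulative += p
        if cumulative >= q:
            return y
    return years[-1]

low, high = quantile(0.05), quantile(0.95)  # central 90% region

print(f"mode: {mode}, mean: {mean:.1f}, 90% region: {low} to {high}")
```

The long right-hand tail pulls the mean above the mode without moving the peak, which is the same effect as the probability between 2030 and 2040 dragging the expectation up.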