A little over a week ago I felt rather honoured to be reviewing a new submission by a living legend of artificial intelligence, Ray Solomonoff. Sadly the great man passed away just two days later, at the age of 83. That he was still writing papers until the end of his life is a great testament to the passion he had for research.
I’ve been thinking about what I might write about his work. Rather than quoting something pertaining to complexity, prior probability or induction, I’ve decided to quote a relatively unknown paper that shows something of his futurist interests. The paper is called “The time scale of artificial intelligence: Reflections on social effects” and was published in 1985.
The last 100 years have seen the introduction of special and general relativity, automobiles, airplanes, quantum mechanics, large rockets and space travel, fission power, fusion bombs, lasers, and large digital computers. Any one of these might take a person years to appreciate and understand. Suppose that they had all been presented to mankind in a single year! This is the magnitude of “future shock” that we can expect from our AI expanded scientific community. In the past, introduction of a new technology into the culture has usually been rather slow, so we had time to develop some understanding of its effect on us, to adjust the technology and culture for an optimum “coming together”. Even with a slow introduction, our use of a new technology has sometimes been very poor.
…We should be able to get our intelligent machines to explain each new technology in a way that is intelligible to man. If this can’t be done, and the new technology is essentially un-understandable to man, then man would be foolish indeed to use it in any way!
However, understanding does not always assure success in dealing with very complex problems. Mankind will continue to have to make decisions under conditions of uncertainty. In the past he has usually chosen his courses of action relatively blindly — controlled more by his own perceived wants and needs than by considerations of the likelihoods of alternative possible futures and their effects upon him.
Am I the only one who, upon hearing the year 2010, imagines some date far off in the future? I think I felt the same way in the weeks before 2000, so I’m sure it will pass. Anyway, another year has gone, indeed another decade, and it’s time for my annual review of predictions. You can find my last annual post here.
It’s been an interesting year in which I’ve been exposed to far more neuroscience than ever before. What I’ve learnt, plus other news I’ve absorbed during the year, has helped to clarify my thinking on the future of AI. First, let’s begin with computer power. I recently gave a talk at the Gatsby Unit on the singularity in which I used the following graph showing the estimated LINPACK scores of the fastest computers over the last 50 years.
Some of you might know about the Lighthill report from 1973 which was deeply critical of progress in AI. This report was the main factor behind cutting the funding of AI research in the UK, and seems to have contributed to the more global cuts around this time known as the “AI winter”. Via Yee Whye Teh I recently came across a BBC debate between James Lighthill and three supporters of AI research: Richard Gregory, John McCarthy and Donald Michie. You can download the televised debate from here, though be warned that it’s 160MB.
Now, 36 years later, it’s interesting to think about how the speakers’ various views and predictions have played out. Overall, Lighthill’s analysis felt the most coherent to me, and I’d say that what has since happened largely backs him up, though it can be argued that he helped to cause this outcome. Granted, he slowed AI down a lot, but 36 years is a rather long time, and in the types of problems he was focusing on there hasn’t been much progress. In response, the other debaters mostly just pointed to small advances that had occurred and indicated that they felt more were on the way. Lighthill then denied that these advances showed any real progress towards intelligence.
This feels a lot like today: sceptics say that AI has made no progress, optimists point to lots of advances, and sceptics then say that these advances are not what they consider to be real intelligence. I think this points to perhaps the most fundamental problem in the field: if you can’t define intelligence, how do you judge whether progress is being made? It’s as true today as it was then, and it’s why I think that trying to define intelligence is so important. I like the fact that they keep on saying that an intelligent machine should be able to perform well in a “wide range of situations”, because, of course, this is very much the view of intelligence that I have taken.
While I was visiting Marcus Hutter at ANU a month or so ago, I got talking to one of his students, Joel Veness, who’s working on making computable approximations to AIXI. Joel has a background in writing Go algorithms so is perhaps perfect for the job. I saw recently that the Monte Carlo AIXI paper describing this work is now available online if you want to check it out.
The basic idea goes as follows. In full AIXI you have an extended Solomonoff predictor to model the environment, and an expectimax tree to compute the optimal action. In order to scale AIXI down and still have something of roughly the same form, you need a tractable replacement for both of these components. Here’s what they did: in place of extended Solomonoff induction, a version of context tree weighting (CTW) is used, extended for this application much as Hutter had to extend Solomonoff induction to active environments for AIXI. In place of the expectimax tree search, a Monte Carlo tree search is used, similar to that used in Go playing programs: initial selection within the tree, tree expansion, a so-called play-out policy, followed by a backup stage to propagate the new information back into the model. You have to be a bit careful here because, as the agent imagines different future observations and actions, it has to update its hypothetical beliefs to reflect these in order for its analysis and decision making to be consistent. Then, once this possible future has been evaluated, the effect on the agent’s model of the world has to be unwound so that the agent doesn’t, in effect, start confusing its fantasies with its present reality.
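That unwinding step is the subtle part, so here’s a toy Python sketch of the pattern. This is not the paper’s CTW agent: `CountingModel` and `rollout_value` are invented purely for illustration, standing in for the real model and play-out policy. The point is just that the model is conditioned on imagined observations during a rollout, and those updates are reverted afterwards.

```python
import random

class CountingModel:
    """A toy sequence model: predicts the next bit from counts, and
    supports exact revert so imagined futures can be unwound."""
    def __init__(self):
        self.counts = {0: 1, 1: 1}  # Laplace-style pseudo-counts
        self.history = []           # stack of applied updates

    def predict(self, bit):
        return self.counts[bit] / (self.counts[0] + self.counts[1])

    def update(self, bit):
        self.counts[bit] += 1
        self.history.append(bit)

    def revert(self, n):
        # undo the last n updates, restoring the pre-rollout state
        for _ in range(n):
            self.counts[self.history.pop()] -= 1

def rollout_value(model, depth):
    """Sample one imagined future of `depth` observations, score it,
    then unwind the model so fantasy doesn't contaminate reality."""
    reward, steps = 0.0, 0
    for _ in range(depth):
        bit = 1 if random.random() < model.predict(1) else 0
        reward += bit          # toy reward: observing a 1
        model.update(bit)      # condition on the imagined observation
        steps += 1
    model.revert(steps)        # model state is now back to pre-rollout
    return reward
```

In the real agent the backup stage also propagates the rollout’s value into the search tree, but the save-and-restore discipline on the environment model is the same.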
Prototyping mathematical code in Python with the Scipy/Numpy libraries and then switching to Cython for speed often works well, but there are limitations. The main problem that has been bugging me recently is the speed of matrix function calls. What happens is that your Cython code needs to compute, say, the outer product of two vectors, and so makes a call to Numpy. At this point everything switches to very slow interpreted Python code which does a few checks etc. before calling into the underlying fast BLAS library that does the actual work. For large matrices the cost of this wrapping code isn’t a big deal, but for small matrices it can be a huge performance hit.
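As a rough illustration of the effect (exact numbers will vary by machine, and this is plain Python calling NumPy rather than Cython):

```python
import timeit
import numpy as np

small_x, small_y = np.random.rand(4), np.random.rand(4)
big_x, big_y = np.random.rand(1000), np.random.rand(1000)

n = 2000
t_small = timeit.timeit(lambda: np.outer(small_x, small_y), number=n)
t_big = timeit.timeit(lambda: np.outer(big_x, big_y), number=n)

# A 4x4 outer product involves ~62500x less arithmetic than a
# 1000x1000 one, yet the small call is nowhere near 62500x faster:
# fixed per-call overhead (argument checking, dispatch) dominates it.
print(t_big / t_small)
```

For the small vectors almost all of the time is the Python-level wrapping, which is exactly the cost you want to avoid when the call is coming from a tight Cython loop.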
To solve this problem I’ve created a Cython wrapper for many of the more common BLAS functions. It’s called Tokyo: I often name code after cities and both Tokyo and BLAS/LAPACK were big, fast and very foreign to start with! At the moment Tokyo only wraps the BLAS routines for vectors and general matrices with single and double precision. If you want to add other things such as banded matrices, complex numbers or LAPACK calls: just look at what I’ve already done and add the functions you need. I’ve also added a few extra functions that I find useful when doing matrix calculations. The idea is that Tokyo will eventually encompass all of BLAS and LAPACK.
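Tokyo’s own API isn’t reproduced here, but the same idea — thin, typed wrappers that go straight to BLAS with minimal Python in the way — can be seen in SciPy’s low-level BLAS interface:

```python
import numpy as np
from scipy.linalg.blas import dger, dgemm

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0])

# BLAS dger computes alpha * outer(x, y) directly
outer = dger(1.0, x, y)

a = np.random.rand(3, 4)
b = np.random.rand(4, 2)
# BLAS dgemm computes alpha * (a @ b) with very little wrapping
prod = dgemm(1.0, a, b)
```

From Cython the win is larger still, since the call can be made without bouncing through interpreted code at all.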
From time to time people contact me wanting to know what I think about whether they should donate money to SIAI. My usual answer is something like, “I am not involved with what happens inside the organisation so I don’t have any inside knowledge, just what I, and presumably you, have read online. Based on this my feeling is that, in absolute terms, nobody seems to know how to deal with these issues. However, in relative terms, SIAI currently appears to be the best hope that we have.” In response to such a question the other day I ended up elaborating further about some of my thoughts on how safe AGI might be funded and the role that SIAI, or similar, might best play. The remainder of this post is an edited version of that email.
My guess is that it will play out like this: SIAI’s contribution will be to raise the level of awareness of the dangers of powerful AGI over the next decade or two. As AGI progresses their message will be taken more seriously. Then at some point powerful teams will start to race towards building the first real AGI. The degrees to which these groups will have been influenced by SIAI thinking will vary. Due to greed, wishful thinking, ignorance and what have you, in general safety will come second to progress. A short period of time later the post-human period will begin. Where that goes will depend to some extent on fundamental properties of highly intelligent systems, and to some extent on these systems’ specific initial conditions. Given our limited understanding, this currently feels like a roll of the dice to me.
I recently completed a finance paper on the implications of prospect theory for portfolio choice and asset pricing. I worked on this with Prof. Enrico De Giorgi during my post doc at the Swiss Finance Institute. This post is meant as an introduction to this work; the full paper can be downloaded here.
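The paper’s exact specification isn’t reproduced in this post, but for readers new to prospect theory, the standard Tversky–Kahneman value function that such models build on looks like this (the parameter values below are their 1992 estimates, not necessarily those used in the paper):

```python
def pt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Tversky-Kahneman (1992) prospect theory value function.

    Outcomes are valued as gains or losses relative to a reference
    point (x = 0), with diminishing sensitivity (alpha, beta < 1)
    and loss aversion (lam > 1: losses loom larger than gains)."""
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** beta
```

Loss aversion is visible directly: with these parameters, a loss of 100 is weighted about 2.25 times as heavily as a gain of 100, and it is this asymmetry that drives the portfolio choice and asset pricing implications.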
I just read this article on the scale of time by Mike Treder. Partway through, it poses an interesting question: What would surprise a person from the year 2000 most about the year 2010? As I don’t know what will happen in the next year, I prefer the 1999 vs. 2009 question: If I got on the phone with 1999 me, what would be the most surprising news?
Let’s start with what was going on in 1999: I had my first cell phone. Black and white LCD screen. No text messaging. I started working for Intelligenesis (later called Webmind, founded by Ben Goertzel). The machines we had were 500 MHz and had 256 MB of RAM. I discovered Google. Internet at home at 56k, but something like 256k at work. I was using Linux and was well aware of open source software. Quake was popular. Computers had CD drives, but DVD drives were starting to come out. Nobody had LCD monitors except on laptops. Dot.com boom was going crazy. The Matrix was a big hit.
Ok, so what would be the biggest surprise for 1999 me? I think the single biggest surprise would be that a black man had been elected president of the United States. I thought it would be at least another generation or two before this would be possible. The next most surprising thing would have been Wikipedia. Though given that Linux development was working well at the time, I guess with the right control structures in place it shouldn’t have been all that surprising. Still, it continues to amaze me just how good it in fact is.
Many other things seem to have been fairly predictable: internet got faster, bigger, computer specs all went up, people started watching video on the internet, voice and video chatting over the internet, more mobile internet… Would any of these things have surprised me in 1999? I don’t think so. Even the recent rise of social networking: I couldn’t have predicted what that would have looked like, but it’s not all that surprising. Same for internet banking. A lot of what seems to have been going on over the last 10 years is just the maturation of the internet and mobile devices.
What are the most surprising things for you over the last 10 years?
EDIT: Add to my list: free email service with almost 10 GB of storage (gmail), and Google street view.
It’s been a while since my journal paper on universal intelligence came out, and even longer since Hutter published the intelligence order relation on which it was based. Since then there have been a number of reactions; here I will make some comments in response.
One point of contention concerns whether efficiency should be part of the concept of intelligence. Hutter and I have taken the position that it should not, and I continue to think that this is the right way to go. As what we are debating is a definition, it’s hard to claim that one of these two possibilities is in some absolute sense “correct”. All we can argue is that one is more in line with what is typically meant when the word is used. Looking over the many definitions of intelligence that we have collected, in the vast majority the internal computational cost of the agent is not taken into account. Thus, among professional definitions the pattern is clear.
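For reference, the measure under discussion (stated here from memory of the published definition) weights an agent’s performance by the simplicity of each environment:

```latex
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
```

Here E is the space of computable environments, K(μ) the Kolmogorov complexity of environment μ, and V^π_μ the expected total reward of agent π in μ. Nothing in the sum penalises the agent’s own computational cost, which is precisely the point of contention.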
What about naive usage of the concept then? I think it’s the same. Imagine that you discovered that some friend of yours, who seemed completely normal, actually had only half a brain. Due to his smaller brain making more efficient use of its resources it wasn’t obvious from the outside that anything strange was going on, until a brain scan revealed this. Would you now say that your friend was twice as intelligent as you had previously thought?

Consider a more futuristic hypothetical. It may well be the case that intelligence (in my sense) scales sub-linearly with respect to computational resources; indeed, many learning, modelling and prediction algorithms already do. This raises the possibility that after a singularity the world could be run by a computationally vast and phenomenally smart machine which, in an efficiency sense, has significantly sub-human “intelligence”.
The way in which technological change occurs in industries has always interested me. One quite well known book on this subject is “The Innovator’s Dilemma” by Clayton M. Christensen. Here’s a nice post on a friend’s blog that summarises the essential ideas. The book contains many fascinating examples of disruptive changes and is certainly worth a read.