vetta project

I’m speaking at Extrobritannia

On Saturday the 31st of October, I’m going to be the speaker at Extrobritannia here in London. I went along to their last meeting and it was totally packed out, nearly a hundred people I believe. Having both Dr. Aubrey de Grey and Dr. Anders Sandberg speaking explains why!

I’ll be covering topics from my PhD thesis, such as the definition of intelligence, Solomonoff’s model of induction, Hutter’s AIXI and more recent work such as the Monte Carlo approximation of AIXI by Veness et al. I’ll also include a few thoughts on how recent discoveries in theoretical neuroscience might help guide work towards AGI.

You can find the event on Facebook here, and the announcement of the event on the Extrobritannia list here.

Dressing as a witch, wizard, ghost, etc. is optional.

Post-singularity summit

With the summit still fresh in my mind I thought I’d put a bit of a summary together — or perhaps more a collection of random thoughts and observations. For a less personal overview, read the Reason magazine article.

What I will remember most clearly about this summit was Peter Thiel. Firstly, the pre-summit party at his penthouse apartment. That was a treat: a tiny peek into the world of the ultra-rich. His mix of intelligence, focus and energy was quite something to behold and he left a real impression on me. His talk was also among the most engaging in my opinion. No slides, no fluffy stuff, just a straight delivery of ideas and analysis seemingly off the cuff with no notes. In his talk and comments afterwards, the main thing that stuck in my mind was his concern that the singularity wouldn’t arrive quickly enough. Really?

US visa waiver scam

I got scammed online. I guess it was just a matter of time, but I’d thought that I was smart enough to avoid such things. It’s a pretty slick scam. Here’s how it works:

To visit the US from many countries one must now apply online to something called ESTA in order to obtain a so-called “visa waiver”. We’ve been doing this for many years on the plane, but recently it’s gone online and now you must do it online before you travel. Knowing this, I googled for US visa waiver and up came a site for applying for US ESTA visa waivers online. I went through the usual process and at the end had to pay a processing fee. A few hours later I went to the site to see if I had been processed. Then I noticed a typo in the word “New Zewland”, weird. Then I saw a grammatical mistake in their FAQ, a simple mistake, but a mistake nonetheless. Really strange. Uh oh… was this registration site for real?

So I went back to Google and searched again. The EIGHTH link that Google returns when searching for “US visa waiver” is in fact the real US government site that you want. The service is free and I was approved in a few seconds. There is even a warning about the scam sites there: of course if you’re reading their warning you must already be on the right site! Anyway, there is now some shady group with money from me, all my credit card details and even my passport details. Bugger. At least I realised my mistake and made a real application and was accepted. It would have been much worse if it had caused me to miss gaining entry permission to the US and messed up my travels.

Monte Carlo AIXI

While I was visiting Marcus Hutter at ANU a month or so ago, I got talking to one of his students, Joel Veness, who’s working on making computable approximations to AIXI. Joel has a background in writing Go algorithms so is perhaps perfect for the job. I saw recently that the Monte Carlo AIXI paper describing this work is now available online if you want to check it out.

The basic idea goes as follows. In full AIXI you have an extended Solomonoff predictor to model the environment, and an expecti-max tree to compute the optimal action. In order to scale AIXI down and still have something of roughly the same form, you need to find a tractable way to replace both of these. Here’s what they did: in the place of extended Solomonoff induction a version of context tree weighting (CTW) is used. CTW has to be extended for this application, similar to the way Hutter had to extend Solomonoff induction to active environments for AIXI. In the place of the expecti-max tree search a Monte Carlo tree search is used, similar to that used in Go playing programs: initial selection within the tree, tree expansion, a so-called play-out policy, followed by a backup stage to propagate the new information back into the model. You have to be a bit careful here because as the agent imagines different future observations and actions it has to update its hypothetical beliefs to reflect these in order for its analysis and decision making to be consistent. Then, once this possible future has been evaluated, the effect of this on the agent’s model of the world has to be unwound so that the agent doesn’t, in effect, start confusing its fantasies with its present reality.
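To make the update-then-unwind discipline concrete, here is a minimal sketch of my own, not code from the paper. It assumes a single Krichevsky-Trofimov (KT) estimator over bits as a stand-in for the full CTW model (CTW mixes KT estimators at the nodes of a context tree), and the names `KTEstimator` and `imagine_rollout` are hypothetical. The point is only that every imagined bit updates the model during the play-out and is reverted afterwards.

```python
import random


class KTEstimator:
    """Krichevsky-Trofimov estimator for a binary sequence, with undo support."""

    def __init__(self):
        self.counts = [0, 0]  # number of 0s and 1s seen so far

    def prob(self, bit):
        # KT rule: (count of this bit + 1/2) / (total count + 1)
        return (self.counts[bit] + 0.5) / (sum(self.counts) + 1.0)

    def update(self, bit):
        self.counts[bit] += 1

    def revert(self, bit):
        self.counts[bit] -= 1


def imagine_rollout(model, horizon, rng):
    """Sample an imagined future, conditioning the model on each sampled bit,
    then unwind every hypothetical update before returning."""
    imagined = []
    for _ in range(horizon):
        bit = 1 if rng.random() < model.prob(1) else 0
        model.update(bit)        # hypothetical belief update
        imagined.append(bit)
    for bit in reversed(imagined):
        model.revert(bit)        # restore the pre-rollout beliefs
    return imagined


model = KTEstimator()
for bit in [1, 1, 0, 1]:         # real experience
    model.update(bit)

before = list(model.counts)
future = imagine_rollout(model, horizon=10, rng=random.Random(0))
after = list(model.counts)       # identical to `before`
```

Because the counts are the same before and after the rollout, the agent’s beliefs about its real history are untouched by whatever it imagined.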

Tokyo: A Cython BLAS wrapper for fast matrix math

Prototyping mathematical code in Python with the Scipy/Numpy libraries and then switching to Cython for speed often works well, but there are limitations. The main problem that has been bugging me recently is the speed of matrix function calls. What happens is that your Cython code needs to compute, say, the outer product of two vectors, and so makes a call to Numpy. At this point everything switches to very slow interpreted Python code which does a few checks etc. before calling into the underlying fast BLAS library that does the actual work. For large matrices the cost of this wrapping code isn’t a big deal, but for small matrices it can be a huge performance hit.

To solve this problem I’ve created a Cython wrapper for many of the more common BLAS functions. It’s called Tokyo: I often name code after cities and both Tokyo and BLAS/LAPACK were big, fast and very foreign to start with! At the moment Tokyo only wraps the BLAS routines for vectors and general matrices with single and double precision. If you want to add other things such as banded matrices, complex numbers or LAPACK calls: just look at what I’ve already done and add the functions you need. I’ve also added a few extra functions that I find useful when doing matrix calculations. The idea is that Tokyo will eventually encompass all of BLAS and LAPACK.
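For reference, the outer product mentioned above corresponds to the BLAS level-2 routine `ger`, which performs the rank-1 update A := alpha * x * yᵀ + A. The following plain-Python sketch only spells out what that operation computes. It is not Tokyo’s API, and the function name and list-of-lists layout are my own assumptions for illustration.

```python
def ger(alpha, x, y, a):
    """In-place rank-1 update a := alpha * x * y^T + a on a list-of-lists matrix."""
    if len(a) != len(x) or any(len(row) != len(y) for row in a):
        raise ValueError("a must have shape len(x) by len(y)")
    for i, xi in enumerate(x):
        row = a[i]
        for j, yj in enumerate(y):
            row[j] += alpha * xi * yj
    return a


x = [1.0, 2.0]
y = [3.0, 4.0, 5.0]
# Starting from a zero matrix with alpha = 1 gives the plain outer product.
outer = ger(1.0, x, y, [[0.0] * len(y) for _ in x])
```

For vectors this small the arithmetic is only six multiply-adds, which is why any fixed per-call overhead in a wrapper quickly dominates the running time.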

Creating deliberately evil AGI

It was just a matter of time before somebody started working on something like this.  Amusement aside, I’m impressed that Prof. Bringsjord managed to make a magazine as serious as Scientific American with this topic. In order to make a “classically evil” AGI, as opposed to a merely “indifferently evil” AGI, I guess you would face some similar issues to the creation of ethical AGI — formalising the concept of maximal evilness is probably pretty hard.

Funding safe AGI

From time to time people contact me wanting to know what I think about whether they should donate money to SIAI.  My usual answer is something like, “I am not involved with what happens inside the organisation so I don’t have any inside knowledge, just what I, and presumably you, have read online.  Based on this my feeling is that, in absolute terms, nobody seems to know how to deal with these issues.  However, in relative terms, SIAI currently appears to be the best hope that we have.”  In response to such a question the other day I ended up elaborating further about some of my thoughts on how safe AGI might be funded and the role that SIAI, or similar, might best play.  The remainder of this post is an edited version of that email.

My guess is that it will play out like this: SIAI’s contribution will be to raise the level of awareness of the dangers of powerful AGI over the next decade or two.  As AGI progresses their message will be taken more seriously. Then at some point powerful teams will start to race towards building the first real AGI.  The degrees to which these groups will have been influenced by SIAI thinking will vary.  Due to greed, wishful thinking, ignorance and what have you, in general safety will come second to progress.  A short period of time later the post-human period will begin.  Where that goes will depend to some extent on fundamental properties of highly intelligent systems, and to some extent on these systems’ specific initial conditions.  Given our limited understanding, this currently feels like a roll of the dice to me.

Reinforcement learning in the brain

Model-free reinforcement learning (RL) algorithms are computationally cheap as each state-action pair keeps a cached estimate of its value that can easily be looked up in order to make a decision. Their weakness is that they are not easy to update when the agent’s goals, or the state of the world, changes in some critical way. Model-based RL, on the other hand, is better in this respect as it can use reasoning or search on a model in order to find paths leading to the fulfilment of the agent’s current goals. The downside, of course, is much greater computational cost.
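The trade-off above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: a five-state corridor, a table of cached values filled in directly rather than learned, and a breadth-first planner standing in for model-based search. When the goal moves, the cached lookup still points at the old goal, while the planner adapts at the cost of a search.

```python
from collections import deque

N_STATES = 5
ACTIONS = (-1, +1)
GAMMA = 0.9
OLD_GOAL, NEW_GOAL = 4, 0


def step(state, action):
    """Deterministic corridor model: move left or right, clipped at the walls."""
    return min(max(state + action, 0), N_STATES - 1)


# Model-free side: cached values for each state-action pair, computed for OLD_GOAL.
q_cache = {
    (s, a): GAMMA ** abs(OLD_GOAL - step(s, a))
    for s in range(N_STATES)
    for a in ACTIONS
}


def cached_action(state):
    """Cheap decision: look up the cached values and take the best action."""
    return max(ACTIONS, key=lambda a: q_cache[(state, a)])


def plan(state, goal):
    """Model-based decision: breadth-first search over the model, returning
    the first action of a shortest path (None if already at the goal)."""
    if state == goal:
        return None
    frontier = deque((step(state, a), a) for a in ACTIONS)
    seen = {state}
    while frontier:
        s, first_action = frontier.popleft()
        if s == goal:
            return first_action
        if s in seen:
            continue
        seen.add(s)
        for a in ACTIONS:
            frontier.append((step(s, a), first_action))
    return None


state = 2
stale = cached_action(state)        # still heads right, towards the old goal
replanned = plan(state, NEW_GOAL)   # heads left, towards the new goal
```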

So what does the brain do? For over a decade it has been known that temporal difference learning, a type of model-free RL algorithm, appears to explain the activity of dopamine neurons and their dorsolateral striatal projections. It has also been observed that parts of the prefrontal cortex appear to implement some kind of model-based RL algorithm. Mammalian brains, then, appear to get the best of both worlds by having model-free and model-based RL algorithms and then choosing which to use on the fly.  Pretty clever, huh?
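For readers who haven’t met temporal difference learning, here is a minimal TD(0) sketch on a toy four-state chain with a reward of 1 at the end. The chain, the constants and the function name are my own assumptions. The quantity `delta` is the reward prediction error, the signal that dopamine activity is thought to resemble: it is large when a reward is unexpected and shrinks towards zero as the values learn to predict it.

```python
GAMMA = 0.9   # discount factor
ALPHA = 0.1   # learning rate
N = 4         # states 0..3; the episode ends after leaving state 3


def run_episode(values):
    """One pass along the chain; returns the TD error seen at each step."""
    errors = []
    for s in range(N):
        terminal = (s == N - 1)
        reward = 1.0 if terminal else 0.0
        next_value = 0.0 if terminal else values[s + 1]
        delta = reward + GAMMA * next_value - values[s]   # prediction error
        values[s] += ALPHA * delta                        # TD(0) update
        errors.append(delta)
    return errors


values = [0.0] * N
first_errors = run_episode(values)      # the reward at the end is a surprise
for _ in range(2000):
    last_errors = run_episode(values)   # by now the reward is fully predicted
```

After training, the value of each state approaches the discounted reward it predicts (0.9 raised to the number of steps remaining after the current one), and the prediction errors are essentially zero.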

The unreasonable effectiveness of data

We recently had a visitor to the Gatsby Unit talk about his work in reinforcement learning, in particular the use of planning and forward models to speed up the learning of difficult tasks.  The substance of his talk was good, but that’s not what I want to talk about: it was the motivation he gave in his introduction that bothered me.  Basically he said that humans learn much faster than reinforcement learning algorithms, and thus we should try to figure out how to make our algorithms learn faster.

Really? It takes babies half a year or more to learn to control their limbs in fairly basic ways. How many reinforcement learning algorithms get run for six months in a single learning trial?  As adults, if we try to learn some new control task, such as balancing a pole, it can take hours of effort despite having years of prior motor control experience. A reinforcement learning algorithm, on the other hand, can learn to solve some of these problems in seconds with no prior experience at all.  In a few minutes algorithms can even learn the much more difficult double pole balancing problem. This is a problem that would take me months to master, if indeed I could ever get the hang of it.  If we think about problems that humans can learn to solve quite quickly, but that machines have not yet mastered, there is usually a massive amount of prior knowledge that people are using, knowledge that may have taken years to acquire.

Prospect theory investors

I recently completed a finance paper on the implications of prospect theory for portfolio choice and asset pricing. I worked on this with Prof. Enrico De Giorgi during my postdoc at the Swiss Finance Institute. This post is meant as an introduction to this work; the full paper can be downloaded here.