It’s been a while since my journal paper on universal intelligence came out, and even longer since Hutter published the intelligence order relation that it was based on. Since then there have been a number of reactions; here I will make some comments in response.
One point of contention concerns whether efficiency should be part of the concept of intelligence. Hutter and I have taken the position that it should not, and I continue to think that this is the right way to go. As what we are debating is a definition, it’s hard to claim that one of these two possibilities is in some absolute sense “correct”. All we can argue is that one is more in line with what is typically meant when the word is used. Looking over the many definitions of intelligence that we have collected, in the vast majority the internal computational cost of the agent is not taken into account. Thus, among professional definitions the pattern is clear.
What about naive usage of the concept then? I think it’s the same. Imagine that you discovered that some friend of yours, who seemed completely normal, actually had only half a brain. Due to his smaller brain making more efficient use of its resources, it wasn’t obvious from the outside that anything strange was going on, until a brain scan revealed this. Would you now say that your friend was twice as intelligent as you had previously thought? Consider a more futuristic hypothetical. It may well be the case that intelligence (in my sense) scales in a sub-linear way with respect to computational resources. Indeed, many learning, modelling and prediction algorithms scale this way. This raises the possibility that after a singularity the world could be run by a computationally vast and phenomenally smart machine which, in an efficiency sense, has significantly sub-human “intelligence”.
Why then do some people feel the need to define intelligence with respect to computational efficiency rather than purely in terms of decision making performance? The reason, I believe, is that at some level they recognise that if the definition of intelligence does not take efficiency into account, then intelligence is not the right metric for their research. And they are right! An intelligent machine will consist of some impressive hardware combined with clever algorithms that can efficiently turn the computational power of that hardware into intelligence. The job of the hardware people is to come up with more and more powerful hardware, and they are clearly doing a wonderful job of this. The job of the AI people is the second part: to come up with the most efficient way to convert computation into intelligence. If you want to build a metric for your AGI algorithm research, a measure of the efficiency of intelligence is what you really need — let the hardware people take care of the other side of things. If we both do our jobs well, the end result will be a lot of machine intelligence.
Another point that often comes up concerns whether universal intelligence is in fact too broad. For practical AGI researchers, the answer is probably yes. More specifically, if you want to produce a system with a somewhat human like intelligence, and that is optimised for the universe we live in, rather than very general semi-computable probabilistic environments, then yes, you will want a more focused kind of “intelligence” than universal intelligence. Still very broad, sure, but not quite as extreme a target. Why then didn’t we try to do this? Answer: one step at a time! Constructing a practical general intelligence measure for AGI is not easy, and almost certainly too big a job for a PhD research project. Thus my goal with universal intelligence was to try to capture the concept in the cleanest, most formal, and most general sense possible in the hope that this might provide some theoretical foundation for later work on practical tests of AGI intelligence. If that’s your goal, then go for it, and I hope that my theoretical work is of some use to you.
A related point to the one above concerns the sensitivity of the universal intelligence measure to the choice of the reference machine. In some situations, for example with Solomonoff induction, the choice of the reference machine doesn’t matter too much. With universal intelligence it doesn’t work out as well. The invariance theorem for Kolmogorov complexity provides some protection, but it’s not enough. The usual trick in Kolmogorov complexity is to then minimise the state-symbol complexity of the reference machine. As there exist very simple UTMs, and there aren’t many of them, that succeeds in locking things down fairly tightly. When you read criticisms of Kolmogorov complexity based work that show strange results by varying the reference machine, have a look to see if they limit themselves to minimal reference machines. Almost always they completely ignore this constraint, because with it their criticisms would no longer work, or at least be much weaker. I sometimes wonder why this happens and I suspect that part of the reason might be the way in which Kolmogorov complexity is taught. What we should do is to always start by squeezing as much complexity out of the reference machine as possible in order to ensure that the measured complexity of an object is, to the greatest degree possible, a property of that object and not of our reference machine. Only then should we mention that there is this invariance theorem that is often useful to prove things. And then point out that for various asymptotic results, such as the randomness of infinite sequences, the reference machine is completely irrelevant.
Anyway, in the case of universal intelligence, the reference machine in effect defines how we weight the agent’s performance in different environments when trying to compute an overall score. As such, if you want to weight the environments in a way that somehow reflects the universe we live in, you might prefer to select a reference machine that is not an absolutely minimal one. This makes some sense, and it certainly seems that some UTMs are somehow more “natural” than others. Various theory people have tried to go down this path over the years, and so far not much has come of it, at least as far as I’m aware. A word of warning then: if you want to solve this problem in a theoretically tidy way, be careful, this appears to be a problem that initially seems easier than it really is. That said, good luck, for if you do succeed such a result could be extremely useful. Failing that, one clever way to further reduce the test’s sensitivity to the choice of minimal reference machine was suggested to me by Peter Dayan. The idea is to allow the agent to maintain state between different test environments. This would mitigate any bias introduced, as intelligent agents would then be able to adapt to the test’s reference machine as different environments were randomly sampled. In other words, the agent can learn any bias that the reference machine choice introduces to the distribution over environments and then compensate for this.
Another thing that sometimes comes up is people worrying about the fact that the environment defines the reward. The objection mostly seems to come from cognitive science people, rather than math people, and so I think it’s a cultural problem. When we say “environment”, and we stick the reward generation mechanism in there, we aren’t claiming that real environments are what define an agent’s rewards: it’s just a mathematical convenience that makes it easier to mix over all the different kinds of problems and goals. You could separate them out if you wanted to, as I note in my thesis. It adds a few more terms to the equations, but because we mix over the whole space, in the end it doesn’t make much difference. Also, when we say “agent”, we are using the word in the sense that reinforcement learning people use it (see the introductory parts of the Sutton and Barto book for an explanation of this point). When we say “agent”, in non-RL speak we really mean just the optimisation and decision making part of a real agent. I talk a bit about this in my thesis, but I didn’t have space for these finer points in the Benelearn paper, and our target audience was more RL people back then anyway.
One objection that I’ve heard a few times is that a key property of intelligent systems is their ability to choose their own goals. It certainly seems that we have this ability while an agent, as defined in our framework, does not. Ask yourself this: how do you choose your goals? One possibility is that you do it in a deterministic way, that is, somewhere in your brain an algorithm runs that looks at all sorts of information and spits out a decision as to what your goal is going to be. For example, you might read some holy book or philosophical text, process the information therein, and then decide, using this algorithm in your brain, to base your life on following these principles. In this way you have chosen some of your goals. If you think about it, an agent in our framework can do the same thing: its goal might be to read in some information from its environment and then take this to be a function which it then tries to optimise. Both you and the agent have an underlying goal that generates and selects new sub-goals, perhaps with input from the environment, and then follows these. When you choose one goal over another, it is this choosing mechanism that is your true underlying goal. Adding randomness doesn’t make any fundamental difference to this. Even if your goal is to think up a random goal and then follow it, one may characterise your underlying goal to be just that: to generate and then follow a randomly generated goal. Admittedly, your real underlying goal is almost certainly very complex and messy; even if we could fully observe your brain, it might well be next to impossible to extract a succinct description of your underlying goal. But that’s not the point: the important point here is that the framework we define is not as limiting as it might first appear to be.
Related to the above are occasional criticisms that our approach to intelligence is wrong because it takes a behavioural stance, and behaviourism has been thoroughly debunked. These people are really missing the point. Our goal here is not to explain how human intelligence works. Or how any other intelligent system works, for that matter. We take this outside view of things because what we want is a measure that applies across many different kinds of systems with potentially radically different internals. It’s not because we’re closet behaviourists. Indeed, I work at a theoretical neuroscience institute because I think this will give me pointers on how to design an AGI. That makes me the inverse of a behaviourist and their incomprehensible-black-box view of the brain! Put it this way: if I want to measure how fast your car is, I don’t really care how it works. But if I want to understand why your car is so fast, then I’ll pop the hood.
The final thing I’d like to respond to is the objection that a definition of intelligence should be computable. I wrote a response to this based on the definition of randomness in my thesis. See the bottom of page 77 through to the end of the section on the next page. In short: any definition of randomness that isn’t incomputable would be provably flawed. Sometimes then it is best to define a concept in an ideal and incomputable way, and accept that our ability to measure it in practice is always going to be limited.
As you might expect, I wildly disagree with almost everything in this post.
Another thing that sometimes comes up is people worrying about the fact that the environment defines the reward.
This does not worry me in the least since I think intelligence is definable only for both an environment and an agent, and that there is no such thing as “universal intelligence”.
Looking for universal things is just a “spiritual” deformity of the western mind.
But I don’t want to spoil your quest for the Holy Grail, feel free to advance your case for your own enjoyment, yet don’t forget to PROVE that your “intelligence metric” has a unique supremum (and that it makes sense in that a supremely intelligent being won’t be screwed by a clever rat…)
P.S. I did read your thesis on March 31
If you’ve read carefully, you should have seen that for a given reference machine AIXI is the supremum by construction, and furthermore, that AIXI upper bounds the performance of any computable agent in any environment where this is possible for a general agent. So yeah, that rat’s toast.
Except maybe if, to counter the rat’s strategy, you stumble on an incomputable subproblem (or one that is just intractable with the resources at hand)
Anyway, I think I see your point(s) but I am not sure you see mine (re Langford’s blog)
My objections aren’t mainly about theoretical limitations but about the most productive route to feasibility of a “decent” AI (human level or somewhat above).
Though, about theoretical limitations, I remember having seen a paper by Daniel Osherson where he proved that NO definite learning strategy can succeed in learning a function from a history of input/output pairs of values (for every possible function).
I dunno if you or him are wrong or if your works belong to entirely different theoretical frameworks and I cannot currently pin down that paper.
But that’s beside the point as far as I am concerned.
I think you’re confusing the what question with the how question. In my opinion, universal intelligence and AIXI are trying to answer the first question rather than the second one. These are significantly different questions that will have significantly different answers. Nevertheless, they are related, and answering the first one is often a key step towards answering the second one.
Let me use an analogy inspired by a well known compatriot of mine, Sir Ed. Let’s say that you want to climb the highest mountain in the world. You’re going to need warm clothes, climbing boots, and to be very fit. That’s clear. However, you’re also going to need to know which mountain to climb. The world is a big place and there are a lot of high mountains, many of which are quite hard to get to. Working out which is the tallest is a non-trivial problem that requires some clever use of some fairly precise technology. If you don’t answer this question first, you’re going to end up climbing all sorts of mountains only to later discover that it’s not really where you wanted to go. This is the problem that universal intelligence is trying to answer.
Having done this, the next problem is to actually climb the mountain. Knowing which mountain to climb and knowing what this particular mountain looks like will certainly play a key role in your approach to climbing it. That said, actually climbing it involves a hell of a lot more than just knowing what to climb! Boots, oxygen, climbing techniques etc. are all of critical importance, as will be experience gained from climbing other mountains. Knowing where to climb is not particularly enlightening about many of these things. It’s a related, but also a very different kind of a problem to try to solve.
I don’t buy your metaphor (of course…)
I think you’re confusing the what question with the how question.
No, I say that we are unable to answer the “what” before we have a better understanding of the “how” that we can actually look at in our own practice.
And I don’t mean looking at the neurons or introspection but investigating the semantics of the one tool which we use to report about our “intelligent findings”: language .
Language comes BEFORE mathematics, without language there would be NO mathematics.
Here is a hint at what the real problems are (though I’m far from agreeing with everything this guy writes).
As I am not an absolutist like the Singularitarian herd, to me the litmus test of decisive progress in AI would be the ability to turn the informal sketch of a mathematical proof (good enough for a skilled mathematician knowledgeable in the field) into a fully formalized proof.
That might not be the be-all and end-all of AGI but it will be significant enough and not just “promising”.
Pretending that “the AI problem” is the sequence prediction problem is just good old scholastics, analytical philosophy, theology…
You only pulled that out of your arse!
Well, it appears that we have near maximal disagreement about almost everything.
I made some critical comments about AIXI recently in a video/essay. Ones you don’t seem to have discussed include:
* The wirehead problem;
* The problem of the realism of “mutually inaccessible work tapes”;
Thanks for your comments.
A small detail, Marcus is German and so “Hutter” is pronounced “hooter”.
Your argument that if AIXI were in the real world it might destroy its own brain: Well, in the mathematical model this is clearly not a problem as the agent is not part of the environment. Basically, it’s as if they live in two different universes connected by a communication channel. Thus, the model as described is not broken.
That said, a real agent is obviously going to have to live in its environment. This opens up the possibility of the agent modifying itself, for example, wire head situations or accidentally destroying itself. These things fall outside the AIXI framework and its strict separation between the agent and the environment. To answer these questions you’d need to come up with a new and more flexible framework. My guess is that theoretically working in such a framework would be very difficult.
The scalar reward isn’t a problem. If you have consistent preferences just define a scalar function over these and use it as your reward. It’s equivalent. All the rest of the information can go into the non-reward part of the agent’s input.
About the choice of reference machine in Solomonoff induction: the choice of reference machine really doesn’t matter except for very small data sets (which aren’t really the ones we’re interested in here). To see this, have a look at the Solomonoff convergence bound and add a compiler constant to the complexity of the environment. The end result is that the Solomonoff predictor needs to see just a few more bytes of the data sequence before it converges to essentially optimal predictions.
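For reference, the convergence bound being appealed to has roughly this shape in its standard form (as in the Solomonoff/Hutter literature), where $\mu$ is the true computable environment, $M$ is the Solomonoff predictor, and $K(\mu)$ is the Kolmogorov complexity of $\mu$ relative to the chosen reference machine:

$$\sum_{t=1}^{\infty} \mathbb{E}_\mu\!\left[ \big( M(x_t = 1 \mid x_{<t}) - \mu(x_t = 1 \mid x_{<t}) \big)^2 \right] \;\le\; \frac{\ln 2}{2}\, K(\mu).$$

Switching the reference machine changes $K(\mu)$ by at most an additive compiler constant $c$, so the right hand side becomes $\frac{\ln 2}{2}\,(K(\mu)+c)$: a few more bytes of data for the predictor to absorb, as claimed.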
I think the wirehead problem is really logically separate from the issue with the “mutually inaccessible work tapes”. You can see that because it arises even with the model where the agent is not embedded in the universe of the environment. If the reward function is environmentally determined, it can be modified by the agent. If you try and fix this by moving the reward function into the agent, then the agent tends to just mess with its sensors instead.
Re: All the rest of the information can go into the non-reward part of the agent’s input.
What, then, is the point of separating out a reward signal in the first place – if it only carries a tiny fraction of the agent’s actual reward signals? You might as well just ditch the channel carrying the reward signal altogether, since – as you put it – “it’s equivalent”.
I described the choice of reference machine as “not very serious” – since tiny Turing machines certainly seem to capture Occam’s razor quite well, in practice. AIXI made a big selling point of its supposedly-proven optimality, though. ISTM that these rest on a foundation of unproven speculations about the properties of the reference machine.
You seem to down-play the significance of this issue, nonetheless. Try convincingly demonstrating that the issue boils down to “just a few bits” – or even that there is some finite bound on how many bits are involved. What the universe thinks is simple, and what a Turing machine thinks is simple, may yet turn out to be totally different in some cases.
What is the significance of the more serious problems I mentioned? You could probably tell an agent not to bash its own brains in, and cut its sensory-motor cables – and maybe get it to generalise to the case where it adds more brain and senses to itself. The wirehead problem is more serious – but you can still build /reasonably/ smart machines without this becoming an issue.
The way to fix the wirehead problem involves building the utility function into the agent. Then there is the issue of how to represent it – and we seem to be getting away from this type of model then.
If the reward function is environmentally determined, it can be modified by the agent.
No. For example, the environment could always reward the agent on every 10th cycle, completely ignoring the actions that the agent takes.
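A toy sketch of the kind of environment meant here (my own illustration in Python; all names are made up): the reward depends only on the cycle count, so nothing the agent does can change it.

```python
# Toy environment: reward 1 on every 10th cycle, 0 otherwise,
# completely ignoring the agent's actions.
def periodic_reward_env(actions):
    """Yield (observation, reward) pairs for each action taken."""
    for t, _action in enumerate(actions, start=1):
        reward = 1 if t % 10 == 0 else 0
        yield ("obs", reward)

# Whatever the agent's policy, rewards arrive only at t = 10, 20, 30, ...
rewards = [r for _, r in periodic_reward_env(range(30))]
total = sum(rewards)
```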
You might as well just ditch the channel carrying the reward signal altogether
You still need the scalar reward signal in order to have something to optimise. Imagine that there are two things you like: warmth and food. Consider all the different combinations of warmth and food that you could have. If your preferences are consistent (you never prefer A over B over C over A), then order your preferences and construct a scalar function over this domain that preserves this order. Take that as your scalar reward function.
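As a minimal sketch of this construction (my own illustration; the outcome bundles are hypothetical), take a consistent total preference order over warmth/food combinations and assign each bundle its rank:

```python
# Outcome bundles listed from least to most preferred.
# Consistency (no A > B > C > A cycles) means a total order exists.
bundles = [
    ("cold", "no food"),
    ("cold", "food"),
    ("warm", "no food"),
    ("warm", "food"),
]

# Any order-preserving scalar function works; rank is the simplest choice.
utility = {bundle: rank for rank, bundle in enumerate(bundles)}

def reward(bundle):
    """Scalar reward that preserves the preference order."""
    return utility[bundle]

# The scalar respects the original preferences.
assert reward(("warm", "food")) > reward(("cold", "food"))
```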
AIXI made a big selling point of its supposedly-proven optimality
It’s not “supposedly-proven”, it is proven. At least with respect to a certain formal definition of “optimality”.
ISTM that these rest on a foundation of unproven speculations about the properties of the reference machine.
that there is some finite bound on how many bits are involved.
If the bound is not finite, then there is no compiler constant, implying that the physical Church-Turing thesis has been violated. In a universe with spooky Turing-incomputable things happening, our very concept of computation comes unhinged. Short of that situation, however, the bounds are provably finite.
Re: “reward system from the environment”
If you have a sufficiently intelligent agent in the environment corresponding to the real world, it seems unrealistic to think it would not acquire the ability to manipulate its rewards – unless they were controlled by an even smarter agent.
Sure, you can imagine artificial universes where the reward signal is forever inaccessible to the agent – but they seem of rather theoretical interest.
Anyway, do you really want to argue that AIXI is not susceptible to wireheading? Marcus originally seemed to say that it was. I agree. Unless you have a counter-argument, isn’t this just nit-picking?
You don’t need a separate reward *channel* – since a scalar reward signal could be produced by processing your other inputs. That is pretty much how animals work.
Hutter claimed that the AIXI model was “the most intelligent unbiased agent possible”. Yet it is based on Solomonoff induction – and there is pretty flimsy evidence for the superiority of this over other forms of Occam’s razor – and there are cases where the machine used can matter.
Re: the bounds are provably finite
I’m not talking about “spooky Turing incomputable things happening”! I just mean that we don’t know what the bound is. We don’t know what the laws of physics are – and we don’t know what flavour of Occam’s razor the universe prefers – or indeed, even *if* there is a single razor that is in some sense optimal. Maybe there is a tiny machine somewhere that spits out the fine structure constant in binary, to 100 s.f. If so, that might count as being a “simple” sequence – even if tiny Turing machines say otherwise. If you genuinely do not know how big something is, then you can’t place a finite bound on it without risking being wrong. You claimed “just a few more bytes”. I am sceptical – the best machine to use is still an open question.
Your original statement was “If the reward function is environmentally determined, it can be modified by the agent.” In the AIXI model that is not always true, as my example shows.
In your follow up comments, I see that you’re actually talking about environments like the universe we live in. In that case, yes, I can see how the wire head situation might occur according to some definition of wire heading. That said, to be precise about this we would need to define “wire heading” in situations where the agent cannot modify its own cognition. Is taking drugs a form of wire heading? Is making $100 million and then moving into a mansion and having parties every day for the rest of my life wire heading? Where exactly do you draw the line? Until we have a clear definition and some proofs, this all seems a bit “hand wavey” to me.
Anyway, do you really want to argue that AIXI is not susceptible to wireheading? Marcus originally seemed to say that it was.
From what I recall, he gave some informal arguments that it wouldn’t wire head. I don’t find these arguments all that convincing. I’ve debated this with Steve Omohundro and I didn’t find his arguments convincing either. I think this is an important topic that will turn out to be quite subtle when somebody manages some rigorous mathematical analysis.
You don’t need a separate reward *channel* – since a scalar reward signal could be produced by processing your other inputs. That is pretty much how animals work.
You originally claimed that reducing the reward signal to a scalar was too impoverished, but now you’re doing it yourself! If we take this reward computation that you describe, move it over into the environment, and make sure that the agent’s actions can’t change it, then it’s equivalent to what you describe. Of course we then need a scalar reward channel. The reason it was placed in the environment was that this was more convenient when we want to be able to consider many different reward functions.
there is pretty flimsy evidence for the superiority of this
When I see the Solomonoff convergence bound I go “Holy crap, that’s amazing, it converges for any computable sequence almost as fast as if it had been told what was generating the sequence to start with. Wow!” And you call it flimsy evidence.
You seem to want to have a prior that already reflects the true distribution of events in our physical universe based on the most fundamental laws of physics. The problem with this is that figuring these out is itself an induction problem. So what prior are you going to use for that? The point with Solomonoff induction is that if we just assume that the universe is Turing computable, and everything we know from physics currently backs this up, and that the rules are not too complex, then the Solomonoff prior provably works almost as well as knowing the true prior on moderate amounts of data. It seems that I’m a lot more impressed by this than you are. 😉
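For concreteness, the prior in question has this usual form (writing $U$ for the reference machine, $\ell(p)$ for the length of program $p$, and summing over programs whose output begins with the string $x$):

$$M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}$$

so hypotheses with short programs on $U$ receive most of the prior weight, which is exactly the formal Occam’s razor under discussion.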
I agree with most of what’s in this post. I’m especially interested in the task of getting an empirical handle on the distribution of problems in our universe.
Most likely the best way to get an empirical handle on the distribution of problems in our universe is to do the obvious thing: empirically sample problems from our universe! Specifically, the kinds of problems in our “world” that an AGI should be able to learn to deal with.
As I described to you when you visited, I think the right way to do this is to break the problems up into a sequence of levels along the lines of what Valentin Turchin describes in “The Phenomenon of Science”. Mix that up with a performance measure closer to pure universal intelligence with a low complexity bound as I think an AGI should have at least a base level of universal intelligence.
Of course, approaching the AGI problem by drawing on two sources as “old school” as Solomonoff induction and a cybernetics classic… at least demonstrates how retro-cool this idea is 😉
The point with Solomonoff induction is that if we just assume that the universe is Turing computable, and everything we know from physics currently backs this up
You still don’t grok my point the slightest bit.
The “universe” isn’t Turing computable, any encoded MODEL of the universe IS Turing computable.
The difficult part you are oblivious of is coming up with some fair model of the universe BEFORE you start toying with induction (Solomonoff or otherwise).
It’s a case of Platonism blindness, you don’t even realize that you are a Platonist.
Anyway I bet on rats over AIXI, Solomonoff et al. anytime! 😀
Too much buzz killed the link to rattraders.com, so here is a copy.
Re: Is taking drugs a form of wire heading? Is making $100 million and then moving into a mansion and having parties every day for the rest of my life wire heading? Where exactly do you draw the line?
Where messing with your reward signals gets in the way of attaining your goals – usually.
*If* your goal is taken to be the usual one that organisms have – of having many grandchildren – then taking drugs is a form of wireheading, *if* it has a negative effect on that goal – and the parties are also a form of wireheading, *if* they interfere with that goal.
AFAICS, Hutter offered no wirehead solution for the case where humans control the reward signal.
His solution for freely-evolving agents depended on the type of temporal discounting used – and the severity of the wirehead catastrophe. In that case, my concern is not so much that agents will overdose and kill themselves, but rather that they will become addicts who can still succeed in living some kind of life, but fail to realise their potential because of their addiction.
When I see the Solomonoff convergence bound I go “Holy crap, that’s amazing, it converges for any computable sequence almost as fast as if it had been told what was generating the sequence to start with. Wow!” And you call it flimsy evidence.
It *is* flimsy evidence for the superiority of Solomonoff induction. Barely evidence at all, I would say. Evidence of superiority would consist of comparing the serial TMs with competitors – such as, cellular automata, FORTRAN-77 and so on – and demonstrating their superiority empirically.
*If* your goal is taken to be the usual one that organisms have – of having many grandchildren – then taking drugs is a form of wireheading,
Ok, so then fix the reward signal to be the number of grandchildren you have. Wireheading is then no longer possible. In the examples above the problem is that your reward signal doesn’t correctly represent the goal and so at some point they can diverge.
Evidence of superiority would consist of comparing the serial TMs with competitors – such as, cellular automata, FORTRAN-77 and so on – and demonstrating their superiority empirically.
If you want empirical evidence as you don’t trust proofs, then we are at an impasse. If you do believe in math then CAs, F77 etc. all have fairly small compiler constants (you could even compute upper bounds on these constants if you really wanted to) and so using a small UTM will always work essentially as well as these alternatives on anything but small amounts of data.
Re: fix the reward signal to be the number of grandchildren you have. Wireheading is then no longer possible
The issue here is that you have an implementation problem. If the agent thinks what it is maximising is its reward signal, then you can *try* and fix the reward signal to represent the number of grandchildren – but if you are dealing with a superintelligent agent, then it will attempt to change the reward signal so that it says “a quazillion”. At this stage, it is you vs the superintelligent agent – a recipe for problems.
Classically the solution is to ensure the agent believes its own goals to be maximising the number of grandchildren – and not maximising its reward signal. However, that involves engineering the agent’s beliefs about its own goals somehow – not a trivial exercise for an AIXI agent.
If you want empirical evidence as you donâ€™t trust proofs, then we are at an impasse.
This seems like an empirical issue to me. Occam’s razor is a property of the universe – i.e. you can imagine worlds where it is not useful. However, I am quite happy to accept mathematics experiments as empirical evidence. Maths is a property of our universe too.
CAs, UTMs, and FORTRAN can simulate each other with relatively small compilers. That is nice – but it doesn’t help us choose between them. Nor do I think it necessarily implies that the universe uses, as its basis for Occam’s razor, a machine that a UTM can concisely simulate. We do have some evidence for that – namely that these systems behave much as Occam’s razor does – but the issue has not been explored sufficiently for there to be a high level of confidence.
Anyway, I don’t want to rattle on endlessly about the reference machine. You seem to accept there is a problem in this area in this post – and I have described this problem as not being very serious.
The main issue I see is the wirehead problem. After that, the possibility of the agent eating its own brains for lunch.
The issue here is that you have an implementation problem.
I agree that there is an implementation problem for powerful agents in certain types of environments, including, seemingly, the real world. I also agree that this is a serious problem, potentially a very serious problem for humanity. What I don’t understand is why this has all that much to do with AIXI specifically. To me it seems like a problem with powerful optimization processes in general when they operate in environments that allow a high degree of manipulation.
it doesn’t help us choose between them.
So how are you going to choose between them? Empirically? Well, if you’re a Bayesian you’re going to have to start with a prior distribution over the different reference machines, then you’re going to go and take some measurements about the world, then you’re going to see how well each of the various predictions made under each of the reference machines turned out… and if you get enough evidence that F77 is the right reference machine you’re done. You can now use this as your reference machine as it’s a good fit with the real universe. If I’ve understood you correctly, this seems to be what you’re suggesting?
But wait a moment. What prior did you use? And more importantly, what if we didn’t bother with all this and just fed our empirical data into the Solomonoff induction machine with the initial reference machine? Hopefully you can see that the Solomonoff machine internally works out when to switch the effective reference machine to fit the observed universe. It’s already implementing the correct Bayesian procedure to do the above…
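The claim that the Solomonoff machine “already implements the correct Bayesian procedure” can be written as a one-line dominance bound (a standard sketch; here $M_U$ denotes the Solomonoff prior built on reference machine $U$):

```latex
% If V has an interpreter of length c_{V,U} on U, then for all strings x:
M_U(x) \;\ge\; 2^{-c_{V,U}}\, M_V(x) .
% The mixture built on U assigns at most a constant factor less probability
% than the one built on any alternative machine V, so after enough data the
% predictions under U converge to those of the best-fitting reference
% machine -- the "switch" happens inside the mixture itself.
```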
You seem to accept there is a problem in this area in this post
The post was about universal intelligence. For UI there is a more serious problem with reference machines. With Solomonoff induction, however, things work out much better.
Re: wireheading: “What I don’t understand is why this has all that much to do with AIXI specifically. To me it seems like a problem with powerful optimization processes in general when they operate in environments that allow a high degree of manipulation.”
The wirehead problem is thought by many to be a soluble problem. The solution typically involves building the goal into the intelligent agent. Making the agent value its *own* pleasure – and then using the carrot and the stick on it – is an approach with problems, since it allows the agent to form its own representation of its goals – and if the agent comes to believe its purpose in life is finding happiness, it may well turn into a Buddhist – which would mean problems for everyone else.
Since AIXI is a “carrot and stick” approach, it seems especially prone to this problem.
Re: Solomonoff induction
You seem to be asking: why not include physical law among the inputs – rather than building it into the machine?
That seems like an equivalent approach to me – like using the TM to simulate some different reference machine.
The machine should know something about physics by the time it is asked about what is simple, and that is especially important if physics is complicated.
The solution typically involves building the goal into the intelligent agent.
I’ll believe this when somebody formally proves it. This question is too important and subtle for intuitive reasoning!
You seem to be asking: why not include physical law among the inputs – rather than building it into the machine?
You could do that. Or you could just feed your experimental data straight in and let the Solomonoff machine decide when to effectively switch reference machine. Anyway, I hope you can see that trying to empirically work out the right reference machine doesn’t add much. In the case of universal intelligence perhaps the thing to do would be to do something like condition the complexity measure on a bunch of data observed from the real world. Marcus and I have discussed this, but I don’t think the idea appears in any of our published work.
“Important and subtle” seems like a reasonable description of the wirehead problem to me. It would be great to have more evidence relating to the topic. However, I don’t think people should hold off addressing the problem until they have a mathematical formalism that addresses it. Rather we should do the best we can with whatever we have available.
The situation seems to me to be that those advocating using the carrot and the stick on their machines may well eventually face some problems when the machine becomes potent enough to take away both the stick and all the carrots. A proposed solution – advocated by Yudkowsky and Omohundro – is to not use the carrot and the stick approach in the first place. Essentially, I approve of their proposed resolution – though perhaps there are other ones that would work as well. I think that critics should present criticisms next. “Prove it” is a kind of criticism – but I don’t find it terribly compelling.
Re: “In the case of universal intelligence perhaps the thing to do would be to do something like condition the complexity measure on a bunch of data observed from the real world. Marcus and I have discussed this, but I don’t think the idea appears in any of our published work.”
Wiring a reference machine into a machine intelligence in a location where it can’t be updated based on information from its own inputs would seem to be fairly sucky.
Subsequently calling it “the most intelligent unbiased agent possible” seems to be an overblown marketing claim to me.
“wiring”? Nothing is being wired here. And a location where it can’t be updated? This is a mathematical model, it doesn’t have a location. In any case, the fact that the measure is pre-conditioned doesn’t mean that it stops updating itself based on new information.
If you want a system to perform well from the start, it needs to know something about the world it has to interact with. Either you program that in, or you get it to learn this from some data. What I propose is the latter, and indeed it easily fits with the current theoretical framework by simply conditioning the measure.
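One way to write the conditioning idea down, using the universal intelligence measure $\Upsilon$ (the conditioned form here is my own sketch of the unpublished idea, not an established formula):

```latex
% Standard form: environments mu weighted by their complexity K(mu).
\Upsilon(\pi) \;=\; \sum_{\mu} 2^{-K(\mu)}\, V^{\pi}_{\mu}
% Conditioned form: weight instead by K(mu | D), the complexity of mu
% given a corpus D of data observed from the real world. Environments
% resembling our world get shorter codes, which reduces the measure's
% sensitivity to the choice of reference machine.
\Upsilon_D(\pi) \;=\; \sum_{\mu} 2^{-K(\mu \mid D)}\, V^{\pi}_{\mu}
```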
What more could you want? That the system magically knows something of the structure of the world without ever being told this or having an opportunity to learn it?
Probably we will figure out what reference machine is best to use from empirical data, and then subsequently build that reference machine at the foundation level – to avoid the performance hit of the system building an interpreted layer.
It would be nice if the machine had an architecture that allowed it to update itself in this manner – but that is probably not essential.
Enough about the reference machine for now. I do think we will figure out this issue without too much trouble.
Some thoughts on the reference machine:
On our way to wireheading: “a game layer on top of the world.”