<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Friendly AI is bunk</title>
	<atom:link href="http://www.vetta.org/2006/09/friendly-ai-is-bunk/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/</link>
	<description></description>
	<lastBuildDate>Mon, 15 Feb 2010 14:47:15 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Tim Tyler</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-19917</link>
		<dc:creator>Tim Tyler</dc:creator>
		<pubDate>Sat, 11 Jul 2009 20:04:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-19917</guid>
		<description>Re: all other possibilities are critically unstable due to evolutionary pressure.

That comment presumes a scenario involving multiple intelligent entities and a kind of natural selection between them.  That is not the only possible scenario - see:

http://alife.co.uk/essays/one_big_organism/
http://alife.co.uk/essays/self_directed_evolution/</description>
		<content:encoded><![CDATA[<p>Re: all other possibilities are critically unstable due to evolutionary pressure.</p>
<p>That comment presumes a scenario involving multiple intelligent entities and a kind of natural selection between them.  That is not the only possible scenario &#8211; see:</p>
<p><a href="http://alife.co.uk/essays/one_big_organism/" rel="nofollow">http://alife.co.uk/essays/one_big_organism/</a><br />
<a href="http://alife.co.uk/essays/self_directed_evolution/" rel="nofollow">http://alife.co.uk/essays/self_directed_evolution/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Accelerating Future &#187; Writings about Friendly AI</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-19130</link>
		<dc:creator>Accelerating Future &#187; Writings about Friendly AI</dc:creator>
		<pubDate>Wed, 28 Jan 2009 14:36:35 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-19130</guid>
		<description>[...] 2003. Bill Hibbard, “Critique of the SIAI Collective Volition Theory,” 2005. Shane Legg, “Friendly AI is bunk,” 2006. Steve Omohundro, “The Nature of Self-Improving Artificial Intelligence,” Singularity [...]</description>
		<content:encoded><![CDATA[<p>[...] 2003. Bill Hibbard, “Critique of the SIAI Collective Volition Theory,” 2005. Shane Legg, “Friendly AI is bunk,” 2006. Steve Omohundro, “The Nature of Self-Improving Artificial Intelligence,” Singularity [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: The Singularity Institute Blog : Blog Archive : Writings about Friendly AI</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-19128</link>
		<dc:creator>The Singularity Institute Blog : Blog Archive : Writings about Friendly AI</dc:creator>
		<pubDate>Tue, 27 Jan 2009 13:54:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-19128</guid>
		<description>[...] Legg, &#8220;Friendly AI is bunk,&#8221; [...]</description>
		<content:encoded><![CDATA[<p>[...] Legg, &#8220;Friendly AI is bunk,&#8221; [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeffrey Herrlich</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-10659</link>
		<dc:creator>Jeffrey Herrlich</dc:creator>
		<pubDate>Tue, 11 Sep 2007 14:54:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-10659</guid>
		<description>&quot;What concerns me is that the state of the art in the area is, in my opinion, horribly inadequate.&quot;

The theory isn&#039;t complete yet, but I think that decent progress has been made, especially considering their shoe-string budget. I agree that we need to substantially increase our investments in Friendliness research. SIAI is trying.

&quot;I don’t think we can even properly define what a “solution” to the problem would even look like.&quot;

Selecting the actual goals themselves is a different matter, but I&#039;d personally define it as achieving a goal-system stable through recursive self-improvement. The self-improvements would allow for continually greater understanding of the assigned goals, in principle.

&quot;Post singularity all bets are off and most “rules” go out the window.&quot; 

But what if the goal-system is stable, and the AGI lacks any motivations to throw out its &quot;rules&quot;?

&quot;Maybe it is ok to bring about the end of humanity, so long as we move on to something “better”, what ever that means.&quot;

But I don&#039;t think that&#039;s guaranteed by any means. The dynamics of an arbitrary design could easily lead the AGI to ceaselessly pursue some ridiculous and trivial target(s). I don&#039;t want to trade humanity&#039;s future potential in exchange for some optimal paperclips.

&quot;I think these are philosophical problems that are not going to be solved anytime soon, in particular, they are not going to be solved before machine super intelligence.&quot;

But we shouldn&#039;t give up prematurely. Breakthroughs can happen. And we can&#039;t really be certain of when the Singularity will happen. For example, if I had to rough-guess it, I&#039;d say that Friendliness could be solved within 5 years. Also, keep in mind that as the AGIs advance from animal-level to human-level, our understanding of Friendliness is likely to skyrocket with them - it&#039;s tech improving tech.

&quot;Secondly, even if we knew what the goal was, I suspect that there are deep mathematical problems in this area that make many current ideas on Friendly AI practically impossible to achieve.&quot;

SIAI/Yudkowsky reinvents Friendly AI theory fairly regularly, so you probably shouldn&#039;t base your assessment on anything written more than 1 or 2 years ago. And who knows, the critical insight may happen next year. The important thing is to keep trying.</description>
		<content:encoded><![CDATA[<p>&#8220;What concerns me is that the state of the art in the area is, in my opinion, horribly inadequate.&#8221;</p>
<p>The theory isn&#8217;t complete yet, but I think that decent progress has been made, especially considering their shoe-string budget. I agree that we need to substantially increase our investments in Friendliness research. SIAI is trying.</p>
<p>&#8220;I don’t think we can even properly define what a “solution” to the problem would even look like.&#8221;</p>
<p>Selecting the actual goals themselves is a different matter, but I&#8217;d personally define it as achieving a goal-system stable through recursive self-improvement. The self-improvements would allow for continually greater understanding of the assigned goals, in principle.</p>
<p>&#8220;Post singularity all bets are off and most “rules” go out the window.&#8221; </p>
<p>But what if the goal-system is stable, and the AGI lacks any motivations to throw out its &#8220;rules&#8221;?</p>
<p>&#8220;Maybe it is ok to bring about the end of humanity, so long as we move on to something “better”, what ever that means.&#8221;</p>
<p>But I don&#8217;t think that&#8217;s guaranteed by any means. The dynamics of an arbitrary design could easily lead the AGI to ceaselessly pursue some ridiculous and trivial target(s). I don&#8217;t want to trade humanity&#8217;s future potential in exchange for some optimal paperclips.</p>
<p>&#8220;I think these are philosophical problems that are not going to be solved anytime soon, in particular, they are not going to be solved before machine super intelligence.&#8221;</p>
<p>But we shouldn&#8217;t give up prematurely. Breakthroughs can happen. And we can&#8217;t really be certain of when the Singularity will happen. For example, if I had to rough-guess it, I&#8217;d say that Friendliness could be solved within 5 years. Also, keep in mind that as the AGIs advance from animal-level to human-level, our understanding of Friendliness is likely to skyrocket with them &#8211; it&#8217;s tech improving tech. </p>
<p>&#8220;Secondly, even if we knew what the goal was, I suspect that there are deep mathematical problems in this area that make many current ideas on Friendly AI practically impossible to achieve.&#8221;</p>
<p>SIAI/Yudkowsky reinvents Friendly AI theory fairly regularly, so you probably shouldn&#8217;t base your assessment on anything written more than 1 or 2 years ago. And who knows, the critical insight may happen next year. The important thing is to keep trying.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: mathemajician</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-10653</link>
		<dc:creator>mathemajician</dc:creator>
		<pubDate>Tue, 11 Sep 2007 08:37:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-10653</guid>
		<description>I&#039;m certainly not against the idea of Friendly AI.  What concerns me is that the state of the art in the area is, in my opinion, horribly inadequate.  I don&#039;t think we can even properly define what a &quot;solution&quot; to the problem would even look like.  Post singularity all bets are off and most &quot;rules&quot; go out the window.  Maybe it is ok to bring about the end of humanity, so long as we move on to something &quot;better&quot;, what ever that means.  I think these are philosophical problems that are not going to be solved anytime soon, in particular, they are not going to be solved before machine super intelligence.  Secondly, even if we knew what the goal was, I suspect that there are deep mathematical problems in this area that make many current ideas on Friendly AI practically impossible to achieve.</description>
		<content:encoded><![CDATA[<p>I&#8217;m certainly not against the idea of Friendly AI.  What concerns me is that the state of the art in the area is, in my opinion, horribly inadequate.  I don&#8217;t think we can even properly define what a &#8220;solution&#8221; to the problem would even look like.  Post singularity all bets are off and most &#8220;rules&#8221; go out the window.  Maybe it is ok to bring about the end of humanity, so long as we move on to something &#8220;better&#8221;, what ever that means.  I think these are philosophical problems that are not going to be solved anytime soon, in particular, they are not going to be solved before machine super intelligence.  Secondly, even if we knew what the goal was, I suspect that there are deep mathematical problems in this area that make many current ideas on Friendly AI practically impossible to achieve.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeffrey Herrlich</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-10634</link>
		<dc:creator>Jeffrey Herrlich</dc:creator>
		<pubDate>Mon, 10 Sep 2007 18:32:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-10634</guid>
		<description>Also, with the fate of humanity wobbling on the edge of oblivion, as it already is, it&#039;s not like I personally would *demand* that the design be 100% provably Friendly. Given the current situation, I&#039;d probably green-light it at anywhere between about 85% and 100% estimated probability of sustained Friendliness. I know that some people would demand more, but I think that&#039;s a fair, balanced range, given the situation today. Sorry &#039;bout that minor flare-up, but I hope you agree with the viewpoint behind it.</description>
		<content:encoded><![CDATA[<p>Also, with the fate of humanity wobbling on the edge of oblivion, as it already is, it&#8217;s not like I personally would *demand* that the design be 100% provably Friendly. Given the current situation, I&#8217;d probably green-light it at anywhere between about 85% and 100% estimated probability of sustained Friendliness. I know that some people would demand more, but I think that&#8217;s a fair, balanced range, given the situation today. Sorry &#8217;bout that minor flare-up, but I hope you agree with the viewpoint behind it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeffrey Herrlich</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-10630</link>
		<dc:creator>Jeffrey Herrlich</dc:creator>
		<pubDate>Mon, 10 Sep 2007 16:19:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-10630</guid>
		<description>Mathemajician,

I don&#039;t really understand the motivation for this post. Even if Friendliness wasn&#039;t 100% provable, *trying* to make the first AGI Friendly sure as hell beats slapping together a minimal, arbitrary design and having *zero* intuition about whether it will be nice to humanity. If AGI designers intend to flip the switch on a minimal, arbitrary design without even an *attempt* to make it Friendly, then humanity and its future potential are probably screwed. Such an AGI would almost surely be totally indifferent to the well-being of humanity - the motivation-space is too large. Under such conditions wouldn&#039;t humanity most wisely never even *attempt* to make such a minimal, arbitrary AGI? Making a Friendliness attempt is infinitely better than making no attempt. Is publishing the comment: &quot;Friendly AI is Bunk&quot;, really helping the cause?</description>
		<content:encoded><![CDATA[<p>Mathemajician,</p>
<p>I don&#8217;t really understand the motivation for this post. Even if Friendliness wasn&#8217;t 100% provable, *trying* to make the first AGI Friendly sure as hell beats slapping together a minimal, arbitrary design and having *zero* intuition about whether it will be nice to humanity. If AGI designers intend to flip the switch on a minimal, arbitrary design without even an *attempt* to make it Friendly, then humanity and its future potential are probably screwed. Such an AGI would almost surely be totally indifferent to the well-being of humanity &#8211; the motivation-space is too large. Under such conditions wouldn&#8217;t humanity most wisely never even *attempt* to make such a minimal, arbitrary AGI? Making a Friendliness attempt is infinitely better than making no attempt. Is publishing the comment: &#8220;Friendly AI is Bunk&#8221;, really helping the cause?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Excerpts from Farlops Industries : Utopia?</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-2619</link>
		<dc:creator>Excerpts from Farlops Industries : Utopia?</dc:creator>
		<pubDate>Sun, 25 Mar 2007 22:02:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-2619</guid>
		<description>[...] I believe some pretty sketchy things: I believe that the premise of strong AI is sound. I believe that artificial life is already here. I believe the premise of molecular nanotechnology is sound. I believe the premise of cryonics. I believe the premise of brain taping. Given that I believe all that (I blame years of science fiction and an abortive career in physics.) more reasonable stuff like space elevators and the cure for aging are pretty tame. But what I never understood about these subjects is how they drive some people to get all, well, starry eyed and religious about them. There is always something about the future that gets people all dreamy. They assume somehow paradise will emerge and everything will get all cleaned up and solved. Then the handwaving starts: Post-humans will be perpetually happy, all forms of suffering will end. God can be engineered and it&#039;ll love us. Nanotechnology and superautomation will usher in a post-scarcity world but, I guess some of us didn&#039;t get that memo. I flatly and categorically disagree with this handwaving. It&#039;s handwaving like this that got us into serious trouble in the past. The trouble with most thinking about technological singularities is that it encourages sloppy thinking. A lot of people in futurist circles reach a point in their exposition where they get very vague on how to get from here to there. Maybe I&#039;m just a curmudgeon. I remember, as a child back in the Seventies, reading these beautifully illustrated essays in an encyclopedia about Gerard O&#039;Neill&#039;s space colonies and then watching video from the Apollo-Soyuz mission. Even then the juxtaposition was very informative to me. I think what I learned was that eventually the future becomes the present and the wondrous becomes commonplace and problematic. I keep harping on this point but, I repeat it here. Heaven is a place where nothing ever happens.
This suggests to me that the ideas of Heaven and Utopia are logically flawed. Futurists would do well to avoid this kind of thinking.   Published Sunday, March 25, 2007 2:59 PM by Mr. Farlops [...]</description>
		<content:encoded><![CDATA[<p>[...] I believe some pretty sketchy things: I believe that the premise of strong AI is sound. I believe that artificial life is already here. I believe the premise of molecular nanotechnology is sound. I believe the premise of cryonics. I believe the premise of brain taping. Given that I believe all that (I blame years of science fiction and an abortive career in physics.) more reasonable stuff like space elevators and the cure for aging are pretty tame. But what I never understood about these subjects is how they drive some people to get all, well, starry eyed and religious about them. There is always something about the future that gets people all dreamy. They assume somehow paradise will emerge and everything will get all cleaned up and solved. Then the handwaving starts: Post-humans will be perpetually happy, all forms of suffering will end. God can be engineered and it&#8217;ll love us. Nanotechnology and superautomation will usher in a post-scarcity world but, I guess some of us didn&#8217;t get that memo. I flatly and categorically disagree with this handwaving. It&#8217;s handwaving like this that got us into serious trouble in the past. The trouble with most thinking about technological singularities is that it encourages sloppy thinking. A lot of people in futurist circles reach a point in their exposition where they get very vague on how to get from here to there. Maybe I&#8217;m just a curmudgeon. I remember, as a child back in the Seventies, reading these beautifully illustrated essays in an encyclopedia about Gerard O&#8217;Neill&#8217;s space colonies and then watching video from the Apollo-Soyuz mission. Even then the juxtaposition was very informative to me. I think what I learned was that eventually the future becomes the present and the wondrous becomes commonplace and problematic. I keep harping on this point but, I repeat it here. Heaven is a place where nothing ever happens.
This suggests to me that the ideas of Heaven and Utopia are logically flawed. Futurists would do well to avoid this kind of thinking.   Published Sunday, March 25, 2007 2:59 PM by Mr. Farlops [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Eliezer Yudkowsky</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-20</link>
		<dc:creator>Eliezer Yudkowsky</dc:creator>
		<pubDate>Fri, 15 Sep 2006 19:09:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-20</guid>
		<description>Most of these questions are answered in either http://sl4.org/wiki/KnowabilityOfFAI (predictability of superintelligent optimization targets) or http://sl4.org/wiki/CoherentExtrapolatedVolition (actual criterion of Friendliness; what humans want).  I will answer the one technical point separately (in the succeeding blog post).</description>
		<content:encoded><![CDATA[<p>Most of these questions are answered in either <a href="http://sl4.org/wiki/KnowabilityOfFAI" rel="nofollow">http://sl4.org/wiki/KnowabilityOfFAI</a> (predictability of superintelligent optimization targets) or <a href="http://sl4.org/wiki/CoherentExtrapolatedVolition" rel="nofollow">http://sl4.org/wiki/CoherentExtrapolatedVolition</a> (actual criterion of Friendliness; what humans want).  I will answer the one technical point separately (in the succeeding blog post).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Marc_Geddes</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-19</link>
		<dc:creator>Marc_Geddes</dc:creator>
		<pubDate>Thu, 14 Sep 2006 05:44:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-19</guid>
		<description>The upshot of all that philosophy is this:

There are 27 ontological &#039;categories of cognition&#039; which are &#039;universal in scope&#039; (applicable everywhere in reality where minds could exist).

A universal reasoner (real AGI) would need class definitions for all 27 categories to work.  The 27 categories are listed at the end of the essay.  

Cheers!</description>
		<content:encoded><![CDATA[<p>The upshot of all that philosophy is this:</p>
<p>There are 27 ontological &#8216;categories of cognition&#8217; which are &#8216;universal in scope&#8217; (applicable everywhere in reality where minds could exist).</p>
<p>A universal reasoner (real AGI) would need class definitions for all 27 categories to work.  The 27 categories are listed at the end of the essay.  </p>
<p>Cheers!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: mathemajician</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-16</link>
		<dc:creator>mathemajician</dc:creator>
		<pubDate>Wed, 13 Sep 2006 19:22:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-16</guid>
		<description>&lt;i&gt;starting by working out what consciousness really is would be a good start.&lt;/i&gt;

Yes, if we really understood consciousness, I suspect that many ethical issues would become significantly easier to understand and reason about.

&lt;i&gt;Proving friendliness may well be impossible, but surely given the stakes it is worth trying,&lt;/i&gt;

Sure, it&#039;s worth trying.  But man, as far as I can see we really don&#039;t have a handle on this problem at the moment.</description>
		<content:encoded><![CDATA[<p><i>starting by working out what consciousness really is would be a good start.</i></p>
<p>Yes, if we really understood consciousness, I suspect that many ethical issues would become significantly easier to understand and reason about.</p>
<p><i>Proving friendliness may well be impossible, but surely given the stakes it is worth trying,</i></p>
<p>Sure, it&#8217;s worth trying.  But man, as far as I can see we really don&#8217;t have a handle on this problem at the moment.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: mathemajician</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-15</link>
		<dc:creator>mathemajician</dc:creator>
		<pubDate>Wed, 13 Sep 2006 19:17:46 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-15</guid>
		<description>I read some of that article but I&#039;m afraid that it&#039;s more philosophy than I can get my head around!</description>
		<content:encoded><![CDATA[<p>I read some of that article but I&#8217;m afraid that it&#8217;s more philosophy than I can get my head around!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: mathemajician</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-14</link>
		<dc:creator>mathemajician</dc:creator>
		<pubDate>Tue, 12 Sep 2006 23:15:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-14</guid>
		<description>&lt;i&gt;the work done at SIAI. This work is woefully incomplete&lt;/i&gt;

That&#039;s the somewhat more polite version of my main message... ;-)

&lt;i&gt;Why do you suspect this problem impossible? Certainly it hasn’t been solved yet. &lt;/i&gt;

Why do you suspect it is possible?

Humans have been arguing about what things like friendliness mean for &lt;i&gt;thousands&lt;/i&gt; of years.  Countless generations of people have thought about these things going way back to the Greeks, and who knows who before that.  If you read what the Greeks had to say about some of these things, as I guess you will have, then you&#039;ll see that they were pretty sharp and were thinking really hard about these issues.  They were also thinking hard about physics, mathematics and many other things.  Now, how far has our understanding of things like physics and mathematics come since then?  It&#039;s developed enormously.  But ethics?  Hmmm.  Not much.  Even basic problems in ethics continue to present huge problems after thousands of years of work, including by some of the most brilliant minds in history.... and you&#039;re claiming that it will be possible to not just find good solutions to issues like what friendliness is, but to even do it using formal mathematical equations!

Hehe... good luck Nick!

*thumbs up*

Of course that doesn&#039;t prove that it&#039;s impossible, but it sure makes me suspect that it is.  At the very least I really doubt that you will solve the problem before AGI turns up.

Besides, even if you did come up with an equation, I&#039;m sure that I could find some strange situations, stick them into your equation, and get answers out that many people disagree with... for the simple fact that on many ethical issues the population is divided.

&lt;i&gt;What do you mean by “primarily interested in its own self preservation”, and what are the “evolutionary pressures” working against systems which are interested in self preservation for derived reasons?&lt;/i&gt;

A friendly AI can only preserve itself through friendly actions, while a primarily self interested one can preserve itself in whatever way is the most effective.  This gives it a survival advantage due to its very nature.

&lt;i&gt;It’s more about transferring the ability to reason about morality, about what humans really want.&lt;/i&gt;

Go and ask various people in countries around the world what is moral and what they want.  You&#039;ll get a real mixed bag of answers.  The world is full of groups of people who hate each other.

&lt;i&gt;This is one critical problem with creating AI which obey their owner’s direct commands. Even the most altruistic and rational human may be corrupted by such power. However, not all AI designs are structured to obey their programmers blindly.&lt;/i&gt;

So you can&#039;t rely on the people telling the AI what is right and what is wrong.  Which means that you need to get the AI&#039;s internal model of right and wrong correct to start with, and to do that it seems to me that you&#039;d need an extremely precise mathematical definition of friendliness... and as I&#039;ve said above, I&#039;m very skeptical about you managing to do that.

&lt;i&gt;You have to design AGIs nice, you can’t reliably detect it from their behaviour.&lt;/i&gt;

Same comment as the last one.

&lt;i&gt;One problem here is the AGI doesn’t need to predict all sequences under a given complexity, only the actual sequences the universe will produce. Note also that a Friendly AI doesn’t have to be omnipotent, it doesn’t have to have 100% success probability.&lt;/i&gt;

I&#039;m not asking for omnipotence of the AGI.  I&#039;m asking it to solve a finite, but very difficult, problem.  Even if it can solve it, you can&#039;t prove that it will do so in order to save the world... and thus you can&#039;t prove its friendliness.  Note that I haven&#039;t said anything about the AGI other than the fact that it&#039;s very powerful.  It might be friendly, it might not be.  Mathematical proof just can&#039;t answer that question for you.

I think I&#039;m going to have to write a longer explanation of this last point.  Maybe as a new blog post.

By the way, thanks for the well thought out reply.</description>
		<content:encoded><![CDATA[<p><i>the work done at SIAI. This work is woefully incomplete</i></p>
<p>That&#8217;s the somewhat more polite version of my main message&#8230; <img src='http://www.vetta.org/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p><i>Why do you suspect this problem impossible? Certainly it hasn’t been solved yet. </i></p>
<p>Why do you suspect it is possible?</p>
<p>Humans have been arguing about what things like friendliness mean for <i>thousands</i> of years.  Countless generations of people have thought about these things going way back to the Greeks, and who knows who before that.  If you read what the Greeks had to say about some of these things, as I guess you will have, then you&#8217;ll see that they were pretty sharp and were thinking really hard about these issues.  They were also thinking hard about physics, mathematics and many other things.  Now, how far has our understanding of things like physics and mathematics come since then?  It&#8217;s developed enormously.  But ethics?  Hmmm.  Not much.  Even basic problems in ethics continue to present huge problems after thousands of years of work, including by some of the most brilliant minds in history&#8230;. and you&#8217;re claiming that it will be possible to not just find good solutions to issues like what friendliness is, but to even do it using formal mathematical equations!</p>
<p>Hehe&#8230; good luck Nick!</p>
<p>*thumbs up*</p>
<p>Of course that doesn&#8217;t prove that it&#8217;s impossible, but it sure makes me suspect that it is.  At the very least I really doubt that you will solve the problem before AGI turns up.</p>
<p>Besides, even if you did come up with an equation, I&#8217;m sure that I could find some strange situations, stick them into your equation, and get answers out that many people disagree with&#8230; for the simple fact that on many ethical issues the population is divided.</p>
<p><i>What do you mean by “primarily interested in its own self preservation”, and what are the “evolutionary pressures” working against systems which are interested in self preservation for derived reasons?</i></p>
<p>A friendly AI can only preserve itself through friendly actions, while a primarily self interested one can preserve itself in whatever way is the most effective.  This gives it a survival advantage due to its very nature.</p>
<p><i>It’s more about transferring the ability to reason about morality, about what humans really want.</i></p>
<p>Go and ask various people in countries around the world what is moral and what they want.  You&#8217;ll get a real mixed bag of answers.  The world is full of groups of people who hate each other.</p>
<p><i>This is one critical problem with creating AI which obey their owner’s direct commands. Even the most altruistic and rational human may be corrupted by such power. However, not all AI designs are structured to obey their programmers blindly.</i></p>
<p>So you can&#8217;t rely on the people telling the AI what is right and what is wrong.  Which means that you need to get the AI&#8217;s internal model of right and wrong correct to start with, and to do that it seems to me that you&#8217;d need an extremely precise mathematical definition of friendliness&#8230; and as I&#8217;ve said above, I&#8217;m very skeptical about you managing to do that.</p>
<p><i>You have to design AGIs nice, you can’t reliably detect it from their behaviour.</i></p>
<p>Same comment as the last one.</p>
<p><i>One problem here is the AGI doesn’t need to predict all sequences under a given complexity, only the actual sequences the universe will produce. Note also that a Friendly AI doesn’t have to be omnipotent, it doesn’t have to have 100% success probability.</i></p>
<p>I&#8217;m not asking for omnipotence of the AGI.  I&#8217;m asking it to solve a finite, but very difficult, problem.  Even if it can solve it, you can&#8217;t prove that it will do so in order to save the world&#8230; and thus you can&#8217;t prove its friendliness.  Note that I haven&#8217;t said anything about the AGI other than the fact that it&#8217;s very powerful.  It might be friendly, it might not be.  Mathematical proof just can&#8217;t answer that question for you.</p>
<p>I think I&#8217;m going to have to write a longer explanation of this last point.  Maybe as a new blog post.</p>
<p>By the way, thanks for the well thought out reply.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Accelerating Future &#187; Consolidation of Links on Friendly AI</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-13</link>
		<dc:creator>Accelerating Future &#187; Consolidation of Links on Friendly AI</dc:creator>
		<pubDate>Mon, 11 Sep 2006 10:15:03 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-13</guid>
		<description>[...] Friendly AI is bunk by Shane Legg Alternatives to (Yudkowskian) Friendly AI proposed on the SL4 list Critique of the SIAI Guidelines on Friendly AI by Bill Hibbard SIAI&#8217;s Guidelines for &#8216;Building&#8217; Friendly AI by Peter Voss [...]</description>
		<content:encoded><![CDATA[<p>[...] Friendly AI is bunk by Shane Legg Alternatives to (Yudkowskian) Friendly AI proposed on the SL4 list Critique of the SIAI Guidelines on Friendly AI by Bill Hibbard SIAI&#8217;s Guidelines for &#8216;Building&#8217; Friendly AI by Peter Voss [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Marc_Geddes</title>
		<link>http://www.vetta.org/2006/09/friendly-ai-is-bunk/comment-page-1/#comment-12</link>
		<dc:creator>Marc_Geddes</dc:creator>
		<pubDate>Mon, 11 Sep 2006 05:12:20 +0000</pubDate>
		<guid isPermaLink="false">http://www.vetta.org/?p=5#comment-12</guid>
		<description>Shane,

Thought about this for 4 years.  Finally cracked it.  The condition for Friendly AI is given below.  This sounds simple but I&#039;m saying something extremely subtle and extremely strange here so you may need to re-read it carefully a number of times:

---
&#039;The condition is to create a *Sentient* AI - i.e. an AI that thinks it has Qualia... such that all mathematical entities representing metaphysical (or ontological) categories of *universal scope* are *experienced directly* - as &#039;mathematical qualia&#039; in the mind of the AI. Any AI is Friendly *if and only if* this condition is met&#039;
---

For a summary of my reasoning see my post on the everything-list here:
http://groups.google.com/group/everything-list/browse_thread/thread/6c0b178a63118d8f/c86de6243cdc81f4#c86de6243cdc81f4

Cheers!</description>
		<content:encoded><![CDATA[<p>Shane,</p>
<p>Thought about this for 4 years.  Finally cracked it.  The condition for Friendly AI is given below.  This sounds simple but I&#8217;m saying something extremely subtle and extremely strange here so you may need to re-read it carefully a number of times:</p>
<p>&#8212;<br />
&#8216;The condition is to create a *Sentient* AI &#8211; i.e. an AI that thinks it has Qualia&#8230; such that all mathematical entities representing metaphysical (or ontological) categories of *universal scope* are *experienced directly* &#8211; as &#8216;mathematical qualia&#8217; in the mind of the AI. Any AI is Friendly *if and only if* this condition is met&#8217;<br />
&#8212;</p>
<p>For a summary of my reasoning see my post on the everything-list here:<br />
<a href="http://groups.google.com/group/everything-list/browse_thread/thread/6c0b178a63118d8f/c86de6243cdc81f4#c86de6243cdc81f4" rel="nofollow">http://groups.google.com/group/everything-list/browse_thread/thread/6c0b178a63118d8f/c86de6243cdc81f4#c86de6243cdc81f4</a></p>
<p>Cheers!</p>
]]></content:encoded>
	</item>
</channel>
</rss>
