<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>vetta project &#187; reinforcement learning</title>
	<atom:link href="http://www.vetta.org/tag/reinforcement-learning/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.vetta.org</link>
	<description></description>
	<lastBuildDate>Thu, 22 Jul 2010 19:13:53 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Reinforcement learning in the brain</title>
		<link>http://www.vetta.org/2009/06/reinforcement-learning-in-the-brain/</link>
		<comments>http://www.vetta.org/2009/06/reinforcement-learning-in-the-brain/#comments</comments>
		<pubDate>Sun, 21 Jun 2009 22:10:48 +0000</pubDate>
		<dc:creator>Shane Legg</dc:creator>
				<category><![CDATA[Research Review]]></category>
		<category><![CDATA[Neuroscience]]></category>
		<category><![CDATA[reinforcement learning]]></category>

		<guid isPermaLink="false">http://www.vetta.org/?p=548</guid>
		<description><![CDATA[Model-free reinforcement learning (RL) algorithms are computationally cheap as each state-action pair keeps a cached estimate of its value that can easily be looked up in order to make a decision. Their weakness is that they are not easy to &#8230; <a href="http://www.vetta.org/2009/06/reinforcement-learning-in-the-brain/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Model-free reinforcement learning (RL) algorithms are computationally cheap because each state-action pair keeps a cached estimate of its value that can easily be looked up in order to make a decision.  Their weakness is that these cached values are not easy to update when the agent&#8217;s goals, or the state of the world, change in some critical way.  Model-based RL, on the other hand, is better in this respect as it can use reasoning or search on a model in order to find paths leading to the fulfilment of the agent&#8217;s current goals.  The downside, of course, is much greater computational cost.</p>
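<p>To make the trade-off concrete, here is a minimal sketch on a toy three-state chain.  Everything in it (the chain, the reward functions, the constants and the function names) is an assumption made up for illustration, not something taken from a particular paper.  A cache of values is trained with temporal difference updates under one goal, and then compared with a depth-limited search on the model after the goal changes.</p>

```python
# Toy chain: state 0 leads to 1, state 1 leads to terminal state 2.
# Every name and number here is hypothetical, chosen for illustration.
GAMMA = 0.9   # discount factor
ALPHA = 0.5   # learning rate
TERMINAL = 2
ACTIONS = ("go", "stay")
NEXT = {(0, "go"): 1, (0, "stay"): 0, (1, "go"): 2, (1, "stay"): 1}


def reward_old(next_state):
    """Original goal: reaching the terminal state is worth +1."""
    return 1.0 if next_state == TERMINAL else 0.0


def reward_new(next_state):
    """Changed goal: the terminal state is now aversive (-1)."""
    return -1.0 if next_state == TERMINAL else 0.0


def td_update(q, state, action, reward_fn):
    """Model-free: nudge the cached value towards the TD target."""
    nxt = NEXT[(state, action)]
    future = 0.0 if nxt == TERMINAL else max(q[(nxt, a)] for a in ACTIONS)
    q[(state, action)] += ALPHA * (reward_fn(nxt) + GAMMA * future - q[(state, action)])


def planned_value(state, action, reward_fn, depth):
    """Model-based: depth-limited search over the known transition model."""
    nxt = NEXT[(state, action)]
    value = reward_fn(nxt)
    if nxt == TERMINAL or depth == 1:
        return value
    return value + GAMMA * max(planned_value(nxt, a, reward_fn, depth - 1) for a in ACTIONS)


# Train the cache under the old goal.
q = {pair: 0.0 for pair in NEXT}
for _ in range(100):
    for state, action in NEXT:
        td_update(q, state, action, reward_old)

# Decisions in state 1: one cheap lookup versus two searches.
cached_choice = max(ACTIONS, key=lambda a: q[(1, a)])
planned_old = max(ACTIONS, key=lambda a: planned_value(1, a, reward_old, 3))
planned_new = max(ACTIONS, key=lambda a: planned_value(1, a, reward_new, 3))
```

<p>In this toy setting the cache keeps recommending "go" in state 1 until it is retrained, whereas the search switches to "stay" as soon as it is handed the new reward function, at the price of a recursive computation for every decision.</p>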
<p>So what does the brain do?  For over a decade it has been known that temporal difference learning, a type of model-free RL algorithm, appears to explain the activity of dopamine neurons and their dorsolateral striatal projections.  It has also been observed that parts of the prefrontal cortex appear to implement some kind of model-based RL algorithm.  Mammalian brains, then, appear to get the best of both worlds by having model-free <em>and</em> model-based RL algorithms and then choosing which to use on the fly.  Pretty clever, huh?<br />
<span id="more-548"></span></p>
<p>A key question, then, is how this choice is made.  I recently read <a href="http://www.gatsby.ucl.ac.uk/~dayan/papers/dawnivd05.pdf">Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control</a> from <em>Nature Neuroscience</em>, by Nathaniel Daw, Yael Niv and Peter Dayan.  They suggest that the brain may be arbitrating between the two systems with some kind of Bayesian principle: each system generates an uncertainty estimate for its own predictions, and control goes to whichever is expected to be more accurate.  They implement such a system and show how it can explain a range of experimental data from animal studies.</p>
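<p>A rough sketch of what uncertainty-based arbitration could look like is below.  It is a deliberate simplification and not the model from the paper: I assume each system reports a Gaussian estimate (a mean and a variance) for the value of an action, that the model-based variance is fixed by noisy search, and that the model-free variance shrinks with experience through a standard conjugate Gaussian update.  All names and numbers are hypothetical.</p>

```python
from collections import namedtuple

# A value estimate together with the uncertainty about it.
Estimate = namedtuple("Estimate", ["mean", "variance"])


def bayes_update(prior, observation, noise_variance):
    """Conjugate Gaussian update of an estimate from one noisy sample."""
    gain = prior.variance / (prior.variance + noise_variance)
    mean = prior.mean + gain * (observation - prior.mean)
    variance = (1.0 - gain) * prior.variance
    return Estimate(mean, variance)


def arbitrate(estimates):
    """Hand control to the system whose estimate is least uncertain."""
    return min(estimates, key=lambda name: estimates[name].variance)


# Hypothetical starting points, chosen only for illustration.
model_based = Estimate(mean=1.0, variance=0.2)  # search is noisy but usable at once
model_free = Estimate(mean=0.0, variance=4.0)   # untrained cache: very unsure

winners = []
for trial in range(30):
    winners.append(arbitrate({"model-based": model_based, "model-free": model_free}))
    # Each trial gives the cached estimate one more noisy sample of the value.
    model_free = bayes_update(model_free, observation=1.0, noise_variance=1.0)
```

<p>With these made-up numbers the model-based system is in control for the first few trials, and the model-free cache takes over once enough experience has driven its variance below the fixed model-based level.</p>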
<p>This research is now four years old and there is plenty of significant research in the area that follows it &#8212; an &#8220;embarrassment of riches&#8221; as Dayan recently described it.   Nevertheless, I think it&#8217;s a good example of the kind of crossover between machine learning and neuroscience that is starting to take place.  Indeed, the more neuroscientists learn about the brain, the more it starts to look like the rough outline of an AGI design.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.vetta.org/2009/06/reinforcement-learning-in-the-brain/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
