<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Prefrontal.org &#187; Psychology</title>
	<atom:link href="http://prefrontal.org/blog/category/psychology/feed/" rel="self" type="application/rss+xml" />
	<link>http://prefrontal.org/blog</link>
	<description>A personal weblog of developmental cognitive neuroscience.</description>
	<lastBuildDate>Fri, 06 Jan 2012 22:06:20 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Quote of the Week &#8211; Cameron</title>
		<link>http://prefrontal.org/blog/2012/01/quote-of-the-week-cameron/</link>
		<comments>http://prefrontal.org/blog/2012/01/quote-of-the-week-cameron/#comments</comments>
		<pubDate>Fri, 06 Jan 2012 22:06:20 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[Quotes]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=1404</guid>
		<description><![CDATA[&#8220;It would be nice if all of the data which sociologists require could be enumerated because then we could run them through IBM machines and draw charts as the economists do. However, not everything that can be counted counts, and not everything that counts can be counted.&#8221; &#8211; William Bruce Cameron, Informal Sociology: A Casual [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;It would be nice if all of the data which sociologists require could be enumerated because then we could run them through IBM machines and draw charts as the economists do. However, not everything that can be counted counts, and not everything that counts can be counted.&#8221; &#8211; William Bruce Cameron, <em>Informal Sociology: A Casual Introduction to Sociological Thinking</em> (1963)</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2012/01/quote-of-the-week-cameron/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hot, Hot iPhone Love (More Terrible Neuromarketing)</title>
		<link>http://prefrontal.org/blog/2011/10/hot-hot-iphone-love-more-terrible-neuromarketing/</link>
		<comments>http://prefrontal.org/blog/2011/10/hot-hot-iphone-love-more-terrible-neuromarketing/#comments</comments>
		<pubDate>Mon, 03 Oct 2011 22:41:05 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[Miscellany]]></category>
		<category><![CDATA[Psychology]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=1321</guid>
		<description><![CDATA[I hate being late to a party. You finally arrive after the festivities have begun and you know that your friends have already been there for hours, having a grand time doing what they do best. So it is with the latest neuromarketing debacle involving the New York Times and the pseudoscience that appeared on [...]]]></description>
			<content:encoded><![CDATA[<p>I hate being late to a party.  You finally arrive after the festivities have begun and you know that your friends have already been there for hours, having a grand time doing what they do best.  So it is with the latest neuromarketing debacle involving the New York Times and the pseudoscience that appeared on the op-ed page.  All the best stuff has already been written.</p>
<p>Summary:</p>
<p>A branding consultant (Martin Lindstrom) commissioned a neuromarketing company (MindSign) to do a neuroimaging study.  Sixteen subjects underwent fMRI data acquisition while being shown audio and video of ringing iPhones.  Visual and auditory cortex was active across all conditions.  There was also activity in the insula.  The authors interpret the sensory cortex activity as a kind of cross-modal synesthesia experience.  The authors further interpret the insula activity as the subjects experiencing feelings of love and compassion.  The web rings loudly with headlines like &#8220;YOU LOVE YOUR iPHONE&#8221;.</p>
<p>Web points of interest:</p>
<p>1) Read the original op-ed piece by Martin Lindstrom to give yourself some context regarding what was said and the arguments that were made.  It will probably make your skin crawl with tales of babies wanting cell phones to be iPhones and terrible definitions of synesthesia.  Stick with it anyway.<br />
<a href="http://www.nytimes.com/2011/10/01/opinion/you-love-your-iphone-literally.html">http://www.nytimes.com/2011/10/01/opinion/you-love-your-iphone-literally.html</a></p>
<p>2) Start at Russ Poldrack&#8217;s weblog and read his first post on the topic.  He called it crap, and he was being direct and truthful.<br />
<a href="http://www.russpoldrack.org/2011/10/nyt-editorial-fmri-complete-crap.html">http://www.russpoldrack.org/2011/10/nyt-editorial-fmri-complete-crap.html</a></p>
<p>3) Now read Tal Yarkoni&#8217;s excellent in-depth discussion of the problem.  If you read nothing else today, go and check this one out.<br />
<a href="http://www.talyarkoni.org/blog/2011/10/01/the-new-york-times-blows-it-big-time-on-brain-imaging/">http://www.talyarkoni.org/blog/2011/10/01/the-new-york-times-blows-it-big-time-on-brain-imaging/</a></p>
<p>4) Next, read the post by Vaughan Bell at Mind Hacks, which is also a nice follow-up.  Double points for using the term &#8220;facepalm jamboree&#8221;.<br />
<a href="http://mindhacks.com/2011/10/02/the-new-york-times-wees-itself-in-public/">http://mindhacks.com/2011/10/02/the-new-york-times-wees-itself-in-public/</a></p>
<p>5) Finally, see the list of people who support Poldrack&#8217;s position on the Lindstrom article.  Many of the best minds in neuroscience agree that the op-ed piece is not representative of good science:<br />
<a href="http://www.russpoldrack.org/2011/10/signers-of-letter-to-editor-of-new-york.html">http://www.russpoldrack.org/2011/10/signers-of-letter-to-editor-of-new-york.html</a></p>
<p>&nbsp;</p>
<hr />
<p>&nbsp;</p>
<p>To be honest, I don&#8217;t have a whole lot to add to the conversation.  On the topic of reverse inference you really can&#8217;t do better than Russ Poldrack and Tal Yarkoni.  The Yarkoni blog post is particularly good, effectively nuking the Lindstrom piece from orbit.  It is, in a way, poetic since Poldrack and Yarkoni are working on the databases and methods that will enable probabilities to be put on arguments such as Lindstrom&#8217;s.  That is, if insula activation is observed, how likely is it that the emotion of &#8216;love&#8217; is being experienced?  To give their technology a try, surf on over to <a href="http://neurosynth.org/">http://neurosynth.org/</a> and check it out.</p>
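<p>As a rough illustration of what putting a probability on that kind of argument means, here is a minimal Python sketch of reverse inference using Bayes&#8217; rule.  Every number in it is a made-up assumption chosen for illustration, and the function name is hypothetical; it is not a description of how Neurosynth actually computes anything.</p>

```python
def reverse_inference(p_act_given_state, p_state, p_act_given_other):
    """P(state | activation) from Bayes' rule.

    p_act_given_state: P(region active | mental state present)
    p_state:           prior P(mental state present)
    p_act_given_other: P(region active | mental state absent)
    """
    # Total probability of seeing the activation at all
    p_act = p_act_given_state * p_state + p_act_given_other * (1.0 - p_state)
    return p_act_given_state * p_state / p_act


# Hypothetical inputs: the insula is active in 80% of "love" tasks,
# "love" tasks are 10% of all tasks, and the insula is also active
# in 40% of everything else.
posterior = reverse_inference(0.8, 0.1, 0.4)
print(f"P(love | insula active) = {posterior:.3f}")
```

<p>Under those assumed numbers the posterior stays below 20%, because a region that lights up in many different tasks says little about any single one of them.</p>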
<p>One aspect of the debate that I am particularly interested in is the purported role of the insula in the experience of love and affection.  Unfortunately, Lindstrom provided very little detail in terms of the spatial location of their insula activity, effectively preventing anyone from criticizing the work on that basis.  But, for the sake of argument, let&#8217;s put the insular question forward.  Does it matter where in the insula the activity was observed?  The short answer is: absolutely.</p>
<p>There is an excellent paper by A. D. &#8220;Bud&#8221; Craig entitled &#8220;Forebrain emotional asymmetry: a neuroanatomical basis?&#8221; that details how the left and right insula have different patterns of connectivity to the homeostatic afferents that provide information on our current body state.  Craig describes how the right insula is preferentially involved in sympathetic nervous system activity geared toward engaging with the environment, energy use, and even &#8220;fight or flight&#8221; responses.  Conversely, the left insula is preferentially involved in parasympathetic activity geared toward contentment, energy conservation, and &#8220;rest and digest&#8221; responses.</p>
<p>In our evolution, humans seem to have bolted social components onto this underlying insular emotional asymmetry.  The right insula seems to be associated with the experience of social disgust and social avoidance.  This has been seen in work such as the original Phillips et al. (1997) paper, showing prominent right anterior insula activity during disgust.  The left insula seems to be associated with the experience of social compassion and social approach.  There is less evidence for this, but meta-analyses such as Ortigue et al. (2010) have reported this pattern.</p>
<p>In short, leaving out which hemisphere the results occurred in is a huge faux pas on the part of Lindstrom.  It is not the greatest sin of the piece, and probably not even the greatest sin of the insula argument.  Still, it is certainly a prominent <em>FAIL</em> from the perspective of a researcher with an interest in the insula.</p>
<p>&nbsp;</p>
<hr />
<p>&nbsp;</p>
<p><center><img src="http://prefrontal.org/blog/wp-content/uploads/2011/10/BridgeFail.png" alt="" title="BridgeFail" width="309" height="234" /></center></p>
<p>&nbsp;</p>
<hr />
<p>&nbsp;</p>
<p>One final point of discussion I would like to raise is with regard to an earlier prefrontal.org <a href="http://prefrontal.org/blog/2011/04/the-seven-sins-of-neuromarketing/">post</a> on the Seven Sins of Neuromarketing.  Let&#8217;s see which ones are most prominent in the current discussion:</p>
<p><strong>1) The curtain of proprietary analysis methods limits our knowledge of how effective neuromarketing can be.</strong></p>
<p>We have no idea what methods Lindstrom and his colleagues used to arrive at their findings.  It could be the best study in the history of ever, or it could be riddled with common statistical flaws.  We have no idea because the work isn&#8217;t peer-reviewed.  As before, we don&#8217;t even know where in the insula the results were located!</p>
<p><strong>3) Most people’s introduction to neuromarketing is through press releases, not peer-reviewed studies.</strong></p>
<p>Let&#8217;s just establish this as a rule: the New York Times editorial page is not the right place to introduce the world to your cutting-edge, unproven fMRI methods.  Period.  In fact, we should come up with a verb for what always happens afterward: you get Poldrack&#8217;d.  </p>
<p><strong>4) Neuromarketing methods are not immune to subjectivity and bias.</strong></p>
<p>In a way, scientific claims are guilty until proven innocent by empirical evidence.  Honestly, can I trust a man who has written books with titles like Buyology, Brandwashed, and Brand Sense to be objective with regard to a neuromarketing study with a sensational headline?  If this work were peer-reviewed then we could evaluate his evidence in a balanced manner, but an op-ed piece does not allow for this luxury and leaves the question of bias open.</p>
<p><strong>6) People are rushing the field to make a quick buck, and not everyone is trustworthy.</strong></p>
<p>I think that this represents the case in point.</p>
<p>&nbsp;</p>
<hr />
<p>&nbsp;</p>
<p>Ortigue S, Bianchi-Demicheli F, Patel N, Frum C, Lewis JW. (2010). Neuroimaging of love: fMRI meta-analysis evidence toward new perspectives in sexual medicine. J Sex Med. 7(11): 3541-3552.</p>
<p>Phillips ML, Young AW, Senior C, Brammer M, Andrew C, Calder AJ, Bullmore ET, Perrett DI, Rowland D, Williams SC, Gray JA, David AS. (1997). A specific neural substrate for perceiving facial expressions of disgust. Nature. 389(6650): 495-498.<br />
&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2011/10/hot-hot-iphone-love-more-terrible-neuromarketing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Significant Differences</title>
		<link>http://prefrontal.org/blog/2011/09/significant-differences/</link>
		<comments>http://prefrontal.org/blog/2011/09/significant-differences/#comments</comments>
		<pubDate>Mon, 26 Sep 2011 20:07:45 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[MRI]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=1282</guid>
		<description><![CDATA[One of the first things you learn in an introductory psychology class is the topic of cognitive bias. These are situations or contexts in which human beings cannot reliably make effective judgements or discriminations. For instance, information that tends to confirm our own assumptions is generally judged to be correct (Confirmation Bias). Another example is [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://prefrontal.org/blog/wp-content/uploads/2011/09/SigNotSig4.png" alt="" title="SigNotSig4" width="200" height="214" align="right" />One of the first things you learn in an introductory psychology class is the topic of cognitive bias.  These are situations or contexts in which human beings cannot reliably make effective judgements or discriminations.  For instance, information that tends to confirm our own assumptions is generally judged to be correct (<a href="http://en.wikipedia.org/wiki/Confirmation_bias">Confirmation Bias</a>).  Another example is the disproportionate attention given to negative experiences relative to positive experiences (<a href="http://en.wikipedia.org/wiki/Negativity_bias">Negativity Bias</a>).  In each situation perception and decision making is distorted <em>even though we should know better</em>.  It may be the case that we need to come up with a new bias to explain investigator behavior.  Significance Bias anyone?</p>
<p>There is a great <a href="http://www.nature.com/neuro/journal/v14/n9/abs/nn.2886.html">article</a> by Nieuwenhuis, Forstmann, and Wagenmakers in this month&#8217;s edition of Nature Neuroscience.  Entitled &#8220;Erroneous analyses of interactions in neuroscience: a problem of significance&#8221;, the paper discusses the problem of how to gauge when two effects differ in neuroscience.  It turns out that many papers misjudge the difference between effects by basing their judgement on significance values, <em>even though they should know better</em>.</p>
<p>The crux of the issue is that it is improper to judge the difference between two effects by looking at their relative significance.  The perceived difference between a significant effect (i.e., p &lt; 0.05) and a non-significant effect (i.e., p &gt; 0.05) does not necessarily mean that the two effects are themselves significantly different.  You have to explicitly test for that.</p>
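<p>A quick calculation makes the point concrete.  The sketch below uses two hypothetical effect estimates with made-up standard errors, assumes the estimates are independent, and uses a simple normal (z-test) approximation; the function name is mine and the numbers are not taken from any of the papers discussed here.</p>

```python
import math


def z_and_p(estimate, std_err):
    """z statistic and two-sided p-value under a normal approximation."""
    z = estimate / std_err
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical effects: A looks significant, B does not.
effect_a, se_a = 25.0, 10.0
effect_b, se_b = 10.0, 10.0

z_a, p_a = z_and_p(effect_a, se_a)
z_b, p_b = z_and_p(effect_b, se_b)

# The explicit test: is A different from B?
# For independent estimates the standard errors add in quadrature.
diff = effect_a - effect_b
se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
z_d, p_d = z_and_p(diff, se_diff)

print(f"A:     z = {z_a:.2f}, p = {p_a:.3f}")
print(f"B:     z = {z_b:.2f}, p = {p_b:.3f}")
print(f"A - B: z = {z_d:.2f}, p = {p_d:.3f}")
```

<p>With these assumed numbers, effect A clears the 0.05 threshold and effect B does not, yet the direct test of the difference between them is nowhere near significant.</p>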
<p>In fMRI, this could mean relating one brain area that is significant to another brain area that is not significant.  The temptation is to discuss the significant region as being more active than the non-significant region based on the fact that the latter region was below the significance threshold.  This actually may or may not be the case.</p>
<p>Andrew Gelman and Hal Stern wrote a similar <a href="http://www.stat.columbia.edu/~gelman/research/published/signif4.pdf">article</a> on the problem a few years ago.  The focus of their piece was simply to draw attention to the issue through the use of several theoretical and real life examples.  While they were able to say that the problem existed, they were unable to say how prevalent the problem was across any particular scientific discipline.  The power of the Nieuwenhuis, Forstmann, and Wagenmakers paper is that it extends the Gelman &#038; Stern work through an analysis of the existing literature to put concrete numbers on how widespread the problem is in neuroscience.</p>
<p>The authors conducted a survey of 513 articles in major neuroscience journals.  They identified 157 papers containing an analysis where the authors would be tempted to make an inferential error by focusing on significance.  They found that in 78 out of 157 cases (roughly 50%) the authors did indeed make an error.  That is far higher than I would have guessed, and one of the reasons I felt compelled to write about it today.  I mean, come on, <em>fifty percent</em>?  Really?</p>
<p>In the next-to-last paragraph the authors specifically state that the error of comparing significance levels is particularly acute in neuroimaging.  From my perspective we are almost set up for failure in this regard, as significant regions are visualized as a range of attention-grabbing colors while regions that are not significant are visualized as completely blank.</p>
<p>I could rail on a bit longer, but that is time you could be using to go and read this article.  There is a lot of good information in the text &#8211; it is short, punchy, and well worth your time.</p>
<p>Some additional discussion on the topic:<br />
<a href="http://andrewgelman.com/2011/09/the-difference-between-significant-and-not-significant/">http://andrewgelman.com/2011/09/the-difference-between-significant-and-not-significant/</a></p>
<p>Gelman A and Stern H. (2006). The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant. <em>The American Statistician</em> 60(4), 328-331.</p>
<p>Nieuwenhuis S, Forstmann BU, and Wagenmakers EJ. (2011). Erroneous analyses of interactions in neuroscience: a problem of significance.  <em>Nature Neuroscience</em> 14, 1105-1107.</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2011/09/significant-differences/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Neuromarketing Debate, May 23rd</title>
		<link>http://prefrontal.org/blog/2011/05/neuromarketing-debate-may-23rd/</link>
		<comments>http://prefrontal.org/blog/2011/05/neuromarketing-debate-may-23rd/#comments</comments>
		<pubDate>Sat, 21 May 2011 20:49:11 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[Miscellany]]></category>
		<category><![CDATA[Psychology]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=1255</guid>
		<description><![CDATA[Do you feel like neuromarketing is a disruptive new technology, or just another example of neurohype? Regardless of where you stand on the issue you might be interested in a debate I will be participating in next Monday, the 23rd of May, at Stanford Medical School. The Stanford Interdisciplinary Group on Neuroscience and Society (SIGNS) [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://prefrontal.org/blog/wp-content/uploads/2011/05/cajal.jpg" alt="" title="cajal" width="200" height="227" align="right">Do you feel like neuromarketing is a disruptive new technology, or just another example of neurohype?  Regardless of where you stand on the issue you might be interested in a debate I will be participating in next Monday, the 23rd of May, at Stanford Medical School.  </p>
<p>The Stanford Interdisciplinary Group on Neuroscience and Society (SIGNS) is hosting the debate, which is focused on neuroscience in the marketplace.  Jim Sullivan, the CEO of <a href="http://www.neurosky.com/">NeuroSky</a>, <a href="http://www.stanford.edu/~ukarma/">Uma Karmarkar</a> from the Stanford Graduate School of Business, and I will all weigh in on the topic of whether neuroscience is being used to manipulate consumers.</p>
<p>I think you might already know where I stand based on my &#8216;Seven Sins&#8217; neuromarketing post, but the event promises to be a lively affair with a diverse array of perspectives.  Come check it out if you are in the Bay Area next week!</p>
<p>Grab some <a href="http://www.law.stanford.edu/calendar/details/5714/Neuroscience%20in%20the%20Marketplace%3A%20A%20Debate/">details</a> on the event, or check out the event <a href="http://www.law.stanford.edu/display/images/dynamic/events_media/Neuromarketing%20Poster.pdf">poster</a> for more information.</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2011/05/neuromarketing-debate-may-23rd/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The Seven Sins of Neuromarketing</title>
		<link>http://prefrontal.org/blog/2011/04/the-seven-sins-of-neuromarketing/</link>
		<comments>http://prefrontal.org/blog/2011/04/the-seven-sins-of-neuromarketing/#comments</comments>
		<pubDate>Sat, 23 Apr 2011 00:59:44 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[MRI]]></category>
		<category><![CDATA[Psychology]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=1098</guid>
		<description><![CDATA[I got quoted in a random neuromarketing article recently. In the flurry of people I have been chatting with about statistics and functional neuroimaging I often neglect to ask what organizations people are associated with. In this case it was Forbes magazine. http://www.forbes.com/forbes/2009/1116/marketing-hyundai-neurofocus-brain-waves-battle-for-the-brain.html In the online version of the article there was a user comment [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://prefrontal.org/blog/wp-content/uploads/2011/04/NewOrleansSign.jpg" alt="" title="NewOrleansSign" width="198" height="278" align="right">I got quoted in a random neuromarketing article recently.  In the flurry of people I have been chatting with about statistics and functional neuroimaging I often neglect to ask what organizations people are associated with.  In this case it was Forbes magazine.</p>
<p><a href="http://www.forbes.com/forbes/2009/1116/marketing-hyundai-neurofocus-brain-waves-battle-for-the-brain.html">http://www.forbes.com/forbes/2009/1116/marketing-hyundai-neurofocus-brain-waves-battle-for-the-brain.html</a></p>
<p>In the online version of the article there was a user comment from a neuromarketing company CEO defending the honor of his business and the field in which they operate.  He went so far as to compare the launch of neuromarketing with the initial steps of market research in the early 20th century.  He further argued that neuromarketing would bring about the next revolution in understanding consumer behavior.</p>
<p>I have to admit, my gut reaction on first reading this statement was one of mild disgust.  This got me thinking about why neuromarketing hangs in a cloud of disdain among many scientists.  Below are some of the &#8216;sins&#8217; which I feel currently plague the field of neuromarketing.  This is all just my opinion of course, but I do think that it raises some interesting points for discussion.</p>
<p><strong>1) The curtain of proprietary analysis methods limits our knowledge of how effective neuromarketing can be.</strong></p>
<p>Neuromarketing seems to be primarily driven by private industry, not academia.  This is not to say that research into consumer behavior has not occurred at the university level.  There has been a lot of good neuroeconomics research in the last several years.  Still, it is mostly companies in private industry that are driving the application of these findings to practical consumer behaviors.  Because these companies are in competition with each other they are reluctant to give others the recipe to their secret analysis sauce.  From the outside this means that the analysis pipeline of every neuromarketing company is a black box, with data going in one end and the results-you-need coming out the other.</p>
<p>My colleagues and I hold the position that fMRI research utilizing incorrect statistics can generate a large number of false positives.  That is, many of the results will be there simply because of noise.  Because so much of the current neuromarketing data is hidden behind the veil of proprietary analysis methods it is impossible to judge how successful their methods actually are, and to what degree their findings are false positives.</p>
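<p>To get a feel for how noise alone can produce results, here is a back-of-the-envelope Python sketch.  The voxel count and threshold are hypothetical, the function names are mine, and the calculation assumes every test is independent, which real fMRI voxels are not; treat it as an illustration of scale rather than a model of any actual analysis pipeline.</p>

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing threshold when no real effect exists."""
    return n_tests * alpha


def familywise_error_rate(n_tests, alpha):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Hypothetical whole-brain analysis with an uncorrected threshold
n_voxels = 50_000
alpha = 0.001

false_hits = expected_false_positives(n_voxels, alpha)
fwer = familywise_error_rate(n_voxels, alpha)

# Bonferroni: the per-test threshold needed to hold the family-wise rate at 0.05
bonferroni_alpha = 0.05 / n_voxels

print(f"Expected false positives: {false_hits:.0f}")
print(f"Chance of at least one:   {fwer:.3f}")
print(f"Bonferroni threshold:     {bonferroni_alpha:.1e}")
```

<p>Under these assumptions you would expect about fifty voxels to pass threshold with no real signal present, which is exactly the kind of thing nobody outside the company can check when the methods are proprietary.</p>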
<p><strong>2) There is little peer-reviewed literature that is specific to neuromarketing.</strong></p>
<p>Neuromarketing is an emerging discipline that will, in time, give us new insight into human behavior. Unfortunately, little peer-reviewed research has currently been published in this area. Search for &#8216;<a href="http://www.ncbi.nlm.nih.gov/pubmed?term=neuromarketing">neuromarketing</a>&#8217; in the PubMed database of abstracts (www.pubmed.com) and you will find all of ten publications. This must change for neuromarketing to mature.</p>
<p>Again, without peer-reviewed results on the effectiveness of neuromarketing experiments all we have to rely on are self-reports from the neuromarketing firms themselves.  An issue similar to the <a href="http://en.wikipedia.org/wiki/Publication_bias">file-drawer problem</a> then exists.  The file-drawer problem is when only positive results get published in journals while negative results sit unpublished in the file drawer.  Neuromarketing companies will be likely to report positive results while negative results sit undistributed.  Either way, the end result is a biased understanding.</p>
<p><strong>3) Most people&#8217;s introduction to neuromarketing is through press releases, not peer-reviewed studies.</strong></p>
<p>In 2006 there was an &#8220;instant-science&#8221; <a href="http://www.edge.org/3rd_culture/iacoboni06/iacoboni06_index.html">article</a> released online by Marco Iacoboni et al. revealing their analysis of fMRI data obtained while subjects were watching Super Bowl advertisements.  The much-discussed post, entitled &#8220;Who Really Won the Super Bowl?&#8221;, tried to determine the most effective commercial by judging which one activated regions involved in reward and empathy to the greatest degree.  They determined that a commercial from Disney fared the best when evaluated by these measures.  Many neuroscientists shook their heads and moved on.</p>
<p>In 2007 the same group published an <a href="http://www.nytimes.com/2007/11/11/opinion/11freedman.html?pagewanted=1&#038;ei=5090&#038;en=e0ca987ad4bd515f&#038;ex=1352437200&#038;partner=rssuserland&#038;emc=rss">op-ed</a> in the New York Times detailing the results of scanning 20 individuals while looking at pictures and videos of leading political candidates.  They drew conclusions on candidate evaluations by examining activity in areas like the amygdala and anterior cingulate.  For example, they concluded that amygdala activity indicated a state of anxiety and cingulate activity indicated cognitive conflict.  These oversimplifications were so well publicized and widely distributed that a number of leading neuroscientists were compelled to publish a <a href="http://query.nytimes.com/gst/fullpage.html?res=9907E1D91E3CF937A25752C1A9619C8B63">letter</a> in the New York Times calling the Iacoboni results into question.</p>
<p>Let&#8217;s put it this way: when many of the top minds in neuroimaging feel compelled to assemble a letter to the New York Times regarding your non-peer-reviewed neuromarketing/neuropolitics results, the field has a problem.</p>
<p>There are a handful of peer-reviewed neuromarketing papers that do deliver.  One recent <a href="http://www.ncbi.nlm.nih.gov/pubmed/19874974">paper</a> by Michael Schaefer was a very interesting investigation into the representation of brand associations.  However, these types of studies are rare, and the signal-to-noise ratio of information in the press remains very low.</p>
<p><strong>4) Neuromarketing methods are not immune to subjectivity and bias.</strong></p>
<p>One of the most highly touted aspects of neuromarketing methods is that they are free from subjectivity and bias on the part of the participant.  For example, asking a subject what they thought of a particular brand introduces the muddying waters of conscious consideration.  The person&#8217;s response will be colored by a complex web of tangential cognitive factors and contextual biases.  The promise of neuromarketing is that you can bypass these confounding factors to get at the heart of the matter &#8211; the real representation of the brand.  While this is true to a degree, an entirely new set of confounding factors is introduced during the analysis of neuromarketing data.</p>
<p>While many neuromarketing measures are indeed more objective than verbal reports, I must disagree with the observation that they are unfiltered, true reports of the underlying representation. While the signals are not filtered by the consciousness of the research subject, a great deal of manipulation and filtering of the data is done by the researcher.  This does introduce the potential for bias, simply by a different avenue.</p>
<p>Small changes in processing pipelines can have a huge impact on the power of fMRI to detect relevant signals.  Some excellent <a href="http://www.ncbi.nlm.nih.gov/pubmed?term=Strother%20SC%5BAuthor%5D">papers</a> by Stephen Strother come to mind with regard to this point.  With no knowledge of what is going on we have no idea how objective the analyses by these companies can be.</p>
<p><strong>5) The value per dollar of neuromarketing methods has yet to be determined.</strong></p>
<p>Neuromarketing studies are expensive.  The Forbes article says that an average EEG or fMRI marketing study costs in the neighborhood of $50,000.  Immediately this number can trigger a &#8216;more expensive = better&#8217; response, especially if you have a large budget to support such studies.  What rarely gets discussed is what kind of value you obtain in return for the huge amount of money that is spent.</p>
<p>The key question in neuromarketing is what information you can get with EEG / fMRI / eye tracking / biometrics that you cannot obtain using other methods.  If I can spend $1000 to do a traditional market study that gets me 85% of what a $50,000 fMRI study does then the return on my neuromarketing investment is not great.  Thinking about it another way, how much less or more could I get across 50 traditional studies relative to the value of one neuromarketing study?</p>
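<p>The arithmetic behind that comparison is easy to sketch in Python.  The dollar figures and the 85% come from the hypothetical above; treating &#8216;value&#8217; as a single number that can be divided by cost is a simplifying assumption of mine, as are the variable names.</p>

```python
# Figures from the hypothetical comparison in the text
traditional_cost = 1_000   # dollars per traditional market study
fmri_cost = 50_000         # dollars per fMRI marketing study
traditional_value = 0.85   # fraction of the fMRI study's value (assumed)
fmri_value = 1.0           # the fMRI study is the reference point

# Value obtained per dollar spent on each approach
per_dollar_traditional = traditional_value / traditional_cost
per_dollar_fmri = fmri_value / fmri_cost
ratio = per_dollar_traditional / per_dollar_fmri

# How many traditional studies fit in one fMRI budget
studies_for_same_budget = fmri_cost // traditional_cost

print(f"Value-per-dollar advantage of the traditional study: {ratio:.1f}x")
print(f"Traditional studies per fMRI budget: {studies_for_same_budget}")
```

<p>On those assumed numbers the traditional study delivers far more per dollar, so the neuromarketing study has to offer something the other 50 studies cannot.</p>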
<p>Many companies are not limited by the extreme cost of neuromarketing studies, and a significant fraction of them are not afraid to take the risk to try something new.  Perhaps part of the motivation is also the fear of being left behind &#8211; that a competitor will take the risk and gain a competitive advantage in consumer understanding.  Whatever the motivation, there will always be a market for neuromarketing methods.  Still, we must acknowledge that the value of neuromarketing is an open question.</p>
<p><strong>6) People are rushing the field to make a quick buck, and not everyone is trustworthy.</strong></p>
<p>The emergence of neuromarketing represents a modern day gold rush in terms of buzz and promises.  Brilliant researchers will be attracted to this opportunity and will significantly advance the field of neuromarketing.  Morally questionable individuals will also be drawn to the opportunity, and will end up giving the field a black eye.  Reputations will build up over time and trustworthy companies will emerge from the fray, but the current situation is more akin to the wild west than a civilized exchange.</p>
<p><strong>7) The true value of neuromarketing is obscured by the above-mentioned problems.</strong></p>
<p>I thought I would end on a high note.  There is certainly significant value to using neuromarketing methods in consumer research.  Why else would companies like Nielsen Holdings be <a href="http://www.adweek.com/news/television/nielsen-buys-neurofocus-94865">investing</a> in neuromarketing firms like NeuroFocus?  One of the biggest problems is that the true value of these methods is obscured by those who treat it as a gimmick and have the loudest voice.  The next ten years will represent a true shakedown of the neuromarketing industry.  Companies that are able to provide real value to their customers will live on while those who simply seek to make pretty pictures will fall by the wayside.  It will be a fascinating time to be an observer of the business and politics in this emerging field.</p>
<p><strong>Conclusions.</strong></p>
<p>The above points ignore many other issues facing neuromarketing.  I have completely bypassed a discussion of the ethics of neuromarketing.  Many people worry that technologies like fMRI will help marketers find the &#8216;buy button&#8217; in the brain, stripping away people&#8217;s free will in product choice.  I am not terribly worried about that discussion, perhaps because I am ignoring the problem or perhaps because I know too much about brain function or neuroimaging methods.  Regardless, there are other issues and hurdles that neuromarketing must address to grow as a field.</p>
<p>In the end I do wish neuromarketing great success. I simply fear that those individuals who are seeking to profit from its popularity will tarnish the reputation of neuromarketing before it is able to legitimize itself.</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2011/04/the-seven-sins-of-neuromarketing/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>PAPER: An Argument For Proper Multiple Comparisons Correction</title>
		<link>http://prefrontal.org/blog/2010/11/paper-an-argument-for-proper-multiple-comparisons-correction/</link>
		<comments>http://prefrontal.org/blog/2010/11/paper-an-argument-for-proper-multiple-comparisons-correction/#comments</comments>
		<pubDate>Wed, 03 Nov 2010 22:39:49 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[MRI]]></category>
		<category><![CDATA[Psychology]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=1187</guid>
		<description><![CDATA[It has been a long road, but our multiple comparisons paper including the salmon has been published. See below for more details, including the abstract and a link to the download page of the journal. If you have any questions or comments please post them below or send me an email directly. - &#8211; - [...]]]></description>
			<content:encoded><![CDATA[<p>It has been a long road, but our multiple comparisons paper including the salmon has been published.  See below for more details, including the abstract and a link to the download page of the journal.  If you have any questions or comments please post them below or send me an email directly.</p>
<p><center>- &#8211; - &#8211; - &#8211; - &#8211; - &#8211; - &#8211; - &#8211; - &#8211; - &#8211; - &#8211; - &#8211; - &#8211; - &#8211; -</center></p>
<p><strong>Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic Salmon: An Argument For Proper Multiple Comparisons Correction</strong></p>
<p>Craig M. Bennett(1), Abigail A. Baird(2), Michael B. Miller(1) and George L. Wolford(3)<br />
1) Department of Psychology, University of California at Santa Barbara, Santa Barbara, CA 93106<br />
2) Department of Psychology, Blodgett Hall, Vassar College, Poughkeepsie, NY 12604<br />
3) Department of Psychological and Brain Sciences, Moore Hall, Dartmouth College, Hanover, NH 03755</p>
<p><a href="http://www.jsur.org/v1n1p1">Journal of Serendipitous and Unexpected Results</a>, 2010. 1(1):1-5<br />
Early Access: Oct 20, 2010</p>
<p>With the extreme dimensionality of functional neuroimaging data comes extreme risk for false positives. Across the 130,000 voxels in a typical fMRI volume the probability of at least one false positive is almost certain. Proper correction for multiple comparisons should be completed during the analysis of these datasets, but is often ignored by investigators. To highlight the danger of this practice we completed an fMRI scanning session with a post-mortem Atlantic Salmon as the subject. The salmon was shown the same social perspective-taking task that was later administered to a group of human subjects. Statistics that were uncorrected for multiple comparisons showed active voxel clusters in the salmon’s brain cavity and spinal column. Statistics controlling for the family-wise error rate (FWER) and false discovery rate (FDR) both indicated that no active voxels were present, even at relaxed statistical thresholds. We argue that relying on standard statistical thresholds (p < 0.001) and low minimum cluster sizes (k > 8) is an ineffective control for multiple comparisons. We further argue that the vast majority of fMRI studies should be utilizing proper multiple comparisons correction as standard practice when thresholding their data.</p>
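<p>As a rough back-of-the-envelope illustration of the first two sentences of the abstract (this sketch is not taken from the paper), the arithmetic can be written out as below.  It assumes that every voxel test is independent, which real fMRI data are not, so the numbers are only indicative.  The variable names are illustrative, and the Bonferroni line is just one simple family-wise correction, not necessarily the one used in the paper.</p>

```python
# Assumption: voxel-wise tests are treated as independent (a simplification).
n_voxels = 130_000   # voxel count quoted in the abstract
alpha = 0.001        # uncorrected per-voxel threshold quoted in the abstract

# Expected number of voxels passing the threshold by chance alone.
expected_false_positives = n_voxels * alpha

# Probability that at least one voxel passes by chance.
p_at_least_one = 1.0 - (1.0 - alpha) ** n_voxels

# Per-voxel threshold that would hold the family-wise error rate at 0.05
# under a simple Bonferroni correction.
bonferroni_alpha = 0.05 / n_voxels

print("expected false positives:", expected_false_positives)
print("P(at least one false positive):", p_at_least_one)
print("Bonferroni per-voxel threshold:", bonferroni_alpha)
```

<p>Under these assumptions roughly 130 voxels would be expected to pass the uncorrected threshold by chance, which is why a handful of "active" voxels in a dead fish is not surprising.</p>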
<p>See the JSUR early access page to download the article and supplementary material.<br />
<a href="http://www.jsur.org/v1n1p1">http://www.jsur.org/v1n1p1</a></p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2010/11/paper-an-argument-for-proper-multiple-comparisons-correction/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Riverside Presentation Slides</title>
		<link>http://prefrontal.org/blog/2010/10/riverside-presentation-slides/</link>
		<comments>http://prefrontal.org/blog/2010/10/riverside-presentation-slides/#comments</comments>
		<pubDate>Thu, 28 Oct 2010 07:28:14 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[Miscellany]]></category>
		<category><![CDATA[MRI]]></category>
		<category><![CDATA[Psychology]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=1178</guid>
		<description><![CDATA[Just wanted to take a second to thank the kind folks in the Psychology Department at UC Riverside for hosting me this afternoon. I gave a neuroimaging stats talk for their cognitive brown bag series, and it was a really great time! For anyone who is interested a copy of the slides from my presentation [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://prefrontal.org/blog/wp-content/uploads/2010/10/Riverside.jpg" alt="" title="Riverside" width="160" height="154" align='right'>Just wanted to take a second to thank the kind folks in the Psychology Department at UC Riverside for hosting me this afternoon.  I gave a neuroimaging stats talk for their cognitive brown bag series, and it was a really great time!</p>
<p>For anyone who is interested, a copy of the slides from my presentation can be downloaded at the link below.  If you have any questions or comments feel free to email me &#8211; I would love to chat more.  Take care, UCR!</p>
<p><a href="http://prefrontal.org/files/presentations/Bennett-Riverside-2010.pdf">http://prefrontal.org/files/presentations/Bennett-Riverside-2010.pdf</a></p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2010/10/riverside-presentation-slides/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Spring 2010 Conference Posters</title>
		<link>http://prefrontal.org/blog/2010/10/spring-2010-conference-posters/</link>
		<comments>http://prefrontal.org/blog/2010/10/spring-2010-conference-posters/#comments</comments>
		<pubDate>Wed, 20 Oct 2010 23:26:13 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[Interoception]]></category>
		<category><![CDATA[MRI]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=1151</guid>
		<description><![CDATA[I have been remiss in uploading copies of my spring conference posters. October seems like a fine month to rectify that. Below are links to the research I presented at the Cognitive Neuroscience Society meeting in Montreal and at the Organization for Human Brain Mapping meeting in Barcelona. Both meetings were fantastic &#8211; I got [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://prefrontal.org/blog/wp-content/uploads/2010/10/dome.jpg" alt="" title="dome" width="430" height="236" align='center'></p>
<p>I have been remiss in uploading copies of my spring conference posters.  October seems like a fine month to rectify that.  Below are links to the research I presented at the Cognitive Neuroscience Society meeting in Montreal and at the Organization for Human Brain Mapping meeting in Barcelona.  Both meetings were fantastic &#8211; I got to meet a lot of new people and experience all the awesomeness that Montreal and Barcelona have to offer.  </p>
<p><strong>* How reliable are the results from fMRI?</strong><br />
Conference Poster: [<a href="http://prefrontal.org/files/posters/Bennett-Reliability-2010.pdf">PDF</a>] [<a href="http://prefrontal.org/files/posters/Bennett-Reliability-2010.jpg">JPEG</a>]</p>
<p><strong>* A device for the parametric application of thermal and<br />
tactile stimulation during fMRI</strong><br />
Conference Poster: [<a href="http://prefrontal.org/files/posters/Bennett-ThermotactileDevice-2010.pdf">PDF</a>] [<a href="http://prefrontal.org/files/posters/Bennett-ThermotactileDevice-2010.jpg">JPEG</a>]</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2010/10/spring-2010-conference-posters/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>APS Conference &#8211; Presentation Slides</title>
		<link>http://prefrontal.org/blog/2010/05/aps-conference-presentation-slides/</link>
		<comments>http://prefrontal.org/blog/2010/05/aps-conference-presentation-slides/#comments</comments>
		<pubDate>Sat, 29 May 2010 19:34:17 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[Emotion]]></category>
		<category><![CDATA[Interoception]]></category>
		<category><![CDATA[Psychology]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=1136</guid>
		<description><![CDATA[I have wanted to attend the Association for Psychological Science annual convention for a number of years, but I was always frustrated by the number of other conferences I had to attend during the spring. All that changed early this year when I was offered the opportunity to give a presentation on interoceptive development. I [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://prefrontal.org/blog/wp-content/uploads/2010/05/aps.png" alt="" title="aps" width="135" height="83" align="right">I have wanted to attend the Association for Psychological Science annual convention for a number of years, but I was always frustrated by the number of other conferences I had to attend during the spring.  All that changed early this year when I was offered the opportunity to give a presentation on interoceptive development.  I suddenly had a very good reason to free up some time and hop on a plane!</p>
<p>I want to thank everyone who attended my address this morning.  After untold amounts of airline trouble getting to Boston it was a real pleasure to have the chance to talk about the insula and interoceptive development.</p>
<p>If you are interested you can download a copy of my presentation slides <a href="http://prefrontal.org/files/presentations/Bennett-APS-2010.pdf">here</a>.  Send me an email if you have any questions or comments. Thanks!</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2010/05/aps-conference-presentation-slides/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PAPER: How reliable are the results from functional magnetic resonance imaging?</title>
		<link>http://prefrontal.org/blog/2010/02/paper-how-reliable-are-the-results-from-functional-magnetic-resonance-imaging/</link>
		<comments>http://prefrontal.org/blog/2010/02/paper-how-reliable-are-the-results-from-functional-magnetic-resonance-imaging/#comments</comments>
		<pubDate>Fri, 26 Feb 2010 05:02:51 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[MRI]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=893</guid>
		<description><![CDATA[- Current Citation: Bennett CM, Miller MB. (in press). How reliable are the results from functional magnetic resonance imaging? Annals of the New York Academy of Sciences. - Abstract: Functional magnetic resonance imaging is one of the most important methods for in vivo investigation of cognitive processes in the human brain. Within the last two [...]]]></description>
			<content:encoded><![CDATA[<p><strong>- Current Citation:</strong><br />
Bennett CM, Miller MB. (in press). How reliable are the results from functional magnetic resonance imaging?  <em>Annals of the New York Academy of Sciences.</em></p>
<p><strong>- Abstract:</strong><br />
Functional magnetic resonance imaging is one of the most important methods for in vivo investigation of cognitive processes in the human brain.  Within the last two decades an explosion of research has emerged using fMRI, revealing the underpinnings of everything from motor and sensory processes to the foundations of social cognition.  While these results have revealed the potential of neuroimaging, important questions regarding the reliability of these results remain unanswered.  In this chapter we take a close look at what is currently known about the reliability of fMRI findings.  First, we examine the many factors that influence the quality of acquired fMRI data.  We also conduct a review of the existing literature to determine if some measure of agreement has emerged regarding the reliability of fMRI.  Finally, we provide commentary on ways to improve fMRI reliability and what questions remain unanswered.  Reliability is the foundation on which scientific investigation is based.  How reliable are the results from fMRI?</p>
<p><strong>- Downloadable Versions:</strong><br />
[<a href="http://prefrontal.org/files/papers/Bennett-NYAS-2010.pdf">Manuscript PDF</a>]<br />
[<a href="http://www3.interscience.wiley.com/journal/123326312/abstract">Link to Journal PDF</a>]</p>
<p><span id="more-893"></span><br />
<strong>- Full Text:</strong></p>
<p>Reliability is the cornerstone of any scientific enterprise. Issues of research validity and significance are relatively meaningless if the results of our experiments are not trustworthy.  Reliability can vary greatly depending on the tools being used and what is being measured. Therefore, it is imperative that researchers in any scientific endeavor be aware of the reliability of their measurements.</p>
<p>Surprisingly, most fMRI researchers have only a vague idea of how reliable their results are.  Reliability is not a typical topic of conversation between most investigators and only a small fraction of papers investigating fMRI reliability have been published.  This became an important issue in 2009 as a paper by Vul, Harris, Winkielman, and Pashler set the stage for debate (2009).  Their paper, originally entitled “Voodoo Correlations in Social Neuroscience”, was focused on a statistical problem known as the ‘non-independence error’.  Critical to their argument was the reliability of functional imaging results.  Vul et al. argued that test-retest variability of fMRI results placed an ‘upper bound’ on the strength of possible correlations between fMRI data and behavioral measures:</p>
<p><center><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Eq1.png" alt="" title="Bennett-Reliability-Eq1" width="390" height="23" class="alignnone size-full wp-image-1052" /></center></p>
<p>This calculation reflects that the strength of a correlation between two measures is a product of the measured relationship and the reliability of the measurements (Nunnally, 1970; Vul et al., 2009).  Vul et al. specified that behavioral measures of personality and emotion have a reliability of around 0.8 and that fMRI results have a reliability of around 0.7.  Not everyone agreed.  Across several written exchanges multiple research groups debated what the “actual reliability” of fMRI was.  Jabbi et al. stated that the reliability of fMRI could be as high as 0.98 (2009).  Lieberman et al. split the difference and argued that fMRI reliability was likely around 0.90 (2009).  While much ink was spilled debating the reliability of fMRI results, very little consensus was reached regarding an appropriate approximation of its value.</p>
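<p>To make the bound concrete, here is a minimal sketch.  It assumes the equation above is the standard attenuation formula from classical test theory, in which the observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities.  The function name is hypothetical, and the reliability values are simply the ones quoted in the text.</p>

```python
import math

def max_observable_correlation(reliability_x, reliability_y, true_correlation=1.0):
    """Upper bound on an observed correlation given two measurement reliabilities.

    Assumes the classical attenuation formula:
    observed = true * sqrt(reliability_x * reliability_y).
    """
    return true_correlation * math.sqrt(reliability_x * reliability_y)

behavioral_reliability = 0.8  # value attributed to Vul et al. in the text

# fMRI reliability estimates quoted in the text (Vul, Lieberman, Jabbi).
for fmri_reliability in (0.7, 0.90, 0.98):
    bound = max_observable_correlation(behavioral_reliability, fmri_reliability)
    print(fmri_reliability, round(bound, 3))
```

<p>With reliabilities of 0.8 and 0.7 the ceiling is roughly 0.75, and it rises to roughly 0.89 if fMRI reliability is taken to be 0.98, which shows why the disagreement over the "actual reliability" mattered to the debate.</p>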
<p>The difficulty of detecting signal (what we are trying to measure) from amongst a sea of noise (everything else we don’t care about) is a constant struggle for all scientists.  It influences what effects can be examined and is directly tied to the reliability of research results.  What follows in this chapter is a multifaceted examination of fMRI reliability.  We examine why reliability is a critical metric of fMRI data, discuss what factors influence the quality of the blood oxygen level dependent (BOLD) signal, and investigate the existing reliability literature to determine if some measure of agreement has emerged across studies.  Fundamentally, there is one critical question that this chapter seeks to address: if you repeat your fMRI experiment, what is the likelihood you will get the same result?</p>
<p><b>Pragmatics of Reliability</b></p>
<p>Why worry about reliability at all?  As long as investigators are following accepted statistical practices and being conservative in the generation of their results, why should the field be bothered with how reproducible the results might be?  There are at least four primary reasons why test-retest reliability should be a concern for all fMRI researchers.</p>
<p><u>Scientific truth.</u>  While it is a simple statement that can be taken straight out of an undergraduate research methods course, an important point must be made about reliability in research studies: it is the foundation on which scientific knowledge is based.  Without reliable, reproducible results no study can effectively contribute to scientific knowledge.  After all, if a researcher obtains a different set of results today than they did yesterday, what has really been discovered?  To ensure the long-term success of functional neuroimaging it is critical to investigate the many sources of variability that impact reliability.  It is a strong statement, but if results do not generalize from one set of subjects to another or from one scanner to another then the findings are of little value scientifically.</p>
<p><u>Clinical and Diagnostic Applications.</u>  The longitudinal assessment of changes in regional brain activity is becoming increasingly important for the diagnosis and treatment of clinical disorders.  One potential use of fMRI is for the localization of specific cognitive functions before surgery.  A good example is the localization of language function prior to tissue resection for epilepsy treatment (Fernandez et al., 2003).  This is truly a case where an investigator does not want a slightly different result each time they conduct the scan.  If fMRI is to be used for surgical planning or clinical diagnostics then any issues of reliability must be quantified and addressed.  </p>
<p><u>Evidentiary Applications.</u>  The results from functional imaging are increasingly being submitted as evidence into the United States legal system.  For example, results from a commercial company called No Lie MRI (San Diego, CA; http://www.noliemri.com/) were introduced into a juvenile sex abuse case in San Diego during the spring of 2009.  The defense was attempting to introduce the fMRI results as scientific justification of their client’s claim of innocence.  A concerted effort from imaging scientists, including in-person testimony from Marc Raichle, eventually forced the defense to withdraw the request.  While the fMRI results never made it into this case, it is clear that fMRI evidence will be increasingly common in the courtroom.  What are the larger implications if this evidence is not as reliable as we assume?</p>
<p><u>Scientific Collaboration.</u>  A final pragmatic dimension of fMRI reliability is the ability to share data between researchers.  This is already a difficult challenge, as each scanner has its own unique sources of error that become part of the data (Jovicich et al., 2006).  Early evidence has indicated that the results from a standard cognitive task can be quite similar across scanners (Casey et al., 1998; Friedman et al., 2008).  Still, concordance of results remains an issue that must be addressed for large-scale, collaborative inter-center investigations. The ultimate level of reliability is the reproducibility of results from any equivalent scanner around the world and the ability to integrate this data into larger investigations.</p>
<p><center><br />
<h4>- What Factors Influence fMRI Reliability? -</h4>
<p></center></p>
<p>The ability of fMRI to detect meaningful signals is limited by a number of factors that add error to each measurement.  Some of these factors include thermal noise, system noise in the scanner, physiological noise from the subject, non-task related cognitive processes, and changes in cognitive strategy over time (Huettel et al., 2008; Kruger and Glover, 2001).  The concept of reliability is, at its core, a representation of the ability to routinely detect relevant signals from this background of meaningless noise.  If a voxel timeseries contains a large amount of signal then the primary sources of variability are actual changes in blood flow related to neural activity within the brain.  Conversely, in a voxel containing a large amount of noise the measurements are dominated by error and would not contain meaningful information.  By increasing the amount of signal, or decreasing the amount of noise, a researcher can effectively increase the quality and reliability of acquired data.  </p>
<p>The quality of data in magnetic resonance imaging is typically measured using the signal-to-noise ratio (SNR) of the acquired images.  The goal is to maximize this ratio.  Two kinds of SNR are important for functional MRI.  The first is the image SNR.  It is related to the quality of data acquired in a single fMRI volume. Image SNR is typically computed as the mean signal value of all voxels divided by the standard deviation of all voxels in a single image:</p>
<p><center><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Eq2.png" alt="" title="Bennett-Reliability-Eq2" width="173" height="26" class="alignnone size-full wp-image-1054" /></center></p>
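<p>The verbal definition above can be sketched in a few lines.  This is a simplified illustration: the volume is flattened to a plain list of intensities, the function name is hypothetical, and the population standard deviation is assumed because the text does not say which estimator is intended.</p>

```python
import statistics

def image_snr(voxel_values):
    """Mean voxel intensity divided by the standard deviation across voxels.

    Assumption: population standard deviation over one flattened volume.
    """
    return statistics.mean(voxel_values) / statistics.pstdev(voxel_values)

# Toy "volume" of five voxel intensities (made-up numbers for illustration).
volume = [100.0, 102.0, 98.0, 101.0, 99.0]
print(round(image_snr(volume), 2))
```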
<p>Increasing the image SNR will improve the quality of data at a single point in time.  However, most important for functional neuroimaging is the amount of signal present in the data across time.  This makes the temporal SNR (tSNR) perhaps the most important metric of data for functional MRI.  It represents the signal-to-noise ratio of the timeseries at each voxel:</p>
<p><center><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Eq3.png" alt="" title="Bennett-Reliability-Eq3" width="220" height="26" class="alignnone size-full wp-image-1055" /></center></p>
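<p>A comparable sketch for tSNR, under the assumption that it is computed per voxel as the mean of that voxel's timeseries divided by the standard deviation of the same timeseries.  The voxel labels, the intensities, and the function names are made up for illustration.</p>

```python
import statistics

def temporal_snr(timeseries):
    """Mean of one voxel's timeseries divided by its standard deviation over time."""
    return statistics.mean(timeseries) / statistics.pstdev(timeseries)

def tsnr_map(voxel_timeseries):
    """Compute tSNR separately for every voxel in a {label: timeseries} mapping."""
    return {voxel: temporal_snr(ts) for voxel, ts in voxel_timeseries.items()}

# Two hypothetical voxels sampled at five time points each.
data = {
    "voxel_a": [500.0, 505.0, 495.0, 502.0, 498.0],
    "voxel_b": [400.0, 401.0, 399.0, 400.5, 399.5],
}

for voxel, value in tsnr_map(data).items():
    print(voxel, round(value, 1))
```

<p>Here the second voxel has the lower mean intensity but the higher tSNR, because its signal fluctuates far less over time.</p>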
<p>The tSNR is not the same across all voxels in the brain.  Some regions will have higher or lower tSNR depending on location and constitution.  For example, there are documented differences in tSNR between gray matter and white matter (Bodurka et al., 2005).  The typical tSNR of fMRI can also vary depending on the same factors that influence image SNR.  </p>
<p>Another metric of data quality is the contrast-to-noise ratio (CNR).  This refers to the size of the differences in signal intensity between different areas in an image (image CNR) or between different points in time (temporal CNR), relative to the noise.  With regard to functional neuroimaging, the temporal CNR represents the maximum relative difference in signal intensity that is represented within a single voxel.  In a voxel with low CNR there would be very little difference between two conditions of interest.  Conversely, in a voxel with high CNR there would be relatively large differences between two conditions of interest.  The image CNR is not critical to fMRI, but having a high temporal CNR is very important for detecting task effects.</p>
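<p>The text does not give a formula for temporal CNR, and definitions vary across the literature.  One common form, assumed here purely for illustration, divides the difference between the mean signal in two conditions by the standard deviation of the noise, estimated in this sketch from the baseline condition.  The function name and the numbers are hypothetical.</p>

```python
import statistics

def temporal_cnr(task_values, rest_values):
    """Difference between condition means divided by the noise standard deviation.

    Assumption: the noise is estimated from the rest (baseline) samples.
    """
    contrast = abs(statistics.mean(task_values) - statistics.mean(rest_values))
    noise = statistics.pstdev(rest_values)
    return contrast / noise

# Made-up samples from a single voxel in two conditions.
rest = [100.0, 101.0, 99.0, 100.0, 100.0]
task = [102.0, 103.0, 101.0, 102.0, 102.0]
print(round(temporal_cnr(task, rest), 2))
```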
<p>It is generally accepted that fMRI is a rather noisy measurement with a characteristically low tSNR, requiring extensive signal averaging to achieve effective signal detection (Murphy et al., 2007).  The following sections provide greater detail on the influence of specific factors on the SNR/tSNR of functional MRI data.  We break these factors down by the influence of differences in image acquisition, the image analysis pipeline, and the contribution of the subjects themselves.</p>
<p><b>SNR influences of MRI acquisition</b></p>
<p>The typical high-field MRI scanner is a precision superconducting device constructed to very exact manufacturing tolerances.  Still, the images it produces can be somewhat variable depending on a number of hardware and software variables.  With regard to hardware, one well-known influence on the signal to noise ratio of MRI is the strength of the primary B0 magnetic field (Bandettini et al., 1994; Ogawa et al., 1993).  Doubling this field, such as moving from 1.5 Tesla to a 3.0 Tesla field strength, can theoretically double the SNR of the data.  The B0 field strength is especially important for fMRI, which relies on magnetic susceptibility effects to create the blood oxygen level dependent (BOLD) signal (Turner et al., 1993).  Hoenig et al. showed that, relative to a 1.5 Tesla magnet, a 3.0 Tesla fMRI acquisition had 60-80% more significant voxels (2005).  They also demonstrated that the CNR of the results was 1.3 times higher than those obtained at 1.5 Tesla.  The strength and slew rate of the gradient magnets can have a similar impact on SNR.  Advances in head coil design are also notable, as parallel acquisition head coils have increased radiofrequency reception sensitivity.</p>
<p>It is important to note that there are negative aspects of higher field strength as well.  Artifacts due to physiological effects and susceptibility are all increasingly pronounced at higher fields.  The increased contribution of physiological noise reduces the expected gains in SNR at high field (Kruger and Glover, 2001).  The increasing contribution of susceptibility artifacts can virtually wipe out areas of orbital prefrontal cortex and inferior temporal cortex (Jezzard and Clare, 1999).  Also, in terms of tSNR there are diminishing returns with each step up in B0 field strength.  At typical fMRI spatial resolution values tSNR approaches an asymptotic limit between 3 Tesla and 7 Tesla (Kruger and Glover, 2001; Triantafyllou et al., 2005).</p>
<p>Looking beyond the scanner hardware, the parameters of the fMRI acquisition can also have a significant impact on the SNR/CNR of the final images.  For example, small changes in the voxel size of a sequence can dramatically alter the final SNR.  Moving from 1.5 mm isotropic voxels to 3.0 mm isotropic voxels increases the voxel volume eightfold and can potentially increase the acquisition SNR by a factor of eight, but at a cost of spatial resolution.  Some other acquisition variables that will influence the acquired SNR/CNR are repetition time (TR), echo time (TE), bandwidth, slice gap, and k-space trajectory.  For example, Moser et al. found that optimizing the flip angle of their acquisition could approximately double the SNR of their data in a visual stimulation task (1996).  Further, the effect of each parameter varies according to the field strength of the magnet (Triantafyllou et al., 2005).  The optimal parameter set for a 3 Tesla system may not be optimal with a 7 Tesla system.</p>
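<p>The factor of eight follows from voxel volume.  The sketch below assumes that the voxel sizes quoted above are edge lengths of isotropic voxels and that acquisition SNR scales linearly with voxel volume, with everything else held constant.  The function name is hypothetical.</p>

```python
def snr_gain_from_voxel_size(old_edge_mm, new_edge_mm):
    """Relative SNR change when the edge length of an isotropic voxel changes.

    Assumption: SNR is proportional to voxel volume, all else being equal.
    """
    return (new_edge_mm / old_edge_mm) ** 3

print(snr_gain_from_voxel_size(1.5, 3.0))  # 8.0
```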
<p>The ugly truth is that any number of factors in the control room or magnet suite can increase noise in the images.  A famous example from one imaging center was when the broken filament from a light bulb in a distant corner of the magnet suite started causing visible sinusoidal striations in the acquired EPI images.  This is an extreme example, but it makes the point that the scanner is a precision device that is designed to operate in a narrow set of well-defined circumstances.  Any deviation from those circumstances will increase noise, thereby reducing SNR and reliability.</p>
<p><b>SNR considerations of analysis methods</b></p>
<p>The methods used to analyze fMRI data will affect the reliability of the final results.  In particular, those steps taken to reduce known sources of error are critical to increasing the final SNR/CNR of preprocessed images.  For example, spatial realignment of the EPI data can have a dramatic effect on lowering movement-related variance and has become a standard part of fMRI preprocessing (Oakes et al., 2005; Zhilkin and Alexander, 2004).  Recent algorithms can also help remove remaining signal variability due to magnetic susceptibility induced by movement (Andersson et al., 2001).  Temporal filtering of the EPI timeseries can remove undesired noise in specific frequency bands.  The use of a high-pass filter is a common method to remove low-frequency noise, such as signal drift due to the scanner (Kiebel and Holmes, 2007).  Spatial smoothing of the data can also improve the SNR/CNR of an image.  There is some measure of random noise added to the true signal of each voxel during acquisition.  Smoothing across voxels can help to average out error across the area of the smoothing filter (Mikl et al., 2008).  It can also help account for local differences in anatomy across subjects.  Smoothing is most often done using a Gaussian kernel of approximately 6-12 mm FWHM.</p>
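<p>As a minimal sketch of the smoothing step, the code below converts a kernel width given as a full width at half maximum (taken here to be a length in mm) into a Gaussian standard deviation and applies the kernel along one dimension.  Real pipelines smooth in three dimensions using dedicated software; the function names, the 3 mm voxel size, the edge handling, and the sample values are all assumptions made for illustration.</p>

```python
import math

def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian FWHM to its standard deviation (same units)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gaussian_kernel(fwhm_mm, voxel_size_mm):
    """Normalized 1D Gaussian weights, truncated at about three sigma."""
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_size_mm
    radius = max(1, math.ceil(3 * sigma_vox))
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_1d(values, kernel):
    """Weighted average of each voxel with its neighbours along one axis."""
    radius = len(kernel) // 2
    last = len(values) - 1
    smoothed = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # repeat edge values at borders
            acc += w * values[j]
        smoothed.append(acc)
    return smoothed

# One row of made-up voxel intensities, 3 mm voxels, 8 mm FWHM kernel.
row = [100.0, 104.0, 96.0, 103.0, 97.0, 101.0, 99.0, 102.0, 98.0, 100.0]
kernel = gaussian_kernel(fwhm_mm=8.0, voxel_size_mm=3.0)
print([round(v, 2) for v in smooth_1d(row, kernel)])
```

<p>Because every output value is a weighted average of its neighbours, voxel-to-voxel fluctuations shrink, at the cost of blurring any genuinely small features.</p>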
<p>There has been some degree of standardization regarding preprocessing and statistical approaches in fMRI.  For instance, Mumford and Nichols found that approximately 92% of group fMRI results were computed using an ordinary least squares (OLS) estimation of the general linear model (2009).  Comparison studies with carefully standardized processing procedures have shown that the output of standard software packages can be very similar (Gold et al., 1998; Morgan et al., 2007).  However, in actual practice the diversity of tools and approaches in fMRI increases the variability between sets of results.  The functional imaging analysis contest (FIAC) in 2005 demonstrated that prominent differences existed between fMRI results generated by different groups using the same original dataset.  On reviewing the results the organizers concluded that brain regions exhibiting robust signal changes could be quite similar across analysis techniques, but the detection of areas with lower signal was highly variable (Poline et al., 2006).  It remains the case that decisions made by the researcher regarding how to analyze the data will impact what results are found.</p>
<p>Strother et al. have done a great deal of research into the influence of image processing pipelines using a predictive modeling framework (2004; 2002; Zhang et al., 2009).  They found that small changes in the processing pipeline of fMRI images have a dramatic impact on the final statistics derived from that data.  Some steps, such as slice timing correction, were found to have little influence on the results from experiments with a block design.  This is logical, given the relative insensitivity of block designs to small temporal shifts.  However, the steps of motion correction, high-pass filtering, and spatial smoothing were found to significantly improve the analysis.  They reported that the optimization of preprocessing pipelines improved both intra-subject and between-subject reproducibility of results (Zhang et al., 2009).   Identifying an optimal set of processing steps and parameters can dramatically improve the sensitivity of an analysis.</p>
<p><b>SNR influences of participants</b></p>
<p>The MRI system and fMRI analysis methods have received a great deal of attention with regard to SNR.  However, one area that may have the greatest contribution to fMRI reliability is how stable/unstable the patterns of activity within a single subject can be.  After all, a test-retest methodology involving human beings is akin to hitting a moving target.  Any discussion of test-retest reliability in fMRI has to take into consideration the fact that the cognitive state of a subject is variable over time.</p>
<p>There are two important ways that a subject can influence reliability within a test-retest experimental design.  The first involves within-subject changes that take place over the course of a single session.  For instance, differences in attention and arousal can significantly modulate subsequent responses to sensory stimulation (Munneke et al., 2008; Peyron et al., 1999; Sterr et al., 2007).  Variability can also be caused by evolving changes in cognitive strategy used during tasks like episodic retrieval (Miller et al., 2001; Miller et al., 2002).  If a subject spontaneously shifts to a new decision criterion midway during a session then the resulting data may reflect the results of two different cognitive processes.  Finally, learning will take place with continued task experience, shifting the pattern of activity as brain regions are engaged and disengaged during task-relevant processing (Grafton et al., 1995; Poldrack et al., 1999; Rostami et al., 2009).  For studies investigating learning this is a desired effect, but for others this is an undesired source of noise.</p>
<p>The second influence on reliability is related to physiological and cognitive changes that may take place within a subject between the test and retest sessions. Even within 24 hours a wide variety of reliability-reducing events can take place. All of the above factors may show changes over the days, weeks, months, or years between scans, and these changes may become more pronounced as the time between scanning sessions grows.</p>
<p><center><br />
<h4>- Estimates of fMRI Reliability -</h4>
<p></center></p>
<p>A diverse array of methods has been created for measuring the reliability of fMRI. What differs between them is the specific facet of reliability they are intended to quantify. Some methods are only concerned with significant voxels. Other methods address similarity in the magnitude of estimated activity across all voxels. The choice of how to calculate reliability often comes down to which aspect of the results is desired to remain stable over time.</p>
<p><b>Measuring stability of super-threshold extent.</b></p>
<p>Do you want the voxels that are significant during the test scan to still be significant during the retest scan? This would require that super-threshold voxels remain above the threshold during subsequent sessions. The most prevalent way to quantify this reliability is the cluster overlap method, which identifies the set of voxels that is super-threshold during both the test and retest sessions.</p>
<p>Two approaches have been used to calculate cluster overlap.  The first, and by far most prevalent, is a measure of similarity known as the Dice coefficient.  It was first used to calculate fMRI cluster overlap by Rombouts et al. and has become a standard measure of result similarity (1997).  It is typically calculated by the following equation:</p>
<p><center><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Eq4.png" alt="" title="Bennett-Reliability-Eq4" width="231" height="28" class="alignnone size-full wp-image-1057" /></center></p>
<p>Results from the Dice equation can be interpreted as the number of overlapping voxels divided by the average number of significant voxels across sessions. Another approach to calculating similarity is the Jaccard index. The Jaccard index has the advantage of being readily interpretable as the percent of voxels that are shared, but is infrequently used in the investigation of reliability. It is typically calculated by the following equation:</p>
<p><center><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Eq5.png" alt="" title="Bennett-Reliability-Eq5" width="254" height="27" class="alignnone size-full wp-image-1058" /></center></p>
<p>Results from the Jaccard equation can be interpreted as the number of overlapping voxels divided by the total number of unique voxels in all sessions.  For both the Dice and Jaccard methods a value of 1.0 would indicate that all super-threshold voxels identified during the test scan were also active in the retest scan, and vice-versa.  A value of 0.0 would indicate that no voxels in either scan were shared between the test and retest sessions.  See Figure 1 for a graphical representation of overlapping results from two runs in an example dataset.</p>
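<p>The two overlap measures can be sketched in a few lines of code. This is a minimal illustration that assumes the standard definitions matching the verbal descriptions above (twice the overlap divided by the summed voxel counts for Dice, overlap divided by the union for Jaccard). The function name <code>overlap_coefficients</code> and the toy masks are hypothetical and are not taken from any of the cited studies.</p>

```python
import numpy as np


def overlap_coefficients(test_mask, retest_mask):
    """Return (dice, jaccard) for two boolean masks of super-threshold voxels."""
    test_mask = np.asarray(test_mask, dtype=bool)
    retest_mask = np.asarray(retest_mask, dtype=bool)

    n_overlap = np.count_nonzero(np.logical_and(test_mask, retest_mask))
    n_test = np.count_nonzero(test_mask)
    n_retest = np.count_nonzero(retest_mask)
    n_union = n_test + n_retest - n_overlap

    if n_union == 0:
        # No significant voxels in either session: overlap is undefined.
        return float("nan"), float("nan")

    dice = 2.0 * n_overlap / (n_test + n_retest)
    jaccard = n_overlap / n_union
    return dice, jaccard


# Toy example: 10 voxels, 4 significant at test, 4 at retest, 2 shared.
test = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0], dtype=bool)
retest = np.array([0, 0, 1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)
dice, jaccard = overlap_coefficients(test, retest)
print(dice, jaccard)
```

<p>In this toy case two of the six unique voxels are shared, so the Jaccard index is one-third while the Dice coefficient is one-half. The Dice value is never lower than the Jaccard value for the same pair of masks.</p>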
<p><center><br />
<table width='500'>
<tr>
<td align="center">
<a href="http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Figure1LG.png"><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Figure1SM.png" alt="" title="Bennett-Reliability-Figure1SM" width="250" height="213" class="alignnone size-full wp-image-1027" /></a>
</td>
</tr>
<tr>
<td>
Figure 1. Visualization of cluster overlap using two runs of data from a two-back working memory task. The regions in red represent significant clusters from the first run and regions in blue represent significant clusters from the second run. The crosshatched region represents the overlapping voxels that were significant in both runs. Important to note is that not all significant voxels remained significant across the two runs. One cluster in the cerebellum did not replicate at all. Data is from Bennett, Guerin, and Miller (2009).
</td>
</tr>
</table>
<p></center>&nbsp;</p>
<p>The main limitation of all cluster overlap methods is that they are highly dependent on the statistical threshold used to define what is ‘active’.  Duncan et al. demonstrated that the reported reliability of the cluster overlap method decreases as the significance threshold is increased (2009).  Similar results were reported by Rombouts et al., who found nonlinear changes in cluster overlap reliability across multiple levels of significance (1998).</p>
<p>These overlap statistics seek to represent the proportion of voxels that remain significant across repetitions relative to the proportion that are significant in only a subset of the results. Another, similar approach would be to conduct a formal conjunction analysis between the repetitions. The goal of this approach would be to uniquely identify those voxels that are significant in all sessions. One example of this approach would be the ‘Minimum Statistic compared to the Conjunction Null’ (MS/CN) of Nichols et al. (2005). Using this approach a researcher could threshold the results, allowing for the investigation of reliability with a statistical criterion.</p>
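<p>The core idea of a minimum-statistic conjunction can be sketched as follows: take the smallest statistic at each voxel across sessions and compare it to the usual threshold. This is only an illustration of the idea, not the full procedure of Nichols et al.; the function name <code>conjunction_mask</code>, the t-values, and the threshold of 3.1 are all hypothetical.</p>

```python
import numpy as np


def conjunction_mask(stat_maps, threshold):
    """Flag voxels whose minimum statistic across sessions exceeds the threshold."""
    stat_maps = np.asarray(stat_maps, dtype=float)  # shape: (sessions, voxels)
    min_stat = stat_maps.min(axis=0)                # weakest evidence per voxel
    return min_stat > threshold


# Hypothetical t-values for five voxels in a test and a retest session.
t_test = np.array([4.2, 3.5, 1.0, 5.1, 0.3])
t_retest = np.array([3.9, 2.0, 3.8, 4.4, 0.1])

mask = conjunction_mask([t_test, t_retest], threshold=3.1)
print(mask)
```

<p>Only the first and fourth voxels exceed the threshold in both sessions, so only they survive the conjunction.</p>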
<p>A method similar to cluster overlap, called voxel counting, was reported in early papers.  The use of voxel counting simply evaluated the total number of activated voxels in the test and retest images.  This has proven to be a suboptimal approach for the examination of reliability, as it is done without regard to the spatial location of significant voxels (Cohen and DuBois, 1999).  An entirely different set of results could be observed in each image yet they could contain the same number of significant voxels.  As a result this method is no longer used.</p>
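<p>A short, hypothetical example makes the weakness of voxel counting concrete. The two masks below contain the same number of significant voxels yet share none of them, so voxel counting would report perfect agreement where the overlap is zero.</p>

```python
import numpy as np

# Two hypothetical 10-voxel masks with three significant voxels each,
# located at opposite ends of the image.
test = np.zeros(10, dtype=bool)
test[:3] = True
retest = np.zeros(10, dtype=bool)
retest[7:] = True

same_count = np.count_nonzero(test) == np.count_nonzero(retest)
shared = np.count_nonzero(np.logical_and(test, retest))
print(same_count, shared)
```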
<p><b>Measuring stability of activity in significant clusters.</b></p>
<p>Do you want the estimated magnitude of activity in each cluster to be stable between the test scan and the retest scan? This is a more stringent criterion than simple extent reliability, as it is necessary to replicate the exact degree of activation and not simply what survives thresholding. The most common method to quantify this reliability is through an intra-class correlation (ICC) of the time1-time2 cluster values. The intra-class correlation is different from the traditional Pearson product-moment correlation as it is specialized for data of one type, or class. While there are many versions of the ICC, it is typically taken to be a ratio of the variance of interest divided by the total variance (Bartko, 1966; Shrout and Fleiss, 1979). The ICC can be computed as follows:</p>
<p><center><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Eq6.png" alt="" title="Bennett-Reliability-Eq6" width="245" height="30" class="alignnone size-full wp-image-1059" /></center></p>
<p>One of the best reviews of the ICC was completed by Shrout and Fleiss, who detailed six types of ICC calculation and when each is appropriate to use (1979). One advantage of the ICC is that it can be interpreted similarly to the Pearson correlation. A value of 1.0 would indicate perfect agreement between the values of the test and retest sessions, as there would be no influence of within-subject variability. A value of 0.0 would indicate that there was no agreement between the values of the test and retest sessions, since within-subject variability would dominate the equation.</p>
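<p>As a concrete illustration, the sketch below computes one of the six Shrout and Fleiss variants, the one-way random-effects ICC(1,1), from the between-subject and within-subject mean squares of a one-way ANOVA. Choosing this particular subtype is an assumption made for simplicity and other subtypes will give different values. The function name <code>icc_oneway</code> and the cluster values are hypothetical.</p>

```python
import numpy as np


def icc_oneway(values):
    """One-way random-effects ICC(1,1) for an array of shape (subjects, sessions)."""
    values = np.asarray(values, dtype=float)
    n_subjects, n_sessions = values.shape

    subject_means = values.mean(axis=1)
    grand_mean = values.mean()

    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = (n_sessions * np.sum((subject_means - grand_mean) ** 2)
                  / (n_subjects - 1))
    ms_within = (np.sum((values - subject_means[:, None]) ** 2)
                 / (n_subjects * (n_sessions - 1)))

    return (ms_between - ms_within) / (ms_between + (n_sessions - 1) * ms_within)


# Hypothetical cluster values for four subjects (rows) at test and retest (columns).
cluster_values = np.array([[1.0, 1.2],
                           [2.0, 1.8],
                           [3.0, 3.1],
                           [4.0, 4.3]])
icc = icc_oneway(cluster_values)
print(round(icc, 3))
```

<p>Because the subjects differ far more from one another than each subject differs from session to session, the resulting ICC is close to 1.0.</p>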
<p>Studies examining reliability using intra-class correlations often compute them from summary values within regions of interest (ROIs). Caceres et al. compared four methods commonly used to compute ROI reliability using intra-class correlations (2009). The median(ICC) is the median of the voxelwise ICC values from within an ROI. ICCmed is the ICC computed from the median contrast value of the ROI. ICCmax is the ICC calculated at the peak activated voxel within an activated cluster. ICCv is defined as the intra-voxel reliability, a measure of the total variability that can be explained by the intra-voxel variance.</p>
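<p>The first three ROI summaries can be sketched on simulated data. This is a rough illustration under several assumptions: the one-way ICC(1,1) is used as the ICC subtype, the ROI data and the group activation map are randomly generated, and the names <code>icc_oneway</code>, <code>roi</code>, and <code>group_t</code> are hypothetical. ICCv is omitted because it requires a separate variance decomposition.</p>

```python
import numpy as np


def icc_oneway(values):
    """ICC(1,1) over the first two axes; values has shape (subjects, sessions, ...)."""
    values = np.asarray(values, dtype=float)
    n, k = values.shape[0], values.shape[1]
    subject_means = values.mean(axis=1)
    grand_mean = subject_means.mean(axis=0)
    ms_between = k * np.sum((subject_means - grand_mean) ** 2, axis=0) / (n - 1)
    ms_within = (np.sum((values - subject_means[:, None]) ** 2, axis=(0, 1))
                 / (n * (k - 1)))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Simulated ROI: 12 subjects, 2 sessions, 50 voxels (all values hypothetical).
rng = np.random.default_rng(0)
n_subjects, n_sessions, n_voxels = 12, 2, 50
subject_effect = rng.normal(0.0, 1.0, size=(n_subjects, 1, n_voxels))
session_noise = rng.normal(0.0, 0.5, size=(n_subjects, n_sessions, n_voxels))
roi = subject_effect + session_noise
group_t = rng.normal(3.0, 1.0, size=n_voxels)  # stand-in for a group activation map

voxel_icc = icc_oneway(roi)                    # one ICC per voxel
median_of_icc = np.median(voxel_icc)           # median(ICC)
icc_med = icc_oneway(np.median(roi, axis=2))   # ICCmed: ICC of the ROI median values
icc_max = voxel_icc[np.argmax(group_t)]        # ICCmax: ICC at the peak voxel
print(median_of_icc, icc_med, icc_max)
```

<p>The three summaries generally differ for the same ROI, which is one reason why reliability values from different studies can be difficult to compare.</p>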
<p>There are several notable weaknesses to the use of ICC in calculating reliability.  First, the generalization of ICC results is limited because calculation is specific to the dataset under investigation.  An experiment with high inter-subject variability could have different ICC values relative to an experiment with low inter-subject variability, even if the stability of values over time is the same.  As discussed later in this chapter, this can be particularly problematic when comparing the reliability of clinical disorders to that of normal controls.  Second, because of the variety of ICC subtypes there can often be confusion regarding which one to use.  Using an incorrect subtype can result in quite different reliability estimates (Muller and Buttner, 1994).</p>
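<p>A small worked example shows why the ICC depends on the sample. It assumes the simple variance-ratio form of the ICC described above, with hypothetical variance values. Two experiments share the same within-subject variance, yet the experiment with larger between-subject variance obtains a higher ICC.</p>

```latex
\mathrm{ICC} = \frac{\sigma^2_{\mathrm{between}}}{\sigma^2_{\mathrm{between}} + \sigma^2_{\mathrm{within}}}

% Same within-subject variance, different between-subject variance:
\sigma^2_{\mathrm{within}} = 1,\ \sigma^2_{\mathrm{between}} = 1
  \;\Rightarrow\; \mathrm{ICC} = \frac{1}{1 + 1} = 0.50

\sigma^2_{\mathrm{within}} = 1,\ \sigma^2_{\mathrm{between}} = 4
  \;\Rightarrow\; \mathrm{ICC} = \frac{4}{4 + 1} = 0.80
```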
<p><b>Measuring voxelwise reliability of the whole brain.</b></p>
<p>Do you want to know the reliability of results on a whole-brain, voxelwise basis?  Completing a voxelwise calculation would indicate that the level of activity in all voxels should remain consistent between the test and retest scans.  This is the strictest criterion for reliability.  It yields a global measure of concordance that indicates how effectively activity across the whole brain is represented in each test-retest pairing. Very few studies have examined reliability using this approach, but it may be one of the most valuable metrics of fMRI reliability.  This is one of the few methods that gives weight to the idea that the estimated activity should remain consistent between test and retest, even if the level of activity is close to zero.</p>
<p><center><br />
<table width='500'>
<tr>
<td align="center">
<a href="http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Figure2LG.png"><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Figure2SM.png" alt="" title="Bennett-Reliability-Figure2SM" width="250" height="169" class="alignnone size-full wp-image-1029" /></a>
</td>
</tr>
<tr>
<td>
Figure 2. Histogram showing the frequency of voxelwise ICC values during a two-back working memory task. The histogram was computed from a dataset of sixteen subjects using 100 bins between ICC values of -1.0 and 1.0. The distribution of values is negatively skewed, with a mean ICC of 0.44 and a most frequently occurring ICC of 0.57. Data is from Bennett, Guerin, and Miller (2009).
</td>
</tr>
</table>
<p></center>&nbsp;</p>
<p>Figure 2 is an example histogram plot from our own data that shows the frequency of ICC values for all voxels across the whole brain during a two-back working memory task (Bennett et al., 2009). The mean and mode of the distribution are plotted. It is quickly apparent that there is a wide range of ICC reliability values across the whole brain, with some voxels having almost no reliability and others approaching near-perfect reliability.</p>
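<p>A histogram of this kind can be sketched as follows, using 100 bins between -1.0 and 1.0 as in the figure. The voxelwise ICC values here are randomly generated placeholders and do not reproduce the data behind Figure 2. The variable names are hypothetical.</p>

```python
import numpy as np

# Placeholder voxelwise ICC values (simulated, not the values from Figure 2).
rng = np.random.default_rng(1)
voxel_icc = np.clip(rng.normal(0.45, 0.25, size=5000), -1.0, 1.0)

# 100 equal-width bins spanning the range -1.0 to 1.0.
counts, edges = np.histogram(voxel_icc, bins=100, range=(-1.0, 1.0))
centers = 0.5 * (edges[:-1] + edges[1:])

mean_icc = voxel_icc.mean()
modal_bin = centers[np.argmax(counts)]  # center of the most populated bin
print(mean_icc, modal_bin)
```

<p>Reporting both the mean and the most populated bin is useful for skewed distributions, where the two can differ noticeably.</p>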
<p><b>Other reliability methods.</b></p>
<p>Numerous other methods have also been used to measure the reliability of estimated activity.  Some of these include maximum likelihood (ML), coefficient of variation (CV), and variance decomposition.  While these methods are in the minority by frequency of use, this does not diminish their utility in examining reliability.  This is especially true with regard to identifying the sources of test-retest variability that can influence the stability of results.</p>
<p>One particularly promising approach for the quantification of reliability is predictive modeling.  Predictive modeling measures the ability of a training set of data to predict the structure of a testing set of data.  One of the best established modeling techniques within functional neuroimaging is the nonparametric prediction, activation, influence, and reproducibility sampling (NPAIRS) approach by Strother et al. (2004; 2002).  Within the NPAIRS modeling framework separate metrics of prediction and reproducibility are generated (Zhang et al., 2008).  The first, prediction accuracy, evaluates classification in the temporal domain, predicting which condition of the experiment each scan belongs to.  The second metric, reproducibility, evaluates the model in the spatial domain, comparing patterns of regional brain activity over time.  While this approach is far more complicated than the relatively simple cluster overlap or ICC metrics, predictive modeling does not suffer from many of the drawbacks that these methods have.  NPAIRS, and other predictive modeling approaches, enable a much more thorough examination of fMRI reliability.</p>
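<p>The split-half idea behind a reproducibility metric can be illustrated with a deliberately simplified sketch. This is not the NPAIRS procedure itself, which also estimates prediction accuracy and works with a fitted model. The code below only splits a set of simulated subject maps into random halves and correlates the two half-sample mean maps. The data, the function name <code>split_half_reproducibility</code>, and all parameters are hypothetical.</p>

```python
import numpy as np


def split_half_reproducibility(maps, n_splits, rng):
    """Median correlation between mean maps of random, disjoint halves of subjects."""
    n_subjects = maps.shape[0]
    scores = []
    for _ in range(n_splits):
        order = rng.permutation(n_subjects)
        half_a = maps[order[: n_subjects // 2]].mean(axis=0)
        half_b = maps[order[n_subjects // 2:]].mean(axis=0)
        scores.append(np.corrcoef(half_a, half_b)[0, 1])
    return float(np.median(scores))


# Simulated contrast maps: a shared spatial pattern plus subject-specific noise.
rng = np.random.default_rng(2)
n_subjects, n_voxels = 16, 200
true_pattern = rng.normal(0.0, 1.0, size=n_voxels)
contrast_maps = true_pattern + rng.normal(0.0, 2.0, size=(n_subjects, n_voxels))

reproducibility = split_half_reproducibility(contrast_maps, n_splits=50, rng=rng)
print(reproducibility)
```

<p>Higher values indicate that independent halves of the sample yield similar spatial patterns. Unlike cluster overlap, this summary does not depend on a significance threshold.</p>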
<p>Some studies have investigated fMRI reliability using the Pearson product-moment (r) correlation. Intuitively this is a logical method to use, as it measures the relationship between two variables. However, it is generally held that the Pearson product-moment correlation is not an ideal measure of test-retest reliability. Safrit identified three reasons why the product-moment correlation should not be used to calculate reliability (1976). First, the Pearson product-moment correlation is set up to determine the relationship between two variables, not the stability of a single variable. Second, it is difficult to measure reliability with the Pearson product-moment correlation beyond a single test-retest pair, and it becomes increasingly awkward with two or more retest sessions. One can try to average over multiple pairwise Pearson product-moment correlations between the multiple sessions, but it is far easier to take the ANOVA approach of the ICC and examine it from the standpoint of between- and within-subject variability. Third, the Pearson product-moment correlation cannot detect systematic error. This would be the case when the retest values deviate by a similar degree, such as adding a constant value to all of the original test values. The Pearson product-moment correlation would remain the same, while an appropriate ICC would indicate that the test-retest agreement is not exact. While the use of ICC measures has its own set of issues, it is generally a more appropriate tool for the investigation of test-retest reliability.</p>
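<p>The third point can be demonstrated with a small hypothetical example. A constant is added to every test value to create the retest values, and the Pearson correlation is compared with a one-way ICC(1,1). The choice of ICC subtype is an assumption, and the helper <code>icc_oneway</code> and the data are illustrative only.</p>

```python
import numpy as np


def icc_oneway(values):
    """One-way random-effects ICC(1,1) for an array of shape (subjects, sessions)."""
    values = np.asarray(values, dtype=float)
    n, k = values.shape
    subject_means = values.mean(axis=1)
    ms_between = k * np.sum((subject_means - values.mean()) ** 2) / (n - 1)
    ms_within = np.sum((values - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


test = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
retest_shifted = test + 2.0  # systematic error: a constant added to every value

# Pearson correlation ignores the constant shift.
r_shifted = np.corrcoef(test, retest_shifted)[0, 1]

# The one-way ICC treats the shift as within-subject disagreement.
icc_identical = icc_oneway(np.column_stack([test, test]))
icc_shifted = icc_oneway(np.column_stack([test, retest_shifted]))
print(r_shifted, icc_identical, icc_shifted)
```

<p>The Pearson correlation stays at 1.0 despite the shift, while the ICC falls from 1.0 to 3/7 (about 0.43), because the constant offset is counted as within-subject disagreement.</p>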
<p><center><br />
<h4>- Review of Existing Reliability Estimates -</h4>
<p></center></p>
<p>Since the advent of fMRI some results have been common and quite easily replicated.  For example, activity in primary visual cortex during visual stimulation has been thoroughly studied.  Other fMRI results have been somewhat difficult to replicate.  What does the existing literature have to say regarding the reliability of fMRI results?</p>
<p>There have been a number of individual studies investigating the test-retest reliability of fMRI results, but few articles have reviewed the entire body of literature to find trends across studies.  To obtain a more effective estimate of fMRI reliability we conducted a survey of the existing literature on fMRI reliability.  To find papers for this investigation we searched for “test-retest fMRI” using the NCBI PubMed database (www.pubmed.gov).  This search yielded a total of 183 papers, 37 of which used fMRI as a method of investigation, used a general linear model to compute their results, and provided test-retest measures of reliability.  To broaden the scope of the search we then went through the reference section of the 37 papers found using PubMed to look for additional works not identified in the initial search. There were 26 additional papers added to the investigation through this secondary search method.  The total number of papers retrieved was 63.  Each paper was examined with regard to the type of cognitive task, kind of fMRI design, number of subjects, and basis of reliability calculation.  </p>
<p>We have separated out the results into three groups: those that used the cluster overlap method, those that used intra-class correlation, and those that used other calculation methods. The results of this investigation can be seen in Tables 1, 2, and 3. In the examination of cluster overlap values in the literature we attempted to only include values that were observed at a similar significance threshold across all of the papers. The value we chose as the standard was p(uncorrected) < 0.001. Deviations from this standard are noted in the tables.</p>
<p><center><br />
<table width='500'>
<tr>
<td align='center'>
<a href='http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Table1.pdf'><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Icon-PDF.png" alt="" title="Icon-PDF" width="61" height="70" class="alignnone size-full wp-image-1043" /><br />
Table1</a>
</td>
<td align='center'>
<a href='http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Table2.pdf'><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Icon-PDF.png" alt="" title="Icon-PDF" width="61" height="70" class="alignnone size-full wp-image-1043" /><br />
Table2</a>
</td>
<td align='center'>
<a href='http://prefrontal.org/blog/wp-content/uploads/2010/02/Bennett-Reliability-Table3.pdf'><img src="http://prefrontal.org/blog/wp-content/uploads/2010/02/Icon-PDF.png" alt="" title="Icon-PDF" width="61" height="70" class="alignnone size-full wp-image-1043" /><br />
Table3</a>
</td>
</tr>
</table>
<p></center>&nbsp;</p>
<p><b>Conclusions From the Reliability Review</b></p>
<p>What follows are some general points that can be taken away from the reliability survey.  Some of the conclusions that follow are quantitative results from the review and some are qualitative descriptions of trends that were observed as we conducted the review.</p>
<p><u>A diverse collection of methods has been used to assess fMRI reliability.</u> The first finding mirrors the above discussion on reliability calculation. The list of methods that have been used includes: intra-class correlation (ICC), cluster overlap, voxel counts, receiver operating characteristic (ROC) curves, maximum likelihood (ML), conjunction analysis, Cohen’s kappa index, coefficient of variation (CV), Kendall’s W, laterality index (LI), variance component decomposition, Pearson correlation, predictive modeling, and still others. While this diversity of methods has created converging evidence of fMRI reliability, it has also limited the ability to compare and contrast the results of existing reliability studies.</p>
<p><u>Intra-class correlation and cluster overlap methods dominate the calculation of test-retest reliability.</u> While there have been a number of methods used to investigate reliability, the two that stand out by frequency of use are cluster overlap and intra-class correlation.  One advantage of these methods is that they are easy to calculate.  The equations are simple to understand, easy to implement, and fast to process.  A second advantage of these methods is their easy interpretation by other scientists.  Even members of the general public can understand the concept behind the overlapping of clusters and most everyone is familiar with correlation values.  While these techniques certainly have limitations and caveats, they seem to be the emerging standard for the analysis of fMRI reliability.</p>
<p><u>Most previous studies of reliability and reproducibility have been done with relatively few subjects.</u> What sample size is necessary to conduct effective reliability research? Most of the studies that were reviewed used fewer than 10 subjects to calculate their reliability measures, although the overall average across the surveyed studies was 11 subjects. Should reliability studies have more subjects? Since a large amount of the error variance comes from subject-specific factors it may be wise to use larger sample sizes when assessing study reliability, as a single anomalous subject could sway study reliability in either direction. Another notable factor is that a large percentage of studies using fMRI are completed with a restricted range of subjects. Most samples are recruited from a pool of university undergraduates. These samples may have a different reliability than a sample pulled at random from the larger population. Because of sample restriction the results of most test-retest investigations may not reflect the true reliability of other populations, such as children, the elderly, and individuals with clinical disorders.</p>
<p><u>Reliability varies by test-retest interval.</u>  Generally, increased amounts of time between the initial test scan and the subsequent retest scan will lower reliability.  Still, even back-to-back scans are not perfectly reliable.  The average Jaccard overlap of studies where the test and retest scans took place within the same hour was 33%.  Many studies with intervals lasting three months or more had a lower overlap percentage.  This is a somewhat loose guideline though.  Notably, the results reported by Aron et al. had one of the longest test-retest intervals but also possessed the highest average ICC score (2006).</p>
<p><u>Reliability varies by cognitive task and experimental design.</u> Motor and sensory tasks seem to have greater reliability than tasks involving higher cognition. Caceres et al. found that the reliability of an N-back task was generally higher than that of an auditory target detection task (2009). Differences in the design of an fMRI experiment also seem to affect the reliability of results. Specifically, block designs appear to have a slight advantage over event-related designs in terms of reliability. This may be a function of the greater statistical power inherent in a block design and its increased SNR.</p>
<p><u>Significance is related to reliability, but it is not a strong correlation.</u> Several studies have illustrated that super-threshold voxels are not necessarily more reliable than sub-threshold voxels. Caceres et al. examined the joint probability distribution of significance and reliability (2009). They found that there were some highly activated ROIs with low reliability and some sub-threshold regions that had high reliability. These ICC results fit in well with the data from cluster overlap studies. The average cluster overlap was 29%. This means that, across studies, on average only about one-third of significant voxels replicate. This evidence speaks against the assumption that significant voxels will be far more reliable in an investigation of test-retest reliability.</p>
<p><u>An optimal threshold of reliability has not been established.</u> There is no consensus value regarding what constitutes an acceptable level of reliability in fMRI. Is an ICC value of 0.50 enough? Should studies be required to achieve an ICC of 0.70? All of the studies in the review simply reported what the reliability values were. Few studies proposed any kind of criterion for what should be considered a ‘reliable’ result. Cicchetti and Sparrow did propose some qualitative descriptions of data based on the ICC-derived reliability of results (1981). They proposed that results with an ICC above 0.75 be considered ‘excellent’, results between 0.59 and 0.75 be considered ‘good’, results between 0.40 and 0.58 be considered ‘fair’, and results lower than 0.40 be considered ‘poor’. More specifically to neuroimaging, Eaton et al. (2008) used a threshold of ICC > 0.4 as the mask value for their study while Aron et al. (2006) used a cutoff of ICC > 0.5 as the mask value.</p>
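<p>The qualitative labels described above can be expressed as a small lookup function. This is a sketch only. The cut-points are taken from the description in the text, and the handling of values that fall exactly on a boundary or between the listed ranges is an assumption. The function name <code>describe_icc</code> is hypothetical.</p>

```python
def describe_icc(icc):
    """Map an ICC value to a qualitative label using the cut-points quoted in the text."""
    if icc > 0.75:
        return "excellent"
    if icc >= 0.59:   # assumption: the boundary value 0.75 is treated as 'good'
        return "good"
    if icc >= 0.40:   # assumption: values between 0.58 and 0.59 are treated as 'fair'
        return "fair"
    return "poor"


# The average ICC across the survey reported in this review is 0.50.
print(describe_icc(0.50))
```

<p>Under these labels the survey-wide average ICC of 0.50 would be described as ‘fair’.</p>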
<p><u>Inter-individual variability is consistently greater than intra-individual variability.</u>  Many studies reported both within-subject and between-subject reliability values in their results.  In every case the within-subject reliability far exceeded the between-subjects reliability.  Miller et al. explicitly examined variability across subjects and concluded that there are large-scale, stable differences between individuals on almost any cognitive task (2001; 2002).  More recently, Miller et al. directly contrasted within- and between-subject variability (2009).  They concluded that between-subject variability was far higher than any within-subject variability.  They further demonstrated that the results from one subject completing two different cognitive tasks are typically more similar than the data from two subjects doing the same task.  These results are mirrored by those of Costafreda et al. who found that well over half (57%) of the variability in their fMRI data was due to between-subject variation (2007).  It seems to be the case that within-subject measurements over time may vary, but they vary far less than differences in the overall pattern of activity between individuals.</p>
<p><u>There is little agreement regarding the true reliability of fMRI results.</u>  While we mention this as a final conclusion from the literature review, it is perhaps the most important point.  Some studies have estimated the reliability of fMRI data to be quite high, or even close to perfect for some tasks and brain regions (Aron et al., 2006; Maldjian et al., 2002; Raemaekers et al., 2007).  Other studies have been less enthusiastic, showing fMRI reliability to be relatively low (Duncan et al., 2009; Rau et al., 2007).  Across the survey of fMRI test-retest reliability we found that the average ICC value was 0.50 and the average cluster overlap value was 29% of voxels (Dice overlap = 0.45, Jaccard overlap = 0.29).  This represents an average across many different cognitive tasks, fMRI experimental designs, test-retest time periods, and other variables.  While these numbers may not be representative of any one experiment, they do provide an effective overview of fMRI reliability.</p>
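<p>The Dice and Jaccard values reported together in this review are two views of the same overlap, and one can be converted into the other. The short derivation below assumes the standard set definitions of both coefficients, with A and B denoting the sets of significant voxels in the test and retest sessions.</p>

```latex
D = \frac{2\,|A \cap B|}{|A| + |B|}, \qquad
J = \frac{|A \cap B|}{|A \cup B|}, \qquad
|A| + |B| = |A \cup B| + |A \cap B|

\Rightarrow\quad D = \frac{2J}{1 + J}

% Check against the survey averages:
J = 0.29 \;\Rightarrow\; D = \frac{2(0.29)}{1 + 0.29} = \frac{0.58}{1.29} \approx 0.45
```

<p>The conversion reproduces the reported pairing of Jaccard overlap = 0.29 with Dice overlap = 0.45.</p>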
<p><center><br />
<h4>- Other Issues and Comparisons -</h4>
<p></center></p>
<p><b>Test-Retest Reliability in Clinical Disorders</b></p>
<p>There have been few examinations of test-retest reliability in clinical disorders relative to the number of studies with normal controls.  A contributing factor to this problem may be that the scientific understanding of brain disorders is still in its infancy.  It may be premature to examine clinical reliability if there is only a vague understanding of anatomical and functional abnormalities in the brain.  Still, some investigators have taken significant steps forward in the clinical realm.  These few investigations suggest that reliability in clinical disorders is typically lower than the reliability of data from normal controls.  Some highlights of these results are listed below, categorized by disorder.</p>
<p><u>Epilepsy.</u> Functional imaging has enormous potential to aid in the clinical diagnosis of epileptiform disorders. Focusing on fMRI, research by Di Bonaventura et al. found that the spatial extent of activity associated with fixation off sensitivity (FOS) was stable over time in epileptic patients (2005). Of greater research interest for epilepsy has been the reliability of combined EEG/fMRI imaging. Symms et al. reported that they could reliably localize interictal epileptiform discharges using EEG-triggered fMRI (1999). Waites et al. also reported the reliable detection of discharges with combined EEG/fMRI at levels significantly above chance (2005). Functional imaging also has the potential to assist in the localization of cognitive function prior to resection for epilepsy treatment. One possibility would be to use noninvasive fMRI measures to replace cerebral sodium amobarbital anesthetization (Wada Test). Fernandez et al. reported good reliability of lateralization indices (whole-brain test-retest r = 0.82) and cluster overlap measures (Dice overlap = 0.43, Jaccard overlap = 0.27) (2003).</p>
<p><u>Stroke.</u> Many aspects of stroke recovery can impact the results of functional imaging data. The lesion location, size, and time elapsed since the stroke event each have the potential to alter function within the brain. These factors can also lead to increased between-subject variability relative to groups of normal controls. This is especially true when areas proximal to the lesion location contribute to specific aspects of information processing, such as speech production. Kimberley et al. found that stroke patients had generally higher ICC values relative to normal controls (2008). This mirrors the findings of Eaton et al., who showed that the average reliability of aphasia patients was approximately equal to that of normal controls as measured by ICC (2008). These results may be indicative of equivalent fMRI reliability in stroke patients, or they may be an artifact of the ICC calculation. Kimberley et al. state that increased between-subject variability of stroke patients can lead to inflated ICC estimates (2008). They argue that fMRI reliability in stroke patients likely falls within the moderate range of values (0.4 < ICC < 0.6).</p>
<p><u>Schizophrenia.</u> Schizophrenia is a multidimensional mental disorder characterized by a wide array of cognitive and perceptual dysfunctions (Freedman, 2003; Morrison and Murray, 2005). While there have been a number of studies on the reliability of anatomical measures in schizophrenia there have been few that have focused on function. Manoach et al. demonstrated that the fMRI results from schizophrenic patients on a working memory task were less reliable overall than those of normal controls (2001). The reliability of significant ROIs in the schizophrenic group ranged from ICC values of -0.20 to 0.57. However, a different pattern was found by Whalley et al. in a group of subjects at high genetic risk for schizophrenia (no psychotic symptoms) (2009). The ICC values for these subjects were comparable to those of normal controls on a sentence completion task. More research is certainly needed to find consensus on reliability in schizophrenia.</p>
<p><u>Aging.</u>  The anatomical and functional changes that take place during aging can increase the variability of fMRI results at all levels (MacDonald et al., 2006).  Clement et al. reported that cluster overlap percentages and the cluster-wise ICC values were not significantly different between normal elderly controls and patients with mild cognitive impairment (MCI) (2009).  On an episodic retrieval task healthy controls had ICC values averaging 0.69 while patients diagnosed with MCI had values averaging 0.70.  However, they also reported that all values for the older samples were lower than those reported for younger adults on similar tasks.  Marshall et al. found that while the qualitative reproducibility of results was high, the reliability of activation magnitude during aging was quite low (2004).</p>
<p>It is clear that the use of intra-class correlations in clinical research must be approached carefully.  As mentioned by Bosnell et al. and Kimberley et al., extreme levels of between-subject variability will artificially inflate the resulting ICC reliability estimate (Bosnell et al., 2008; Kimberley et al., 2008).  Increased between-subject variability is a characteristic found in many clinical populations.  Therefore, comparing two populations with different levels of between-subject variability may be impossible when using an ICC measure.</p>
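<p>The inflation described above can be illustrated with the variance-ratio form of the ICC, in which reliability is the share of total variance attributable to stable differences between subjects.  The sketch below is a minimal illustration only: the function name and the variance values are hypothetical, and it is not the estimator used in any of the cited studies.  Both groups are given the same session-to-session error, and only the between-subject variance differs.</p>

```python
def icc(between_var, within_var):
    """Variance-ratio ICC: fraction of total variance due to
    stable between-subject differences."""
    return between_var / (between_var + within_var)


# Hypothetical variance components (arbitrary units).
# The within-subject (session-to-session) error is identical in both groups.
within_var = 1.0
controls = icc(between_var=1.0, within_var=within_var)  # homogeneous group
patients = icc(between_var=4.0, within_var=within_var)  # heterogeneous group

print(f"controls ICC = {controls:.2f}")
print(f"patients ICC = {patients:.2f}")
```

<p>With these assumed values the more heterogeneous group receives the higher ICC even though its measurements are no more stable from session to session.</p>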
<p><b>Reliability Across Scanners / Multicenter Studies</b></p>
<p>One area of increasing research interest is the ability to combine the data from multiple scanners into larger, integrative data sets (Van Horn and Toga, 2009).  There are two areas of reliability that are important for such studies.  The first is subject-level reliability, or how stable the activity of one person will be scan-to-scan.  The second is group-level reliability, or how stable the group fMRI results will be from one set of subjects to another or from one scanner to another.  Given the importance of multi-center collaboration it is critical to evaluate how results will differ when the data comes from a heterogeneous group of MRI scanners as opposed to a single machine.  Generally, the concordance of fMRI results from center to center is quite good, but not perfect.  </p>
<p>Casey et al. were among the first groups to examine the reliability of results across scanners (1998).  Across four institutions they found a ‘strong similarity’ in the location and distribution of significant voxel clusters.  More recently, Friedman et al. found that inter-center reliability was somewhat worse than test-retest reliability across several centers with an identical hardware configuration (2008).  The median ICC of their inter-center results was 0.22.  Costafreda et al. also examined the reproducibility of results from identical fMRI setups (2007).   Using a variance components analysis they determined that the MR system accounted for roughly 8% of the variation in the BOLD signal.  This compares favorably relative to the level of between-subject variability (57%).</p>
<p>The reliability of results from one scanner to another seems to be approximately equal to or slightly less than the values of test-retest reliability with the same MRI hardware.  Special calibration and quality control steps can be taken to ensure maximum concordance across scanners.  For instance, before conducting anatomical MRI scans in the Alzheimer’s Disease Neuroimaging Initiative (ADNI, http://www.loni.ucla.edu/ADNI/) a special MR phantom is typically scanned.  This allows for correction of magnet-specific field inhomogeneity and maximizes the ability to compare data from separate scanners.  Similar calibration measures are being discussed for functional MRI (Chiarelli et al., 2007; Friedman and Glover, 2006; Thomason et al., 2007).  It may be the case that as calibration becomes standardized it will lead to increased inter-center reliability. </p>
<p><b>Other Statistical Issues in fMRI</b></p>
<p>A number of important fMRI statistical issues have gone unmentioned in this chapter.  First, there is the problem of conducting thousands of statistical comparisons without an appropriate threshold adjustment.  Correction for multiple comparisons is a necessary step in fMRI analysis that is often skipped or ignored (Bennett et al., in press).  Another statistical issue in fMRI is temporal autocorrelation in the acquired timeseries.  This refers to the fact that any single timepoint of data is not necessarily independent of the acquisitions that came before and after (Smith et al., 2007; Woolrich et al., 2001).  Autocorrelation correction is widely available, but is not implemented by most investigators.  Finally, throughout the last year the ‘non-independence error’ has been discussed at length.  Briefly, this refers to selecting a set of voxels to create a region of interest (ROI) and then using the same data to evaluate some statistical aspect of that region.  Ideally, an independent data set should be used after the ROI has been initially defined.  It is important to address these issues because they are still debated within the field and often ignored in fMRI analysis.  Their correction can have a dramatic impact on how reproducible the results will be from study to study.</p>
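<p>To make the scale of the multiple comparisons problem concrete, the following sketch computes the expected number of false positives and the familywise error rate for an uncorrected threshold, along with a Bonferroni-adjusted threshold.  The voxel count and alpha level are hypothetical, and the calculation assumes independent tests, which real fMRI voxels are not because of spatial smoothness.  It is meant as a rough illustration and not as a recommended correction procedure.</p>

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n_tests
    independent tests when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    return alpha / n_tests


# Hypothetical whole-brain analysis.
n_voxels = 50_000
uncorrected_alpha = 0.001

expected_false_positives = uncorrected_alpha * n_voxels
fwer = familywise_error_rate(uncorrected_alpha, n_voxels)
corrected = bonferroni_threshold(0.05, n_voxels)

print(f"expected false positives: {expected_false_positives:.0f}")
print(f"familywise error rate:    {fwer:.4f}")
print(f"Bonferroni threshold:     {corrected:.2e}")
```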
<p><center><br />
<h4>- Conclusions -</h4>
<p></center></p>
<p><b>How can a researcher improve fMRI reliability?</b></p>
<p>The generation of highly reliable results requires that sources of error be minimized across a wide array of factors.  An issue within any single factor can significantly reduce reliability.  Problems with the scanner, a poorly designed task, or an improper analysis method could each be extremely detrimental.  Conversely, elimination of all such issues is necessary for high reliability.  A well maintained scanner, well designed tasks, and effective analysis techniques are all prerequisites for reliable results.</p>
<p>There are a number of practical ways that fMRI researchers can improve the reliability of their results.  For example, Friedman and Glover reported that simply increasing the number of fMRI runs improved the reliability of their results from ICC = 0.26 to ICC = 0.58 (2006).  That is quite a large jump for an additional ten or fifteen minutes of scanning.  Below are some general areas where reliability can be improved.</p>
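<p>One way to build intuition for why additional runs help is the Spearman-Brown formula from classical test theory, which predicts the reliability of an average of several parallel measurements.  The sketch below is an illustration under that assumption only: the chapter does not state that Friedman and Glover used this formula, and the run counts are hypothetical.</p>

```python
def spearman_brown(single_run_reliability, n_runs):
    """Predicted reliability of the average of n_runs parallel runs,
    given the reliability of a single run."""
    r = single_run_reliability
    return n_runs * r / (1.0 + (n_runs - 1) * r)


# Starting from a single-run reliability of 0.26 (the lower value quoted above),
# see how the predicted reliability grows with hypothetical run counts.
for n_runs in (1, 2, 4, 8):
    predicted = spearman_brown(0.26, n_runs)
    print(f"{n_runs} run(s): predicted reliability = {predicted:.2f}")
```

<p>Under these assumptions, averaging four runs that each have a reliability of 0.26 gives a predicted value of roughly 0.58, with smaller gains for each further run.</p>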
<p><u>Increase the SNR and CNR of the acquisition.</u>  One area of attention is to improve the signal-to-noise and contrast-to-noise ratios of the data collection.  An easy way to do this would be to simply acquire more data.  This involves a trade-off, as increasing the number of TRs that are acquired will help improve the SNR but will also increase the task length.  Subject fatigue, scanner time limitations, and the diminishing returns with each duration increase will all play a role in limiting the amount of time that can be dedicated to any one task.  Still, a researcher considering a single six-minute EPI scan for their task might add additional data collection to improve the SNR of the results.  With regard to the magnet, every imaging center should verify acquisition quality before scanning.  Many sites conduct quality assurance scans (QA) at the beginning of each day to ensure stable operation.  This has proven to be an effective method of detecting issues with the MR system before they cause trouble for investigators.  It is a hassle to cancel a scanning session when there are subtle artifacts present, but this is a better option than acquiring noisy data that does not make a meaningful contribution to the investigation.  As a final thought, research groups can always start fundraising to purchase a new magnet with improved specifications.  If data acquisition is being done on a 1.5 Tesla magnet with a quadrature head coil, enormous gains in SNR can be made by moving to 3.0 Tesla or higher and using a parallel-acquisition head coil (Simmons et al., 2009; Zou et al., 2005).</p>
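<p>The diminishing returns of longer scans can be sketched with the textbook result that averaging independent samples improves SNR in proportion to the square root of the number of samples.  This is an idealized assumption: fMRI noise is temporally autocorrelated, so real gains are likely to be smaller.  The function name and volume counts below are hypothetical.</p>

```python
import math


def relative_snr(n_volumes, baseline_volumes):
    """SNR relative to a baseline scan, assuming independent noise
    so that SNR scales with the square root of the sample count."""
    return math.sqrt(n_volumes / baseline_volumes)


# Hypothetical baseline: 180 volumes (a six-minute scan at TR = 2 s).
baseline = 180
for multiple in (1, 2, 3, 4):
    gain = relative_snr(multiple * baseline, baseline)
    print(f"{multiple}x scan time: relative SNR = {gain:.2f}")
```

<p>Under this assumption, doubling the SNR requires four times the scanning time, which is why the practical limits described above matter.</p>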
<p><u>Minimize individual differences in cognitive state, both across subjects and over time.</u>  Because magnet time is expensive and precious the critical component of effective task instruction can often be overlooked.  Researchers would rather be acquiring data as opposed to spending additional time giving detailed instructions to a subject.  However, this is a very easy way to improve the quality of the final data set.  If it takes ten trials for the participant to really ‘get’ the task then those trials have been wasted, adding unnecessary noise to the final results.  Task training in a separate laboratory session in conjunction with time in a mock MRI scanner can go a long way toward homogenizing the scanner experience for subjects.  It may not always be possible to fully implement these steps, but they should not be avoided simply to reduce the time spent per subject.  </p>
<p>For multi-session studies, steps can be taken to help stabilize intra-subject changes over time.  Scanning test and retest sessions at the same time of day can help due to circadian changes in hormone level and cognitive performance (Carrier and Monk, 2000; Huang et al., 2006; Salthouse et al., 2006).  A further step to consider is minimizing the time between sessions to help stabilize the results.  Much more can change over the course of a month than over the course of a week.</p>
<p><u>Maximize the experiment’s statistical power.</u>  Power represents the ability of an experiment to reject the null hypothesis when the null hypothesis is indeed false (Cohen, 1977).  For fMRI this ability is often discussed in terms of the number of subjects that will be scanned and the design of the task that will be administered, including how many volumes of data will be acquired from each subject.  More subjects and more volumes almost always contribute to increasing power, but there are occasions when one may improve power more than the other.  For example, Mumford and Nichols demonstrated that, when scanner time was limited, different combinations of subjects and trials could be used to achieve high levels of power (2008).  For their hypothetical task it would take only five 15-second blocks to achieve 80% power if there were 23 subjects, but it would take 25 blocks if there were only 18 subjects.  These kinds of power estimations are quite useful in determining the best use of available scanner time.  Tools like fmripower (http://fmripower.org) can utilize data from existing experiments to yield new information on how many subjects and scans a new experiment will require to reach a desired power level (Mumford and Nichols, 2008; Mumford et al., 2007; Van Horn et al., 1998).</p>
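<p>As a rough illustration of how power depends on sample size, the sketch below uses a large-sample normal approximation for a two-sided, one-sample test on a standardized group effect.  It is not the method of Mumford and Nichols and does not reproduce fmripower: it ignores the within-subject variance and number of blocks that those tools model.  The function names and effect size are hypothetical.</p>

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_subjects, alpha=0.05):
    """Approximate power of a two-sided one-sample test, treating the
    test statistic as normal with mean effect_size * sqrt(n_subjects)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * sqrt(n_subjects)
    # Probability of landing in either rejection tail.
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


def subjects_needed(effect_size, target_power=0.80, alpha=0.05, max_n=10_000):
    """Smallest sample size whose approximate power reaches the target,
    or None if max_n is not enough."""
    for n in range(2, max_n + 1):
        if approx_power(effect_size, n, alpha) >= target_power:
            return n
    return None


# Hypothetical standardized effect size (Cohen's d).
d = 0.5
for n in (18, 23, 32):
    print(f"n = {n}: approximate power = {approx_power(d, n):.2f}")
print("subjects needed for 80% power:", subjects_needed(d))
```

<p>Because this is a normal approximation, it will be somewhat optimistic for small groups, where a t distribution would be more appropriate.</p>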
<p>The structure of the stimulus presentation has a strong influence on an experiment’s statistical power.  The dynamic interplay between stimulus presentation and inter-stimulus jitter is important, as is knowing what contrasts will be completed once the data has been acquired.  Each of these parameters can influence the power and efficiency of the experiment, later impacting the reliability of the results.  Block designs tend to have greater power relative to event-related designs.  One can also increase power by increasing block length, but care should be exercised not to make blocks so long that they approach the low frequencies associated with scanner drift.  There are several good software tools available that will help researchers create an optimal design for fMRI experiments.  OptSeq is a program that helps to maximize the efficiency of an event-related fMRI design (Dale, 1999).  OptimizeDesign is a set of Matlab scripts that utilize a genetic search algorithm to maximize specific aspects of the design (Wager and Nichols, 2003).  Researchers can separately weight statistical power, HRF estimation efficiency, stimulus counterbalancing, and maintenance of stimulus frequency.  These two programs, and others like them, are valuable tools for ensuring that the ability to detect meaningful signals is effectively maximized.</p>
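<p>The caution about block length can be checked with simple arithmetic.  For a design that alternates task and rest blocks of equal length, one full cycle lasts two blocks, so the fundamental frequency of the task is 1 / (2 x block length).  The sketch below compares that frequency with a high-pass filter cutoff.  The 128-second cutoff period is an assumed value chosen for illustration, and the function names are hypothetical.</p>

```python
def block_fundamental_hz(block_length_s):
    """Fundamental frequency of an alternating task/rest design
    with equal-length blocks (one cycle = two blocks)."""
    return 1.0 / (2.0 * block_length_s)


def survives_highpass(block_length_s, cutoff_period_s=128.0):
    """True if the task frequency lies above the high-pass cutoff,
    i.e. the filter used to remove drift will not remove the task signal.
    cutoff_period_s is an assumed value, not taken from the chapter."""
    return block_fundamental_hz(block_length_s) > 1.0 / cutoff_period_s


for block_length in (15, 30, 60, 80):
    freq = block_fundamental_hz(block_length)
    status = "ok" if survives_highpass(block_length) else "overlaps drift range"
    print(f"{block_length:>3} s blocks: {freq:.4f} Hz ({status})")
```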
<p>It is important to state that the reliability of a study in no way implies that an experiment has accurately assessed a specific cognitive process.  The validity of a study can be quite orthogonal to its reliability – it is possible to have very reliable results from a task that mean little with regard to the cognitive process under investigation.  No increase in SNR or optimization of event timing can hope to improve an experiment that is testing for the wrong thing.  This makes task selection of paramount importance in the planning of an experiment.  It also places a burden on the researcher in terms of effective interpretation of fMRI results once the analysis is done. </p>
<p><b>Where does neuroimaging go next?</b></p>
<p>In many ways cognitive neuroscience is still at the beginning of fMRI as a research tool.  Looking back on the last two decades it is clear that functional MRI has made enormous gains in both statistical methodology and popularity.  However, there is still much work to do.  With regard to reliability, there are some specific next steps that must be taken for the continued improvement of this method.</p>
<p><u>Better Characterization of the Factors that Influence Reliability.</u>  Additional research is necessary to effectively understand what factors influence the reliability of fMRI results.  The field has a good grasp of the acquisition and analysis factors that influence SNR.  Still, there is relatively little knowledge regarding how stable individuals are over time and what influences that stability.  Large-scale studies specifically investigating reliability and reproducibility should therefore be conducted across several cognitive domains.  The end goal of this research would be to better characterize the reliability of fMRI across multiple dimensions of influence within a homogeneous set of data.  Such a study would also create greater awareness of fMRI reliability in the field as a whole.  The direct comparison of reliability analysis methods, including predictive modeling, should also be completed.</p>
<p><u>Meta/Mega Analysis.</u>  The increased pooling of data from across multiple studies can give a more generalized view of important cognitive processes.  One method, meta-analysis, refers to pooling the statistical results of numerous studies to identify those results that are concordant and discordant with others.  For example, one could obtain the MNI coordinates of significant clusters from several studies having to do with response inhibition and plot them in the same stereotaxic space to determine their concordance.  One popular method of performing such an analysis is the creation of an Activation Likelihood Estimate, or ALE (Eickhoff et al., 2009; Turkeltaub et al., 2002).  This method allows for the statistical thresholding of meta-analysis results, making it a powerful tool to examine the findings of many studies at once.  Another method, mega-analysis, refers to reprocessing the raw data from numerous studies in a new statistical analysis with much greater power.  Using this approach any systematic error introduced by any one study will contribute far less to the final statistical result (Costafreda, in press).  Mega-analyses are far more difficult to implement since the raw imaging data from multiple studies must be obtained and reprocessed.  Still, the increase in detection power and the greater generalizability of the results are strong reasons to engage in such an approach.</p>
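<p>The core idea behind an activation likelihood estimate can be sketched in one dimension.  Each reported focus is blurred into a Gaussian probability distribution, and the ALE value at a location is the probability that at least one focus truly lies there, computed as a union of the individual probabilities.  This is a heavily simplified sketch: published ALE implementations work in three dimensions, choose kernel widths from empirical estimates of spatial uncertainty, and threshold the map statistically.  The coordinates, kernel width, and function name below are hypothetical.</p>

```python
from statistics import NormalDist


def ale_profile(foci_mm, grid_mm, sigma_mm=5.0, voxel_mm=1.0):
    """1-D ALE-style map: probability that at least one focus
    lies in each voxel of the grid."""
    ale = []
    for x in grid_mm:
        prob_none = 1.0
        for focus in foci_mm:
            # Probability that this focus falls in the voxel at x
            # (Gaussian density times voxel width).
            p = NormalDist(mu=focus, sigma=sigma_mm).pdf(x) * voxel_mm
            prob_none *= 1.0 - p
        ale.append(1.0 - prob_none)
    return ale


# Hypothetical x-coordinates (mm) of foci reported by four studies:
# three cluster together and one is isolated.
foci = [-22.0, -20.0, -19.0, 25.0]
grid = list(range(-40, 41))

profile = ale_profile(foci, grid)
peak_position = grid[profile.index(max(profile))]
print("peak ALE location (mm):", peak_position)
print("ALE at cluster:  ", round(profile[grid.index(-20)], 3))
print("ALE at lone focus:", round(profile[grid.index(25)], 3))
```

<p>Locations where several studies report nearby foci receive higher values than locations supported by a single study, which is what allows concordant findings to stand out.</p>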
<p>One roadblock to collaborative multi-center studies is the lack of data provenance in functional neuroimaging.  Provenance refers to complete detail regarding the origin of a dataset and the history of operations that have been performed on the data.  Having a complete history of the data enables analysis by other researchers and provides information that is critical for replication studies (Mackenzie-Graham et al., 2008).  Moving forward there will be an additional focus on provenance to enable increased understanding of individual studies and facilitate integration into larger analyses.</p>
<p><u>New Emphasis on Replication.</u>  The non-independence debate of 2009 was less about effect sizes and more about reproducibility.  The implicit argument made about studies that were ‘non-independent’ was that if researchers ran a non-independent study over again the resulting correlation would be far lower with a new, independent dataset.  There should be a greater emphasis on the replicability of studies in the future.  This can be frustrating because it is expensive and time consuming to acquire and process a replication study.  However, moving forward this may become increasingly important to validate important results and conclusions.</p>
<p><b>General Conclusions</b></p>
<p>One thing is abundantly clear: fMRI is an effective research tool that has opened broad new horizons of investigation to scientists around the world.  However, the results from fMRI research may be somewhat less reliable than many researchers implicitly believe.  While it may be frustrating to know that fMRI results are not perfectly replicable, it is beneficial to take a longer-term view regarding the scientific impact of these studies.  In neuroimaging, as in other scientific fields, errors will be made and some results will not replicate.  Still, over time some measure of truth will accrue.  This chapter is not intended to be an accusation against fMRI as a method.  Quite the contrary, it is meant to increase the understanding of how much each fMRI result can contribute to scientific knowledge.  If only 30% of the significant voxels in a cluster will replicate then that value represents an important piece of contextual information to be aware of.  Likewise, if the magnitude of a voxel is only reliable at a level of ICC = 0.50 then that value represents important information when examining scatter plots comparing estimates of activity against a behavioral measure.</p>
<p>There are a variety of methods that can be used to evaluate reliability, and each can provide information on unique aspects of the results.  Our findings speak strongly to the question of why there is no agreed-upon average value for fMRI reliability.  There are so many factors spread out across so many levels of influence that it is almost impossible to summarize the reliability of fMRI with a single value.  While our average ICC value of 0.50 and our average overlap value of 30% are effective summaries of fMRI as a whole, these values may be higher or lower on a study-to-study basis.  The best characterization of fMRI reliability would be to give a window within which fMRI results are typically reliable.  Breaking up the range of 0.0 to 1.0 into thirds, it is appropriate to say that most fMRI results are reliable in the ICC = 0.33 to 0.66 range.</p>
<p>To conclude, functional neuroimaging with fMRI is no longer in its infancy.  Instead it has reached a point of adolescence, where knowledge and methods have made enormous progress but there is still much development left to be done.  Our growing pains from this point forward are going to be a more complete understanding of its strengths, weaknesses, and limitations.  A working knowledge of fMRI reliability is key to this understanding.  The reliability of fMRI may not be high relative to other scientific measures, but fMRI is presently the best tool available for the in vivo investigation of brain function.  </p>
<p><center><br />
<h4>- References -</h4>
<p></center></p>
<p>Andersson, J.L., Hutton, C., Ashburner, J., Turner, R., Friston, K., 2001. Modeling geometric deformations in EPI time series. Neuroimage 13, 903-919.</p>
<p>Aron, A.R., Gluck, M.A., Poldrack, R.A., 2006. Long-term test-retest reliability of functional MRI in a classification learning task. Neuroimage 29, 1000-1006.</p>
<p>Bandettini, P.A., Wong, E.C., Jesmanowicz, A., Hinks, R.S., Hyde, J.S., 1994. Spin-echo and gradient-echo EPI of human brain activation using BOLD contrast: a comparative study at 1.5 T. NMR Biomed 7, 12-20.</p>
<p>Bartko, J., 1966. The intraclass correlation coefficient as a measure of reliability. Psychological Reports 19, 3-11.</p>
<p>Bennett, C.M., Guerin, S.A., Miller, M.B., 2009. The impact of experimental design on the detection of individual variability in fMRI. Cognitive Neuroscience Society, San Francisco, CA.</p>
<p>Bennett, C.M., Wolford, G.L., Miller, M.B., in press. The principled control of false positives in neuroimaging. Social Cognitive and Affective Neuroscience.</p>
<p>Bodurka, J., Ye, F., Petridou, N., Bandettini, P.A., 2005. Determination of the brain tissue-specific temporal signal to noise limit of 3 T BOLD-weighted time course data. Proc. Intl. Soc. Mag. Reson. Med., Miami.</p>
<p>Bosnell, R., Wegner, C., Kincses, Z.T., Korteweg, T., Agosta, F., Ciccarelli, O., De Stefano, N., Gass, A., Hirsch, J., Johansen-Berg, H., Kappos, L., Barkhof, F., Mancini, L., Manfredonia, F., Marino, S., Miller, D.H., Montalban, X., Palace, J., Rocca, M., Enzinger, C., Ropele, S., Rovira, A., Smith, S., Thompson, A., Thornton, J., Yousry, T., Whitcher, B., Filippi, M., Matthews, P.M., 2008. Reproducibility of fMRI in the clinical setting: implications for trial designs. Neuroimage 42, 603-610.</p>
<p>Caceres, A., Hall, D.L., Zelaya, F.O., Williams, S.C., Mehta, M.A., 2009. Measuring fMRI reliability with the intra-class correlation coefficient. Neuroimage 45, 758-768.</p>
<p>Carrier, J., Monk, T.H., 2000. Circadian rhythms of performance: new trends. Chronobiol Int 17, 719-732.</p>
<p>Casey, B.J., Cohen, J.D., O&#8217;Craven, K., Davidson, R.J., Irwin, W., Nelson, C.A., Noll, D.C., Hu, X., Lowe, M.J., Rosen, B.R., Truwitt, C.L., Turski, P.A., 1998. Reproducibility of fMRI results across four institutions using a spatial working memory task. Neuroimage 8, 249-261.</p>
<p>Chen, E.E., Small, S.L., 2007. Test-retest reliability in fMRI of language: group and task effects. Brain Lang 102, 176-185.</p>
<p>Chiarelli, P.A., Bulte, D.P., Wise, R., Gallichan, D., Jezzard, P., 2007. A calibration method for quantitative BOLD fMRI based on hyperoxia. Neuroimage 37, 808-820.</p>
<p>Cicchetti, D., Sparrow, S., 1981. Developing criteria for establishing interrater reliability of specific items: Applications to assessment of adaptive behavior. Am J Ment Defic 86, 127-137.</p>
<p>Clement, F., Belleville, S., 2009. Test-retest reliability of fMRI verbal episodic memory paradigms in healthy older adults and in persons with mild cognitive impairment. Hum Brain Mapp.</p>
<p>Cohen, J., 1977. Statistical power analysis for the behavioral sciences, revised ed. Academic Press, New York, NY.</p>
<p>Cohen, M.S., DuBois, R.M., 1999. Stability, repeatability, and the expression of signal magnitude in functional magnetic resonance imaging. J Magn Reson Imaging 10, 33-40.</p>
<p>Costafreda, S.G., in press. Pooling fMRI data: meta-analysis, mega-analysis and multi-center studies. Frontiers in Neuroinformatics.</p>
<p>Costafreda, S.G., Brammer, M.J., Vencio, R.Z., Mourao, M.L., Portela, L.A., de Castro, C.C., Giampietro, V.P., Amaro, E., Jr., 2007. Multisite fMRI reproducibility of a motor task using identical MR systems. J Magn Reson Imaging 26, 1122-1126.</p>
<p>Dale, A., 1999. Optimal Experimental Design for Event-Related fMRI. Human Brain Mapping 8, 109-114.</p>
<p>Di Bonaventura, C., Vaudano, A.E., Carni, M., Pantano, P., Nucciarelli, V., Garreffa, G., Maraviglia, B., Prencipe, M., Bozzao, L., Manfredi, M., Giallonardo, A.T., 2005. Long-term reproducibility of fMRI activation in epilepsy patients with Fixation Off Sensitivity. Epilepsia 46, 1149-1151.</p>
<p>Duncan, K.J., Pattamadilok, C., Knierim, I., Devlin, J.T., 2009. Consistency and variability in functional localisers. Neuroimage 46, 1018-1026.</p>
<p>Eaton, K.P., Szaflarski, J.P., Altaye, M., Ball, A.L., Kissela, B.M., Banks, C., Holland, S.K., 2008. Reliability of fMRI for studies of language in post-stroke aphasia subjects. Neuroimage 41, 311-322.</p>
<p>Eickhoff, S.B., Laird, A.R., Grefkes, C., Wang, L.E., Zilles, K., Fox, P.T., 2009. Coordinate-based activation likelihood estimation meta-analysis of neuroimaging data: a random-effects approach based on empirical estimates of spatial uncertainty. Hum Brain Mapp 30, 2907-2926.</p>
<p>Feredoes, E., Postle, B.R., 2007. Localization of load sensitivity of working memory storage: quantitatively and qualitatively discrepant results yielded by single-subject and group-averaged approaches to fMRI group analysis. Neuroimage 35, 881-903.</p>
<p>Fernandez, G., Specht, K., Weis, S., Tendolkar, I., Reuber, M., Fell, J., Klaver, P., Ruhlmann, J., Reul, J., Elger, C.E., 2003. Intrasubject reproducibility of presurgical language lateralization and mapping using fMRI. Neurology 60, 969-975.</p>
<p>Freedman, R., 2003. Schizophrenia. N Engl J Med 349, 1738-1749.</p>
<p>Freyer, T., Valerius, G., Kuelz, A.K., Speck, O., Glauche, V., Hull, M., Voderholzer, U., 2009. Test-retest reliability of event-related functional MRI in a probabilistic reversal learning task. Psychiatry Res.</p>
<p>Friedman, L., Glover, G.H., 2006. Reducing interscanner variability of activation in a multicenter fMRI study: controlling for signal-to-fluctuation-noise-ratio (SFNR) differences. Neuroimage 33, 471-481.</p>
<p>Friedman, L., Stern, H., Brown, G.G., Mathalon, D.H., Turner, J., Glover, G.H., Gollub, R.L., Lauriello, J., Lim, K.O., Cannon, T., Greve, D.N., Bockholt, H.J., Belger, A., Mueller, B., Doty, M.J., He, J., Wells, W., Smyth, P., Pieper, S., Kim, S., Kubicki, M., Vangel, M., Potkin, S.G., 2008. Test-retest and between-site reliability in a multicenter fMRI study. Hum Brain Mapp 29, 958-972.</p>
<p>Gold, S., Christian, B., Arndt, S., Zeien, G., Cizadlo, T., Johnson, D.L., Flaum, M., Andreasen, N.C., 1998. Functional MRI statistical software packages: a comparative analysis. Hum Brain Mapp 6, 73-84.</p>
<p>Gountouna, V.E., Job, D.E., McIntosh, A.M., Moorhead, T.W., Lymer, G.K., Whalley, H.C., Hall, J., Waiter, G.D., Brennan, D., McGonigle, D.J., Ahearn, T.S., Cavanagh, J., Condon, B., Hadley, D.M., Marshall, I., Murray, A.D., Steele, J.D., Wardlaw, J.M., Lawrie, S.M., 2009. Functional Magnetic Resonance Imaging (fMRI) reproducibility and variance components across visits and scanning sites with a finger tapping task. Neuroimage.</p>
<p>Grafton, S., Hazeltine, E., Ivry, R., 1995. Functional mapping of sequence learning in normal humans. Journal of Cognitive Neuroscience 7, 497-510.</p>
<p>Harrington, G.S., Buonocore, M.H., Farias, S.T., 2006a. Intrasubject reproducibility of functional MR imaging activation in language tasks. AJNR Am J Neuroradiol 27, 938-944.</p>
<p>Harrington, G.S., Tomaszewski Farias, S., Buonocore, M.H., Yonelinas, A.P., 2006b. The intersubject and intrasubject reproducibility of FMRI activation during three encoding tasks: implications for clinical applications. Neuroradiology 48, 495-505.</p>
<p>Havel, P., Braun, B., Rau, S., Tonn, J.C., Fesl, G., Bruckmann, H., Ilmberger, J., 2006. Reproducibility of activation in four motor paradigms. An fMRI study. J Neurol 253, 471-476.</p>
<p>Hoenig, K., Kuhl, C.K., Scheef, L., 2005. Functional 3.0-T MR assessment of higher cognitive function: are there advantages over 1.5-T imaging? Radiology 234, 860-868.</p>
<p>Huang, J., Katsuura, T., Shimomura, Y., Iwanaga, K., 2006. Diurnal changes of ERP response to sound stimuli of varying frequency in morning-type and evening-type subjects. J Physiol Anthropol 25, 49-54.</p>
<p>Huettel, S.A., Song, A.W., McCarthy, G., 2008. Functional Magnetic Resonance Imaging, 2nd ed. Sinauer Associates, Sunderland, MA.</p>
<p>Jabbi, M., Keysers, C., Singer, T., Stephan, K.E., 2009. Response to &#8220;Voodoo Correlations in Social Neuroscience&#8221; by Vul et al.</p>
<p>Jansen, A., Menke, R., Sommer, J., Forster, A.F., Bruchmann, S., Hempleman, J., Weber, B., Knecht, S., 2006. The assessment of hemispheric lateralization in functional MRI&#8211;robustness and reproducibility. Neuroimage 33, 204-217.</p>
<p>Jezzard, P., Clare, S., 1999. Sources of distortion in functional MRI data. Hum Brain Mapp 8, 80-85.</p>
<p>Johnstone, T., Somerville, L.H., Alexander, A.L., Oakes, T.R., Davidson, R.J., Kalin, N.H., Whalen, P.J., 2005. Stability of amygdala BOLD response to fearful faces over multiple scan sessions. Neuroimage 25, 1112-1123.</p>
<p>Jovicich, J., Czanner, S., Greve, D., Haley, E., van der Kouwe, A., Gollub, R., Kennedy, D., Schmitt, F., Brown, G., Macfall, J., Fischl, B., Dale, A., 2006. Reliability in multi-site structural MRI studies: effects of gradient non-linearity correction on phantom and human data. Neuroimage 30, 436-443.</p>
<p>Kiebel, S., Holmes, A., 2007. The general linear model. In: Friston, K., Ashburner, J., Kiebel, S., Nichols, T., Penny, W. (Eds.), Statistical Parametric Mapping: The Analysis of Functional Brain Images. Academic Press, London.</p>
<p>Kiehl, K.A., Liddle, P.F., 2003. Reproducibility of the hemodynamic response to auditory oddball stimuli: a six-week test-retest study. Hum Brain Mapp 18, 42-52.</p>
<p>Kimberley, T.J., Khandekar, G., Borich, M., 2008. fMRI reliability in subjects with stroke. Exp Brain Res 186, 183-190.</p>
<p>Kong, J., Gollub, R.L., Webb, J.M., Kong, J.T., Vangel, M.G., Kwong, K., 2007. Test-retest study of fMRI signal change evoked by electroacupuncture stimulation. Neuroimage 34, 1171-1181.</p>
<p>Kruger, G., Glover, G.H., 2001. Physiological noise in oxygenation-sensitive magnetic resonance imaging. Magn Reson Med 46, 631-637.</p>
<p>Leontiev, O., Buxton, R.B., 2007. Reproducibility of BOLD, perfusion, and CMRO2 measurements with calibrated-BOLD fMRI. Neuroimage 35, 175-184.</p>
<p>Lieberman, M.D., Berkman, E.T., Wager, T.D., 2009. Correlations in social neuroscience aren&#8217;t voodoo: Commentary on Vul et al. (2009). Perspectives on Psychological Science 4.</p>
<p>Liou, M., Su, H.R., Savostyanov, A.N., Lee, J.D., Aston, J.A., Chuang, C.H., Cheng, P.E., 2009. Beyond p-values: averaged and reproducible evidence in fMRI experiments. Psychophysiology 46, 367-378.</p>
<p>Liu, J.Z., Zhang, L., Brown, R.W., Yue, G.H., 2004. Reproducibility of fMRI at 1.5 T in a strictly controlled motor task. Magn Reson Med 52, 751-760.</p>
<p>Loubinoux, I., Carel, C., Alary, F., Boulanouar, K., Viallard, G., Manelfe, C., Rascol, O., Celsis, P., Chollet, F., 2001. Within-session and between-session reproducibility of cerebral sensorimotor activation: a test&#8211;retest effect evidenced with functional magnetic resonance imaging. J Cereb Blood Flow Metab 21, 592-607.</p>
<p>MacDonald, S.W., Nyberg, L., Backman, L., 2006. Intra-individual variability in behavior: links to brain structure, neurotransmission and neuronal activity. Trends Neurosci 29, 474-480.</p>
<p>Machielsen, W.C., Rombouts, S.A., Barkhof, F., Scheltens, P., Witter, M.P., 2000. FMRI of visual encoding: reproducibility of activation. Hum Brain Mapp 9, 156-164.</p>
<p>Mackenzie-Graham, A.J., Van Horn, J.D., Woods, R.P., Crawford, K.L., Toga, A.W., 2008. Provenance in neuroimaging. Neuroimage 42, 178-195.</p>
<p>Magon, S., Basso, G., Farace, P., Ricciardi, G.K., Beltramello, A., Sbarbati, A., 2009. Reproducibility of BOLD signal change induced by breath holding. Neuroimage 45, 702-712.</p>
<p>Maitra, R., 2009. Assessing certainty of activation or inactivation in test-retest fMRI studies. Neuroimage 47, 88-97.</p>
<p>Maitra, R., Roys, S.R., Gullapalli, R.P., 2002. Test-retest reliability estimation of functional MRI data. Magn Reson Med 48, 62-70.</p>
<p>Maldjian, J.A., Laurienti, P.J., Driskill, L., Burdette, J.H., 2002. Multiple reproducibility indices for evaluation of cognitive functional MR imaging paradigms. AJNR Am J Neuroradiol 23, 1030-1037.</p>
<p>Manoach, D.S., Halpern, E.F., Kramer, T.S., Chang, Y., Goff, D.C., Rauch, S.L., Kennedy, D.N., Gollub, R.L., 2001. Test-retest reliability of a functional MRI working memory paradigm in normal and schizophrenic subjects. Am J Psychiatry 158, 955-958.</p>
<p>Marshall, I., Simonotto, E., Deary, I.J., Maclullich, A., Ebmeier, K.P., Rose, E.J., Wardlaw, J.M., Goddard, N., Chappell, F.M., 2004. Repeatability of motor and working-memory tasks in healthy older volunteers: assessment at functional MR imaging. Radiology 233, 868-877.</p>
<p>Mayer, A.R., Xu, J., Pare-Blagoev, J., Posse, S., 2006. Reproducibility of activation in Broca&#8217;s area during covert generation of single words at high field: a single trial FMRI study at 4 T. Neuroimage 32, 129-137.</p>
<p>McGonigle, D.J., Howseman, A.M., Athwal, B.S., Friston, K.J., Frackowiak, R.S., Holmes, A.P., 2000. Variability in fMRI: an examination of intersession differences. Neuroimage 11, 708-734.</p>
<p>Meindl, T., Teipel, S., Elmouden, R., Mueller, S., Koch, W., Dietrich, O., Coates, U., Reiser, M., Glaser, C., 2009. Test-retest reproducibility of the default-mode network in healthy individuals. Hum Brain Mapp.</p>
<p>Miki, A., Liu, G.T., Englander, S.A., Raz, J., van Erp, T.G., Modestino, E.J., Liu, C.J., Haselgrove, J.C., 2001. Reproducibility of visual activation during checkerboard stimulation in functional magnetic resonance imaging at 4 Tesla. Jpn J Ophthalmol 45, 151-155.</p>
<p>Miki, A., Raz, J., van Erp, T.G., Liu, C.S., Haselgrove, J.C., Liu, G.T., 2000. Reproducibility of visual activation in functional MR imaging and effects of postprocessing. AJNR Am J Neuroradiol 21, 910-915.</p>
<p>Mikl, M., Marecek, R., Hlustik, P., Pavlicova, M., Drastich, A., Chlebus, P., Brazdil, M., Krupa, P., 2008. Effects of spatial smoothing on fMRI group inferences. Magn Reson Imaging 26, 490-503.</p>
<p>Miller, M.B., Donovan, C.L., Van Horn, J.D., German, E., Sokol-Hessner, P., Wolford, G.L., 2009. Unique and persistent individual patterns of brain activity across different memory retrieval tasks. Neuroimage 48, 625-635.</p>
<p>Miller, M.B., Handy, T.C., Cutler, J., Inati, S., Wolford, G.L., 2001. Brain activations associated with shifts in response criterion on a recognition test. Can J Exp Psychol 55, 162-173.</p>
<p>Miller, M.B., Van Horn, J.D., Wolford, G.L., Handy, T.C., Valsangkar-Smyth, M., Inati, S., Grafton, S., Gazzaniga, M.S., 2002. Extensive individual differences in brain activations associated with episodic retrieval are reliable over time. J Cogn Neurosci 14, 1200-1214.</p>
<p>Morgan, V.L., Dawant, B.M., Li, Y., Pickens, D.R., 2007. Comparison of fMRI statistical software packages and strategies for analysis of images containing random and stimulus-correlated motion. Comput Med Imaging Graph 31, 436-446.</p>
<p>Morrison, P.D., Murray, R.M., 2005. Schizophrenia. Curr Biol 15, R980-984.</p>
<p>Moser, E., Teichtmeister, C., Diemling, M., 1996. Reproducibility and postprocessing of gradient-echo functional MRI to improve localization of brain activity in the human visual cortex. Magn Reson Imaging 14, 567-579.</p>
<p>Muller, R., Buttner, P., 1994. A critical discussion of intraclass correlation coefficients. Stat Med 13, 2465-2476.</p>
<p>Mumford, J.A., Nichols, T., 2009. Simple group fMRI modeling and inference. Neuroimage 47, 1469-1475.</p>
<p>Mumford, J.A., Nichols, T.E., 2008. Power calculation for group fMRI studies accounting for arbitrary design and temporal autocorrelation. Neuroimage 39, 261-268.</p>
<p>Mumford, J.A., Poldrack, R.A., Nichols, T., 2007. FMRIpower: A Power Calculation Tool for 2-Stage fMRI models. Human Brain Mapping, Chicago, IL.</p>
<p>Munneke, J., Heslenfeld, D.J., Theeuwes, J., 2008. Directing attention to a location in space results in retinotopic activation in primary visual cortex. Brain Res 1222, 184-191.</p>
<p>Murphy, K., Bodurka, J., Bandettini, P.A., 2007. How long to scan? The relationship between fMRI temporal signal to noise ratio and necessary scan duration. Neuroimage 34, 565-574.</p>
<p>Neumann, J., Lohmann, G., Zysset, S., von Cramon, D.Y., 2003. Within-subject variability of BOLD response dynamics. Neuroimage 19, 784-796.</p>
<p>Nichols, T., Brett, M., Andersson, J., Wager, T., Poline, J.B., 2005. Valid conjunction inference with the minimum statistic. Neuroimage 25, 653-660.</p>
<p>Nunnally, J., 1970. Introduction to psychological measurement. McGraw Hill, New York.</p>
<p>Oakes, T.R., Johnstone, T., Ores Walsh, K.S., Greischar, L.L., Alexander, A.L., Fox, A.S., Davidson, R.J., 2005. Comparison of fMRI motion correction software tools. Neuroimage 28, 529-543.</p>
<p>Ogawa, S., Menon, R.S., Tank, D.W., Kim, S.G., Merkle, H., Ellermann, J.M., Ugurbil, K., 1993. Functional brain mapping by blood oxygenation level-dependent contrast magnetic resonance imaging. A comparison of signal characteristics with a biophysical model. Biophys J 64, 803-812.</p>
<p>Peelen, M.V., Downing, P.E., 2005. Within-subject reproducibility of category-specific visual activation with functional MRI. Hum Brain Mapp 25, 402-408.</p>
<p>Peyron, R., Garcia-Larrea, L., Gregoire, M.C., Costes, N., Convers, P., Lavenne, F., Mauguiere, F., Michel, D., Laurent, B., 1999. Haemodynamic brain responses to acute pain in humans: sensory and attentional networks. Brain 122 ( Pt 9), 1765-1780.</p>
<p>Phan, K.L., Liberzon, I., Welsh, R.C., Britton, J.C., Taylor, S.F., 2003. Habituation of rostral anterior cingulate cortex to repeated emotionally salient pictures. Neuropsychopharmacology 28, 1344-1350.</p>
<p>Poldrack, R.A., Prabhakaran, V., Seger, C.A., Gabrieli, J.D., 1999. Striatal activation during acquisition of a cognitive skill. Neuropsychology 13, 564-574.</p>
<p>Poline, J.B., Strother, S.C., Dehaene-Lambertz, G., Egan, G.F., Lancaster, J.L., 2006. Motivation and synthesis of the FIAC experiment: Reproducibility of fMRI results across expert analyses. Hum Brain Mapp 27, 351-359.</p>
<p>Raemaekers, M., Vink, M., Zandbelt, B., van Wezel, R.J., Kahn, R.S., Ramsey, N.F., 2007. Test-retest reliability of fMRI activation during prosaccades and antisaccades. Neuroimage 36, 532-542.</p>
<p>Ramsey, N., Tallent, K., van Gelderen, P., Frank, J., Moonen, C., Weinberger, D., 1996. Reproducibility of Human 3D fMRI Brain Maps Acquired During a Motor Task. Human Brain Mapping 4, 113-121.</p>
<p>Rau, S., Fesl, G., Bruhns, P., Havel, P., Braun, B., Tonn, J.C., Ilmberger, J., 2007. Reproducibility of activations in Broca area with two language tasks: a functional MR imaging study. AJNR Am J Neuroradiol 28, 1346-1353.</p>
<p>Rombouts, S.A., Barkhof, F., Hoogenraad, F.G., Sprenger, M., Scheltens, P., 1998. Within-subject reproducibility of visual activation patterns with functional magnetic resonance imaging using multislice echo planar imaging. Magn Reson Imaging 16, 105-113.</p>
<p>Rombouts, S.A., Barkhof, F., Hoogenraad, F.G., Sprenger, M., Valk, J., Scheltens, P., 1997. Test-retest analysis with functional MR of the activated area in the human visual cortex. AJNR Am J Neuroradiol 18, 1317-1322.</p>
<p>Rostami, M., Hosseini, S.M., Takahashi, M., Sugiura, M., Kawashima, R., 2009. Neural bases of goal-directed implicit learning. Neuroimage 48, 303-310.</p>
<p>Rutten, G.J., Ramsey, N.F., van Rijen, P.C., van Veelen, C.W., 2002. Reproducibility of fMRI-determined language lateralization in individual subjects. Brain Lang 80, 421-437.</p>
<p>Safrit, M., 1976. Reliability theory. American Alliance for Health, Physical Education, and Recreation, Washington, DC.</p>
<p>Salli, E., Korvenoja, A., Visa, A., Katila, T., Aronen, H.J., 2001. Reproducibility of fMRI: effect of the use of contextual information. Neuroimage 13, 459-471.</p>
<p>Salthouse, T.A., Nesselroade, J.R., Berish, D.E., 2006. Short-term variability in cognitive performance and the calibration of longitudinal change. J Gerontol B Psychol Sci Soc Sci 61, P144-151.</p>
<p>Schunck, T., Erb, G., Mathis, A., Jacob, N., Gilles, C., Namer, I.J., Meier, D., Luthringer, R., 2008. Test-retest reliability of a functional MRI anticipatory anxiety paradigm in healthy volunteers. J Magn Reson Imaging 27, 459-468.</p>
<p>Shehzad, Z., Kelly, A.M., Reiss, P.T., Gee, D.G., Gotimer, K., Uddin, L.Q., Lee, S.H., Margulies, D.S., Roy, A.K., Biswal, B.B., Petkova, E., Castellanos, F.X., Milham, M.P., 2009. The resting brain: unconstrained yet reliable. Cereb Cortex 19, 2209-2229.</p>
<p>Shrout, P., Fleiss, J., 1979. Intraclass Correlations: Uses in Assessing Rater Reliability. Psychological Bulletin 86, 420-428.</p>
<p>Simmons, W.K., Reddish, M., Bellgowan, P.S., Martin, A., 2009. The Selectivity and Functional Connectivity of the Anterior Temporal Lobes. Cereb Cortex.</p>
<p>Smith, A.T., Singh, K.D., Balsters, J.H., 2007. A comment on the severity of the effects of non-white noise in fMRI time-series. Neuroimage 36, 282-288.</p>
<p>Smith, S.M., Beckmann, C.F., Ramnani, N., Woolrich, M.W., Bannister, P.R., Jenkinson, M., Matthews, P.M., McGonigle, D.J., 2005. Variability in fMRI: a re-examination of inter-session differences. Hum Brain Mapp 24, 248-257.</p>
<p>Specht, K., Willmes, K., Shah, N.J., Jancke, L., 2003. Assessment of reliability in functional imaging studies. J Magn Reson Imaging 17, 463-471.</p>
<p>Stark, R., Schienle, A., Walter, B., Kirsch, P., Blecker, C., Ott, U., Schafer, A., Sammer, G., Zimmermann, M., Vaitl, D., 2004. Hemodynamic effects of negative emotional pictures &#8211; a test-retest analysis. Neuropsychobiology 50, 108-118.</p>
<p>Sterr, A., Shen, S., Zaman, A., Roberts, N., Szameitat, A., 2007. Activation of SI is modulated by attention: a random effects fMRI study using mechanical stimuli. Neuroreport 18, 607-611.</p>
<p>Strother, S., La Conte, S., Kai Hansen, L., Anderson, J., Zhang, J., Pulapura, S., Rottenberg, D., 2004. Optimizing the fMRI data-processing pipeline using prediction and reproducibility performance metrics: I. A preliminary group analysis. Neuroimage 23 Suppl 1, S196-207.</p>
<p>Strother, S.C., Anderson, J., Hansen, L.K., Kjems, U., Kustra, R., Sidtis, J., Frutiger, S., Muley, S., LaConte, S., Rottenberg, D., 2002. The quantitative evaluation of functional neuroimaging experiments: the NPAIRS data analysis framework. Neuroimage 15, 747-771.</p>
<p>Swallow, K.M., Braver, T.S., Snyder, A.Z., Speer, N.K., Zacks, J.M., 2003. Reliability of functional localization using fMRI. Neuroimage 20, 1561-1577.</p>
<p>Symms, M.R., Allen, P.J., Woermann, F.G., Polizzi, G., Krakow, K., Barker, G.J., Fish, D.R., Duncan, J.S., 1999. Reproducible localization of interictal epileptiform discharges using EEG-triggered fMRI. Phys Med Biol 44, N161-168.</p>
<p>Tegeler, C., Strother, S.C., Anderson, J.R., Kim, S.G., 1999. Reproducibility of BOLD-based functional MRI obtained at 4 T. Hum Brain Mapp 7, 267-283.</p>
<p>Thomason, M.E., Foland, L.C., Glover, G.H., 2007. Calibration of BOLD fMRI using breath holding reduces group variance during a cognitive task. Hum Brain Mapp 28, 59-68.</p>
<p>Triantafyllou, C., Hoge, R.D., Krueger, G., Wiggins, C.J., Potthast, A., Wiggins, G.C., Wald, L.L., 2005. Comparison of physiological noise at 1.5 T, 3 T and 7 T and optimization of fMRI acquisition parameters. Neuroimage 26, 243-250.</p>
<p>Turkeltaub, P.E., Guinevere, F.E., Jones, K.M., Zeffiro, T.A., 2002. Meta-Analysis of the Functional Neuroanatomy of Single-Word Reading: Method and Validation. Neuroimage 16, 765-780.</p>
<p>Turner, R., Jezzard, P., Wen, H., Kwong, K.K., Le Bihan, D., Zeffiro, T., Balaban, R.S., 1993. Functional mapping of the human visual cortex at 4 and 1.5 tesla using deoxygenation contrast EPI. Magn Reson Med 29, 277-279.</p>
<p>Van Horn, J.D., Ellmore, T.M., Esposito, G., Berman, K.F., 1998. Mapping voxel-based statistical power on parametric images. Neuroimage 7, 97-107.</p>
<p>Van Horn, J.D., Toga, A.W., 2009. Multisite neuroimaging trials. Curr Opin Neurol 22, 370-378.</p>
<p>Vul, E., Harris, C., Winkielman, P., Pashler, H., 2009. Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition. Perspectives on Psychological Science 4.</p>
<p>Wager, T.D., Nichols, T., 2003. Optimization of experimental design in fMRI: a general framework using a genetic algorithm. Neuroimage 18, 293-309.</p>
<p>Wagner, K., Frings, L., Quiske, A., Unterrainer, J., Schwarzwald, R., Spreer, J., Halsband, U., Schulze-Bonhage, A., 2005. The reliability of fMRI activations in the medial temporal lobes in a verbal episodic memory task. Neuroimage 28, 122-131.</p>
<p>Waites, A.B., Shaw, M.E., Briellmann, R.S., Labate, A., Abbott, D.F., Jackson, G.D., 2005. How reliable are fMRI-EEG studies of epilepsy? A nonparametric approach to analysis validation and optimization. Neuroimage 24, 192-199.</p>
<p>Waldvogel, D., van Gelderen, P., Immisch, I., Pfeiffer, C., Hallett, M., 2000. The variability of serial fMRI data: correlation between a visual and a motor task. Neuroreport 11, 3843-3847.</p>
<p>Wei, X., Yoo, S.S., Dickey, C.C., Zou, K.H., Guttmann, C.R., Panych, L.P., 2004. Functional MRI of auditory verbal working memory: long-term reproducibility analysis. Neuroimage 21, 1000-1008.</p>
<p>Whalley, H.C., Gountouna, V.E., Hall, J., McIntosh, A.M., Simonotto, E., Job, D.E., Owens, D.G., Johnstone, E.C., Lawrie, S.M., 2009. fMRI changes over time and reproducibility in unmedicated subjects at high genetic risk of schizophrenia. Psychol Med 39, 1189-1199.</p>
<p>White, T., O&#8217;Leary, D., Magnotta, V., Arndt, S., Flaum, M., Andreasen, N.C., 2001. Anatomic and functional variability: the effects of filter size in group fMRI data analysis. Neuroimage 13, 577-588.</p>
<p>Woolrich, M.W., Ripley, B.D., Brady, M., Smith, S.M., 2001. Temporal autocorrelation in univariate linear modeling of FMRI data. Neuroimage 14, 1370-1386.</p>
<p>Yetkin, F.Z., McAuliffe, T.L., Cox, R., Haughton, V.M., 1996. Test-retest precision of functional MR in sensory and motor task activation. AJNR Am J Neuroradiol 17, 95-98.</p>
<p>Yoo, S.S., O&#8217;Leary, H.M., Lee, J.H., Chen, N.K., Panych, L.P., Jolesz, F.A., 2007. Reproducibility of trial-based functional MRI on motor imagery. Int J Neurosci 117, 215-227.</p>
<p>Yoo, S.S., Wei, X., Dickey, C.C., Guttmann, C.R., Panych, L.P., 2005. Long-term reproducibility analysis of fMRI using hand motor task. Int J Neurosci 115, 55-77.</p>
<p>Zandbelt, B.B., Gladwin, T.E., Raemaekers, M., van Buuren, M., Neggers, S.F., Kahn, R.S., Ramsey, N.F., Vink, M., 2008. Within-subject variation in BOLD-fMRI signal changes across repeated measurements: quantification and implications for sample size. Neuroimage 42, 196-206.</p>
<p>Zhang, J., Anderson, J.R., Liang, L., Pulapura, S.K., Gatewood, L., Rottenberg, D.A., Strother, S.C., 2009. Evaluation and optimization of fMRI single-subject processing pipelines with NPAIRS and second-level CVA. Magn Reson Imaging 27, 264-278.</p>
<p>Zhang, J., Liang, L., Anderson, J.R., Gatewood, L., Rottenberg, D.A., Strother, S.C., 2008. A Java-based fMRI processing pipeline evaluation system for assessment of univariate general linear model and multivariate canonical variate analysis-based pipelines. Neuroinformatics 6, 123-134.</p>
<p>Zhilkin, P., Alexander, M.E., 2004. Affine registration: a comparison of several programs. Magn Reson Imaging 22, 55-66.</p>
<p>Zou, K.H., Greve, D.N., Wang, M., Pieper, S.D., Warfield, S.K., White, N.S., Manandhar, S., Brown, G.G., Vangel, M.G., Kikinis, R., Wells, W.M., 3rd, 2005. Reproducibility of functional MR imaging: preliminary results of prospective multi-institutional study performed by Biomedical Informatics Research Network. Radiology 237, 781-789.</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2010/02/paper-how-reliable-are-the-results-from-functional-magnetic-resonance-imaging/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Quote of the Week &#8211; Pashler</title>
		<link>http://prefrontal.org/blog/2010/01/quote-of-the-week-pashler/</link>
		<comments>http://prefrontal.org/blog/2010/01/quote-of-the-week-pashler/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 02:38:48 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[MRI]]></category>
		<category><![CDATA[Quotes]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=960</guid>
		<description><![CDATA[“It’s hellishly complicated, this data analysis, and that creates great opportunity for inadvertent mischief.” &#8211; Hal Pashler (As seen in Science News)]]></description>
			<content:encoded><![CDATA[<p>“It’s hellishly complicated, this data analysis, and that creates great opportunity for inadvertent mischief.” &#8211; <a href="http://www.pashler.com/">Hal Pashler</a> (As seen in <a href="http://www.sciencenews.org/view/feature/id/50295/title/Trawling_the_brain">Science News</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2010/01/quote-of-the-week-pashler/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>PAPER: The Principled Control of False Positives in Neuroimaging</title>
		<link>http://prefrontal.org/blog/2009/12/paper-the-principled-control-of-false-positives-in-neuroimaging/</link>
		<comments>http://prefrontal.org/blog/2009/12/paper-the-principled-control-of-false-positives-in-neuroimaging/#comments</comments>
		<pubDate>Thu, 17 Dec 2009 13:30:19 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[MRI]]></category>
		<category><![CDATA[Psychology]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=861</guid>
		<description><![CDATA[- Current Citation: Bennett CM, Wolford GL, Miller MB. (in press). The Principled Control of False Positives in Neuroimaging. Social Cognitive and Affective Neuroscience. - Abstract: An incredible amount of data is generated in the course of a functional neuroimaging experiment. The quantity of data gives us improved temporal and spatial resolution with which to [...]]]></description>
			<content:encoded><![CDATA[<p><strong>- Current Citation:</strong><br />
Bennett CM, Wolford GL, Miller MB. (in press). The Principled Control of False Positives in Neuroimaging.  <em>Social Cognitive and Affective Neuroscience.</em></p>
<p><strong>- Abstract:</strong><br />
An incredible amount of data is generated in the course of a functional neuroimaging experiment.  The quantity of data gives us improved temporal and spatial resolution with which to evaluate our results.  It also creates a staggering multiple testing problem.  A number of methods have been created that address the multiple testing problem in neuroimaging in a principled fashion.  These methods place limits on either the familywise error rate (FWER) or the false discovery rate (FDR) of the results.  These principled approaches are well established in the literature and are known to properly limit the amount of false positives across the whole brain. However, a minority of papers are still published every month using methods that are improperly corrected for the number of tests conducted.  These latter methods place limits on the voxelwise probability of a false positive and yield no information on the global rate of false positives in the results.  In this commentary we argue in favor of a principled approach to the multiple testing problem &#8211; one that places appropriate limits on the rate of false positives across the whole brain and gives the reader the information they need to properly evaluate the results.</p>
<p><strong>- Downloadable Versions:</strong><br />
[<a href="http://prefrontal.org/files/papers/Bennett-SCAN-2009.pdf">Manuscript PDF</a>]<br />
[<a href="http://scan.oxfordjournals.org/content/4/4/417.abstract">Link to Journal PDF</a>]</p>
<p><span id="more-861"></span><br />
<strong>- Full Text:</strong><br />
Balancing the treatment of false positives against the treatment of false negatives is a fine line that every scientist must walk.  If our criteria are too conservative we will not have the power to detect meaningful results.  If our thresholds are too liberal our results will become contaminated by an excess of false positives.  Ideally, we hope to maximize the number of true positives (hits) while minimizing false reports.</p>
<p>We must adapt our threshold criteria to the number of statistical tests completed on the same dataset.  This multiple testing problem is not unique to neuroimaging; it affects many areas of modern science.  Ask an economist about finding market correlations between 10,000 stocks or a geneticist about testing across 100,000 SNPs and you will quickly understand the pervasiveness of the multiple testing problem throughout scientific research (Storey and Tibshirani 2003; Taleb 2004).</p>
<p>In this paper we argue for the use of principled corrections when dealing with the large number of comparisons typical of neuroimaging data.  By principled, we mean a correction that definitively identifies for the reader the probability or the proportion of false positives that could be expected in the reported results.  Ideally, the correction would be easy for the reader to understand.  Many researchers have avoided principled correction due to the perception that such methods are too conservative.  In theory and in practice, there is no reason for a principled correction to be either liberal or conservative.  The degree of &#8216;conservativeness&#8217; can generally be adjusted by setting a parameter, while maintaining accurate knowledge about the prevalence of false positives.  Later in the commentary, we will outline familywise error rate (FWER) correction and false discovery rate (FDR) correction as two examples of principled correction.</p>
<p><strong>The Problem</strong></p>
<p>Many published fMRI papers use arbitrary, uncorrected statistical thresholds.  A commonly chosen threshold is p < 0.001 with a minimum cluster extent of 10 voxels.  For a few datasets this threshold may strike an appropriate balance between sensitivity and specificity, and in a few cases it might be possible to specify the probability of a false positive with this threshold.  However, this uncorrected cutoff cannot be valid for the diverse array of situations in which it is used.  The same threshold has been used with data comprising 10,000 voxels and with data comprising 60,000 voxels.  It cannot be appropriate for both, because the two situations have very different probabilities of false positives.  The use of a principled procedure would yield the same expected probability or proportion of false positives for any number of voxels under investigation.</p>
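<p>To make the dependence on the number of voxels concrete, here is a minimal sketch under two simplifying assumptions that are ours and not part of any analysis package: every voxel is an independent test, and every null hypothesis is true.  The function names are hypothetical, and the cluster extent criterion and spatial correlation between voxels are ignored.</p>

```python
# Sketch only: assumes independent tests and that every null hypothesis is true.
# Spatial correlation between voxels and the 10-voxel extent rule are ignored.

def expected_false_positives(n_tests, alpha):
    """Expected number of false positives when all n_tests nulls are true."""
    return n_tests * alpha


def familywise_error_rate(n_tests, alpha):
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n_voxels in (10_000, 60_000):
    expected = expected_false_positives(n_voxels, 0.001)
    fwer = familywise_error_rate(n_voxels, 0.001)
    print(f"{n_voxels} voxels: about {expected:.0f} false positives expected, "
          f"FWER = {fwer:.5f}")
```

<p>Under these assumptions the same voxelwise cutoff of 0.001 corresponds to roughly 10 expected false positive voxels in the smaller dataset and roughly 60 in the larger one, which is why a single uncorrected threshold cannot carry the same meaning in both.</p>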
<p>In a recent survey of all articles published in six major neuroimaging journals during the year 2008 we found that between 25% and 30% of fMRI articles in each journal used uncorrected thresholds in their analysis (Bennett, Baird et al. Under Review).  This percentage speaks to the fact that the majority of published research uses principled correction.  However, the survey also highlights that a quarter to a third of published papers do not use principled correction, and that such papers continue to be published in high-impact, specialized journals.  The proportion of studies using uncorrected thresholds is even higher within the realm of conference posters and presentations.  In a survey of posters presented at a recent neuroscience conference we found that 80% of the presentations used uncorrected thresholds.  In these unprincipled cases the reader is unlikely to have an accurate idea about the true likelihood of false positives in the results.</p>
<p>The prevalence of unprincipled correction in the literature is a serious issue.  During an examination of familywise error correction methods in neuroimaging, Nichols and Hayasaka (2003) compared techniques that included Gaussian Random Field Theory, Bonferroni, FDR, Šidák, and permutation.  They found that only 8 out of 11 fMRI and PET studies had any significant voxels after familywise correction had been completed, leaving three studies with no significant voxels at all.  Based on these data it is quite likely that results composed wholly of false positives are present in the current literature.  Despite this fact, new studies reporting uncorrected statistics are published every month.</p>
<p>False positives can be costly in a number of ways.  One example of the negative consequences of false positives is a study completed by one of the current authors (MBM) in graduate school.  He conducted an fMRI study investigating differential activations between false memories and true memories using the Roediger and McDermott word paradigm (1995).  At the same time Schacter and colleagues were conducting a PET study using the same approach.  Using a liberal uncorrected threshold Schacter and colleagues found a few small regions of interest in the medial temporal lobe and superior temporal sulcus (Schacter, Reiman et al. 1996).  In their own results Miller and colleagues found two very different small clusters in the frontal and parietal cortex.  When the Miller et al. study was presented at the Society for Neuroscience conference (Miller, Erickson et al. 1996) it was made clear that multiple testing correction was necessary.  None of the results survived correction and the study was never released, while the uncorrected Schacter results were published in a major neuroimaging journal.  Since that time there has been a scattering of studies reporting different patterns of brain activations for false memories and for true memories.  Virtually all of them have used uncorrected thresholds and have proven difficult to replicate.  This situation raises two issues.  The first issue is the amount of time and resources that have been spent trying to extend results that may never have existed in the first place.  The second issue is the skewed view that now prevails in the literature: because only reports with positive results are published, it appears that brain activations can reliably distinguish false memories from true memories.</p>
<p>Less rigorous control of Type I errors would not be so bad if inferences based on false positives were easily correctable.  However, this does not seem to be the case within the current model of publication.  If researchers failed to reproduce the results of a currently published study it would be quite difficult to disseminate their null findings.  This forms one of the most profound differences between Type I and Type II error: false negatives are correctable in future publications while false positives are difficult to refute once established in the literature.  </p>
<p>This imbalance in the propagation of Type I and Type II errors contributes to an issue known as the ‘File Drawer Problem’ (Rosenthal 1979).  This refers to the publication bias that ensues because the probability of a study being published is directly tied to the significance of a result.  While presentation of null results is not unheard of (see Baker, Hutchinson, 2007) such publications are generally considered the exception and not the rule.</p>
<p>Another important cautionary tale is our recent investigation of false positives during the acquisition of fMRI data from a dead Atlantic salmon (Bennett, Baird et al. 2009; Bennett, Baird et al. Under Review).  Using standard acquisition, preprocessing, and analysis techniques we were able to show that active voxel clusters could be observed in the dead salmon’s brain when using uncorrected statistical thresholds.  If any form of correction for multiple testing was applied these false positives were no longer present.  While the dead salmon study can only speak to the role of principled correction in a single subject, we believe it effectively illustrates the dangers of false positives in any neuroimaging analysis.</p>
<p>A bit of clarification may be important at this point.  Our goal should not be to completely eliminate false positives.  To be completely certain that all of our results are true positives would require obscenely high statistical thresholds that would eliminate all but the very strongest of our legitimate results.  Therefore we must accept that there will always be some risk of false positives in our reports.  At the same time, it is critical that we be able to specify how probable false positives are in our data in a way that is readily communicated to the reader.  </p>
<p>In this discussion of false positives it is also important that we not minimize the danger of high false negative rates.  Being over-conservative regarding the control of Type I error comes at the expense of missing true positives.  Perhaps for this reason there have been some voices in the imaging community that argue against principled correction due to the resulting loss of statistical power.  Again, a principled correction does not necessarily lead to a loss of power.  The researcher can set a liberal criterion in FDR or FWE and the readers can use their precise knowledge of the false positive rate to evaluate the reported results.  </p>
<p><strong>Our Argument</strong></p>
<p>There is a single key argument that we wish to make regarding proper protection against Type I error in fMRI.  <strong>All researchers should use statistical methods that provide information on the Type I error rate across the whole brain.</strong>  It doesn’t matter what method you use to accomplish this.  You can report the false discovery rate (Benjamini and Hochberg 1995), or use one of several methods to control for the familywise error rate (Nichols and Hayasaka 2003).  You can even do a back-of-the-napkin calculation and use a Bonferroni-corrected threshold if you wish.  The end goal is the same: giving the reader information on the prevalence of false positives across the entire family of statistical tests.</p>
<p><center><br />
<table width='500'>
<tr>
<td>
<a href="http://prefrontal.org/blog/wp-content/uploads/2009/12/Bennett-2009-SCAN-Figure1LG.jpg"><img src="http://prefrontal.org/blog/wp-content/uploads/2009/12/Bennett-2009-SCAN-Figure1.jpg" alt="Bennett-2009-SCAN-Figure1" title="Bennett-2009-SCAN-Figure1" width="500" height="127"></a>
</td>
</tr>
<tr>
<td>
Figure 1.  Example figure of a hybrid corrected/uncorrected data presentation.  Areas that are significant under an uncorrected threshold of p < 0.001 with a 10-voxel extent criterion are shaded in blue.  Areas that are significant under a corrected threshold of FDR = 0.05 are shaded in orange.
</td>
</tr>
</table>
<p></center></p>
<p>We would further argue that an investigator could still use an uncorrected threshold for their data as long as proper corrected values detailing the prevalence of false positives are also provided.  In this manner you could threshold your data at p < 0.001 with a 10 voxel extent as long as you presented what FDR or FWE threshold would be required for the results to stay significant.  One example can be seen in the above figure.  In this image voxels that survive an uncorrected threshold are depicted in cool colors while voxels that survive FDR correction are depicted in warm colors.  This allows a researcher to ‘have their cake and eat it too’.  Again, the key to our argument is not that we need to use correction simply for correction’s sake, just that our readers are made aware of the false positive rate across the whole brain.</p>
<p><strong>Techniques For Principled Correction</strong></p>
<p>There is a wide variety of methods that can be used to hold the false positive rate at specified levels across the whole brain.  One approach is to place limits on the familywise error rate (FWER).  Using this method a criterion value of 0.05 would mean that there is a 5% chance of one or more false positives across the entire set of tests.  This yields a 95% confidence level that there are no false positives in your results.  There are many methods that can be used to control the FWER in neuroimaging data: the Bonferroni correction, the use of Gaussian Random Field Theory (Worsley, Evans et al. 1992), and nonparametric permutation correction techniques (Nichols and Holmes 2002).  Nichols and Hayasaka (2003) have authored an excellent article reviewing these techniques.  The Bonferroni correction is typically seen as too conservative for functional neuroimaging since it does not take into account spatial correlation between voxels.  Gaussian RFT adapts to spatial smoothness of the data, but was shown to be quite conservative at low levels of smoothness.  The use of permutation-based techniques to control the FWER emerged as an ideal choice for adequate correction while maintaining high sensitivity.</p>
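<p>The simplest FWER corrections reduce to a per-test threshold.  The sketch below is illustrative only, with hypothetical function names, and shows the voxelwise cutoff implied by a Bonferroni correction and by a Šidák correction.  The Šidák form assumes independent tests, while the Bonferroni bound holds under any dependence.  Neither accounts for spatial smoothness in the way random field or permutation methods do.</p>

```python
import math


def bonferroni_threshold(fwer, n_tests):
    """Per-test p-value cutoff that bounds the familywise error rate by fwer."""
    return fwer / n_tests


def sidak_threshold(fwer, n_tests):
    """Per-test cutoff 1 - (1 - fwer) ** (1 / n_tests), exact for independent tests.

    Written with log1p/expm1 so that very small cutoffs keep their precision.
    """
    return -math.expm1(math.log1p(-fwer) / n_tests)


for n_voxels in (10_000, 60_000):
    print(n_voxels,
          f"Bonferroni: {bonferroni_threshold(0.05, n_voxels):.3e}",
          f"Sidak: {sidak_threshold(0.05, n_voxels):.3e}")
```

<p>The Šidák cutoff is always slightly more lenient than the Bonferroni cutoff, but for tens of thousands of voxels the two are nearly identical, and both are far stricter than p < 0.001.</p>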
<p>Another approach to principled correction is to place limits on the false discovery rate (Benjamini and Hochberg 1995; Genovese, Lazar et al. 2002).  Using this method a criterion value of 0.05 would mean that on average no more than 5% of the observed results would be false positives.  The goal of this approach is not to completely eliminate familywise errors, but to control how pervasive false positives are in the results.  This is a weaker form of control for the multiple testing problem, but one that still places a precise limit on the expected percentage of false positives.</p>
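<p>As a minimal sketch of how FDR control works in practice, the step-up procedure of Benjamini and Hochberg (1995) can be written in a few lines.  The function name and the example p-values are hypothetical, and the procedure as written assumes independent (or positively dependent) tests.</p>

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which tests are declared significant.

    Step-up rule: sort the p-values, find the largest rank k such that
    p_(k) <= q * k / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten tests.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(benjamini_hochberg(example, q=0.05)), "of", len(example), "tests survive")
```

<p>Because the comparison value q * k / m grows with the rank k, the cutoff adapts to the data: the more small p-values there are, the more lenient the final threshold becomes.</p>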
<p>The advantages and disadvantages of each correction approach are illustrated graphically using simulated data in Figure 2.  The simulated data is set up so that the uncorrected results have a power of 0.80.  Controlling for the familywise error rate with the criterion p(FWE) = 0.05 can be seen to virtually eliminate false positives while dramatically reducing the amount of detected signal.  In this example power is reduced to 0.16.  Controlling the false discovery rate with the criterion FDR = 0.05 increases the number of false positives relative to FWER techniques, but also increases the ability to detect meaningful signal.  In this example power is increased to 0.54.</p>
<p><center><br />
<table width='500'>
<tr>
<td>
<a href="http://prefrontal.org/blog/wp-content/uploads/2009/12/Bennett-2009-SCAN-Figure2LG.jpg"><img src="http://prefrontal.org/blog/wp-content/uploads/2009/12/Bennett-2009-SCAN-Figure2.jpg" alt="Bennett-2009-SCAN-Figure2" title="Bennett-2009-SCAN-Figure2" width="500" height="508"></a>
</td>
</tr>
<tr>
<td>
Figure 2.  Demonstration of correction methods for the multiple testing problem.  a) A raw image of the simulated data used in this example.  A field of Gaussian random noise was added to a 100&#215;100 image with a 50&#215;50 square section of signal in the center.  b) Thresholded image of the simulated data using a pixelwise statistical test.  The threshold for this test was p < 0.05.  Power is high at 0.80, but a number of false positives can be observed.  c) Thresholded image of the simulated data using a Bonferroni FWER correction.  The probability of a familywise error was set to 0.05.  There are no false positives across the entire set of tests, but power is reduced to 0.16.  d) Thresholded image of the simulated data while controlling the false discovery rate.  The FDR for this example was set to 0.05.  Of the pixels declared significant, 4.9% are known to be false positives, but power is increased to 0.54.
</td>
</tr>
</table>
<p></center></p>
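<p>A simulation in the spirit of Figure 2 can be sketched as follows.  This is not the code that produced the figure and it will not reproduce the exact values reported there.  It assumes independent pixels, a one-sided pixelwise z test, and a signal strength of 2.49 standard deviations, chosen only so that uncorrected power comes out near 0.80.  All names and parameters are illustrative.</p>

```python
import random
from statistics import NormalDist


def simulate(effect=2.49, size=100, signal=50, alpha=0.05, seed=0):
    """Compare uncorrected, Bonferroni, and FDR thresholds on one noisy image."""
    rng = random.Random(seed)
    norm = NormalDist()
    lo = (size - signal) // 2
    hi = lo + signal
    p_values, is_signal = [], []
    for row in range(size):
        for col in range(size):
            truth = lo <= row < hi and lo <= col < hi  # central square holds signal
            z = rng.gauss(0.0, 1.0) + (effect if truth else 0.0)
            p_values.append(1.0 - norm.cdf(z))  # one-sided pixelwise test
            is_signal.append(truth)

    m = len(p_values)
    # Benjamini-Hochberg: largest sorted p value with p_(k) <= (k / m) * alpha.
    bh_cut = 0.0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank / m * alpha:
            bh_cut = p

    cutoffs = {"uncorrected": alpha, "bonferroni": alpha / m, "fdr": bh_cut}
    n_signal = sum(is_signal)
    results = {}
    for name, cut in cutoffs.items():
        hits = sum(1 for p, s in zip(p_values, is_signal) if s and p <= cut)
        false_pos = sum(1 for p, s in zip(p_values, is_signal) if not s and p <= cut)
        results[name] = {"power": hits / n_signal, "false_positives": false_pos}
    return results


for method, summary in simulate().items():
    print(method, summary)
```

<p>The qualitative pattern should match the figure: the uncorrected threshold detects the most signal and the most noise, Bonferroni detects very little of either, and FDR control falls in between.</p>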
<p>If you are concerned about power, you can appropriately adjust the cutoff in FWE or FDR.  For instance, it isn’t strictly necessary to use 0.05 in either FWE or FDR.  It might yield a better balance of power and false positive protection to use 0.10 or even something higher.  You will be more likely to find true sources of activation and the reader will still have a precise idea about the prevalence of false positives.</p>
<p>It is important to understand the appropriate use of the correction method you select.  For instance, one commonly used approach is the small-volume correction (SVC) method in SPM (http://www.fil.ion.ucl.ac.uk/spm/).  The use of SVC allows researchers to conduct principled correction using Gaussian Random Field Theory within a predefined region of interest.  Ideally this would be a region defined by anatomical boundaries or a region identified in a previous, independent dataset.  However, many researchers implement SVC incorrectly, choosing to first conduct a whole-brain exploratory analysis and then using SVC on the resulting clusters (cf. Loring, Meador et al. 2002; Poldrack and Mumford 2009).  This is an inappropriate approach that does not yield a principled correction.  Another method that is often incorrectly used is the AlphaSim tool included in AFNI (http://afni.nimh.nih.gov/afni/).  For effective false positive control AlphaSim requires that an estimate of the spatial correlation across voxels be modeled using the program 3dFWHM.  Many researchers simply input the amount of Gaussian smoothing that was applied during preprocessing, leading to incorrect clustering thresholds as output.  Errors during estimation of the spatial smoothness can also lead to incorrect values.</p>
<p>In the future we may have statistical methods that are better able to address the multiple testing problem.  Hierarchical Bayes models have been offered as one approach (Lindquist and Gelman 2009).  We may even move away from the binary decision of significance and begin to examine effect sizes in earnest (Wager 2009).  Still, we must examine the balance of Type I and Type II error in the context of where our analysis techniques are today.  At present the general linear model is by far the most prevalent method of analysis in fMRI.  Mumford and Nichols (2009) found that approximately 92% of group fMRI results were computed using an ordinary least squares (OLS) estimation of the general linear model.  This percentage is unlikely to shift dramatically in the next 12-36 months.  Our focus should remain on how to improve OLS methods in the near term as we move toward new analysis techniques in the future.</p>
<p><strong>Predetermined cluster size as a partial correction</strong></p>
<p>In neuroimaging we often rely on the fact that legitimate results tend to cluster together spatially.  The assumption is that voxel clustering provides some assurance against Type I errors.  While predefined thresholds in combination with predetermined clustering requirements may represent a sufficient approximation of a proper threshold, this is in general an unprincipled approach to the control of Type I error rates.</p>
<p>Many authors justify this approach by referring to the results of Forman et al. (1995), who examined clustering behavior of voxels in fMRI.  The results of Forman et al. suggest that a threshold of p < 0.001 combined with a 10 voxel extent requirement should more than adequately control for the prevalence of false positives.  However, the Forman et al. data was only computed across two-dimensional slices, not in 3D volumes.  The findings of Forman et al. simply do not apply to modern fMRI data.</p>
<p>It should also be noted that we are not arguing that p < 0.001 with a 10 voxel threshold is wholly inappropriate.  For example, Cooper and Knutson (2008) used the AlphaSim utility in AFNI to determine that a corrected threshold of p < 0.001 with a 10 voxel extent threshold would be appropriate to keep the familywise error rate at 5% in their particular dataset.  The problem is that this threshold is specific to the parameters of their dataset, and may be inappropriate in other datasets.  Arnott et al. (2008) used the same AFNI routine and estimated that an 81 voxel extent was required to ensure that familywise error was kept below 5%.  It is possible to use the combination of a p value and a cluster size in a principled way, but it requires computing the proper values for each and every analysis.  The cluster size criterion can change quite substantially from dataset to dataset.  Further, it can be the case that required cluster sizes become so large that legitimate results with a smaller volume are missed.</p>
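<p>The general idea of deriving a cluster extent by simulation can be illustrated with a toy example.  This is not AlphaSim and should not be used on real data.  It works on a small two-dimensional grid, approximates spatial smoothness with repeated 3&#215;3 mean filters, and re-standardizes each image with its own mean and standard deviation.  Every name and parameter here is a hypothetical chosen for illustration; a real analysis needs three-dimensional volumes and a smoothness estimate taken from the data.</p>

```python
import random
from statistics import NormalDist


def smooth(img, passes):
    """Blur a square 2D list with repeated 3x3 mean filters (edges use available neighbours)."""
    n = len(img)
    for _ in range(passes):
        out = [[0.0] * n for _ in range(n)]
        for r in range(n):
            for c in range(n):
                vals = [img[rr][cc]
                        for rr in range(max(0, r - 1), min(n, r + 2))
                        for cc in range(max(0, c - 1), min(n, c + 2))]
                out[r][c] = sum(vals) / len(vals)
        img = out
    return img


def largest_cluster(mask):
    """Size of the largest 4-connected cluster of True pixels."""
    n = len(mask)
    seen = [[False] * n for _ in range(n)]
    best = 0
    for r in range(n):
        for c in range(n):
            if mask[r][c] and not seen[r][c]:
                stack, size = [(r, c)], 0
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < n and 0 <= nx < n and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                best = max(best, size)
    return best


def cluster_extent_threshold(n=32, passes=1, p_thresh=0.001, fwe=0.05, n_sims=200, seed=0):
    """Smallest extent k such that at most a fraction fwe of null images hold a cluster of size >= k."""
    rng = random.Random(seed)
    z_cut = NormalDist().inv_cdf(1.0 - p_thresh)
    maxima = []
    for _ in range(n_sims):
        noise = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
        img = smooth(noise, passes)
        flat = [v for row in img for v in row]
        mean = sum(flat) / len(flat)
        sd = (sum((v - mean) ** 2 for v in flat) / len(flat)) ** 0.5
        # Re-standardize so the pixelwise cut still corresponds roughly to p_thresh.
        mask = [[(v - mean) / sd > z_cut for v in row] for row in img]
        maxima.append(largest_cluster(mask))
    maxima.sort()
    allowed = int(fwe * n_sims)           # null images permitted to exceed the extent
    idx = max(n_sims - allowed - 1, 0)
    return maxima[idx] + 1


for passes in (0, 1, 2):
    print("smoothing passes:", passes, "-> required extent:", cluster_extent_threshold(passes=passes))
```

<p>The point of the sketch is the dependence on the inputs: the extent returned is a function of the grid size, the smoothness, and the pixelwise threshold, so a value computed for one dataset cannot simply be carried over to another.</p>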
<p><strong>Conclusions</strong></p>
<p>The topic of proper Type I error protection is not a new element of discussion in the field of neuroimaging.  The need to correct for thousands of statistical tests has been recognized since the early PET imaging days (Worsley, Evans et al. 1992).  It is uncertain why uncorrected thresholds have lingered so long.  Perhaps many researchers simply recognized it as an accepted, arbitrary threshold in the same manner p < 0.05 is an accepted, arbitrary threshold throughout other scientific fields.  This approach may have been acceptable in the past, but within the last decade we, as a field, have come under increased scrutiny from the public and from other scientists.  At a time when so many are looking for us to slip up we believe it is time to set a new standard of quality with regard to our data acquisition and analysis.  </p>
<p>The fundamental question that all researchers must face is whether their results will replicate in a new study.  The prevalence of false positives in your results will directly influence this ability.  We are all aware that the multiple testing problem is a major issue in neuroimaging.  How you correct for this problem can be debated, but principled protection against Type I error is an absolute necessity moving forward.</p>
<p><strong>References</strong></p>
<p>Arnott, S. R., J. S. Cant, et al. (2008). &#8220;Crinkling and crumpling: an auditory fMRI study of material properties.&#8221; Neuroimage 43(2): 368-78.</p>
<p>Benjamini, Y. and Y. Hochberg (1995). &#8220;Controlling the false discovery rate: A practical and powerful approach to multiple testing.&#8221; J. Roy. Statist. Soc. Ser. B 57: 289-300.</p>
<p>Bennett, C. M., A. A. Baird, et al. (2009). Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic Salmon: An Argument For Proper Multiple Comparisons Correction. 15th Annual Meeting of the Organization for Human Brain Mapping. San Francisco, CA.</p>
<p>Bennett, C. M., A. A. Baird, et al. (Under Review). &#8220;Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic Salmon: An Argument For Proper Multiple Comparisons Correction.&#8221;</p>
<p>Cooper, J. C. and B. Knutson (2008). &#8220;Valence and salience contribute to nucleus accumbens activation.&#8221; Neuroimage 39(1): 538-47.</p>
<p>Forman, S. D., J. D. Cohen, et al. (1995). &#8220;Improved assessment of significant activation in functional magnetic resonance imaging (fMRI): use of a cluster-size threshold.&#8221; Magn Reson Med 33(5): 636-47.</p>
<p>Genovese, C. R., N. A. Lazar, et al. (2002). &#8220;Thresholding of statistical maps in functional neuroimaging using the false discovery rate.&#8221; Neuroimage 15(4): 870-8.</p>
<p>Lindquist, M. A. and A. Gelman (2009). &#8220;Correlations and Multiple Comparisons in Functional Imaging: A Statistical Perspective (Commentary on Vul et al., 2009).&#8221; Perspectives on Psychological Science 4(3): 310-313.</p>
<p>Loring, D. W., K. J. Meador, et al. (2002). &#8220;Now you see it, now you don&#8217;t: statistical and methodological considerations in fMRI.&#8221; Epilepsy Behav 3(6): 539-547.</p>
<p>Miller, E. K., C. A. Erickson, et al. (1996). &#8220;Neural mechanisms of visual working memory in prefrontal cortex of the macaque.&#8221; J Neurosci 16(16): 5154-67.</p>
<p>Mumford, J. A. and T. Nichols (2009). &#8220;Simple group fMRI modeling and inference.&#8221; Neuroimage 47(4): 1469-75.</p>
<p>Nichols, T. and S. Hayasaka (2003). &#8220;Controlling the familywise error rate in functional neuroimaging: a comparative review.&#8221; Stat Methods Med Res 12(5): 419-46.</p>
<p>Nichols, T. E. and A. P. Holmes (2002). &#8220;Nonparametric permutation tests for functional neuroimaging: a primer with examples.&#8221; Hum Brain Mapp 15(1): 1-25.</p>
<p>Poldrack, R. A. and J. A. Mumford (2009). &#8220;Independence in ROI analysis: where is the voodoo?&#8221; Soc Cogn Affect Neurosci 4(2): 208-13.</p>
<p>Roediger, H. L., 3rd and K. B. McDermott (1995). &#8220;Creating false memories: remembering words not presented in lists.&#8221; Journal of Experimental Psychology: Learning, Memory, and Cognition 21: 803-814.</p>
<p>Rosenthal, R. (1979). &#8220;The file drawer problem and tolerance for null results.&#8221; Psychological Bulletin 86(3): 638-641.</p>
<p>Schacter, D. L., E. Reiman, et al. (1996). &#8220;Neuroanatomical correlates of veridical and illusory recognition memory: evidence from positron emission tomography.&#8221; Neuron 17(2): 267-74.</p>
<p>Storey, J. D. and R. Tibshirani (2003). &#8220;Statistical significance for genomewide studies.&#8221; Proc Natl Acad Sci U S A 100(16): 9440-5.</p>
<p>Taleb, N. (2004). Fooled by randomness: the hidden role of chance in life and in the markets. New York, Thompson/Texere.</p>
<p>Wager, T. D. (2009). If neuroimaging is the answer, what is the question? Estimating Effects and Correlations in Neuroimaging Data Workshop, Columbia University, New York, NY.</p>
<p>Worsley, K. J., A. C. Evans, et al. (1992). &#8220;A three-dimensional statistical analysis for CBF activation studies in human brain.&#8221; Journal of Cerebral Blood Flow &#038; Metabolism 12(6): 900-918.</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2009/12/paper-the-principled-control-of-false-positives-in-neuroimaging/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Holiday Presents for a Neurogeek</title>
		<link>http://prefrontal.org/blog/2009/12/holiday-presents-for-a-neurogeek/</link>
		<comments>http://prefrontal.org/blog/2009/12/holiday-presents-for-a-neurogeek/#comments</comments>
		<pubDate>Thu, 17 Dec 2009 13:15:24 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[Miscellany]]></category>
		<category><![CDATA[Psychology]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=898</guid>
		<description><![CDATA[I know this post might be a bit late in the season to make much of an impact on your shopping plans, but if your loved ones can&#8217;t get enough neuroscience then here are some thoughts for great gifts. Some are specific to neuroscience, while others are more general and appropriate for any academic. Enjoy! [...]]]></description>
			<content:encoded><![CDATA[<p>I know this post might be a bit late in the season to make much of an impact on your shopping plans, but if your loved ones can&#8217;t get enough neuroscience then here are some thoughts for great gifts.  Some are specific to neuroscience, while others are more general and appropriate for any academic.  Enjoy!</p>
<hr />
<p>General Neuroscience.</p>
<p>- Book: <a href="http://www.amazon.com/gp/product/0878932860?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0878932860">Functional Magnetic Resonance Imaging</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=0878932860" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />, by Huettel, Song, and McCarthy.  ~$75<br />
I picked this up a few weeks ago since I heard it had a good section on signal and noise in fMRI.  What I found was, far and away, the best single introduction to fMRI that I have run across.  If I am ever fortunate enough to run my own lab then I will see to it that all new lab members are handed this book as soon as they step in the door.  It&#8217;s that good.</p>
<p>- Plush: <a href="http://www.thinkgeek.com/geektoys/plush/bc01/">Neuron</a> or set of <a href="http://www.thinkgeek.com/geektoys/plush/a55e/">Neurons</a>.  ~$12-$24<br />
How much cute can a few dollars buy?  Quite a bit, apparently.  I have a set of plush neurons in my office.  The best part is that they can slot into each other, forming neural networks!  I love it.</p>
<p>- T-Shirt: <a href="http://yellowibis.spreadshirt.com/yellowibis-com-medical-one-liners-men-s-unisex-heavyweight-t-i-love-brains-color-choice-A4609540">I &#x2665; Brains</a>.  ~$20<br />
Don&#8217;t hide your love, share it with the world.  While there may be other organs in the body, the brain is where it&#8217;s at.</p>
<p>- Poster: <a href="http://www.orkposters.com/brain.html">Think Hard Print</a> (Map of the brain&#8217;s surface).  ~$18<br />
The folks at Ork Posters are well-known for their city neighborhood posters.  In this case they turned their creative talent to the neighborhoods of the brain, and created a great piece of art.  It&#8217;s even anatomically correct.</p>
<p>- Book: <a href="http://www.amazon.com/gp/product/0452288525?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0452288525">This Is Your Brain on Music</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=0452288525" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />. ~$11<br />
I purchased this book on a whim two years ago and was very pleasantly surprised at how good it is.  Music (and dance) are a key part of the human condition.  With this book you can learn more about what makes music so special within the brain.</p>
<p>- Book: <a href="http://www.amazon.com/gp/product/0064603067?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0064603067">The Human Brain Coloring Book</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=0064603067" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />.  ~$15<br />
What coloring books do your kids have?  Disney?  Pokemon?  Upgrade them to something better &#8211; something that even med school students use to help learn neuroanatomy.  I purchased my first brain coloring book when I was an undergrad.  It was great then, and it remains great now.</p>
<p>- Tool: <a href="http://www.amazon.com/gp/product/012373603X?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=012373603X">Atlas of the Human Brain</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=012373603X" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />. ~$180+<br />
When you start getting serious about the brain then you are going to need a serious map to help guide you.  My personal standby is the Atlas of the Human Brain by Mai, Paxinos, and Assheuer.  It is a great reference book with excellent illustrations.  As a bonus the atlas comes with a DVD containing PDFs of all the book material.  Copy the DVD to your laptop and you will have your atlas with you everywhere you go.</p>
<p>- Tool: <a href="http://www.carolina.com/product/somso+human+brain+model,+8+parts.do">Somso Human Brain Model</a>. ~$LOTS<br />
One day someone will explain to me why plastic models of the human brain must cost hundreds of dollars.  For now I am a bit lost regarding their exorbitant cost.  Still, these models are incredibly handy to have around when discussing brain anatomy or function.  The link goes to one example of a human brain model, but there are many variations on the theme available.  It is not impossible to spend $1000+ on a really good version.  </p>
<hr />
<p>General Academia.</p>
<p>- Writing Tool: <a href="http://www.amazon.com/gp/product/8883701135?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=8883701135">Moleskine</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=8883701135" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> notebooks and <a href="http://www.dickblick.com/products/copic-multiliner-sp-pens/">Copic Multiliner SP</a> pens.  ~$8-$15<br />
There are times when academics are out there, on the front line.  Lab meetings.  Department presentations.  Lunch with a collaborator.  Conferences.  In these battles you need the best weapons you can get.  Don&#8217;t get caught with your pants down &#8211; always have solid instruments along with you.  It has taken years of careful testing, but I have settled on Moleskine notebooks and the Copic Multiliner SP pen.  Get the Moleskine with graph paper, and get the 0.35 mm tip Multiliner.  Make sure to get the SP series, because you <em>deserve</em> a rugged aluminum body.</p>
<p>- Writing Tool: <a href="http://www.amazon.com/gp/product/B0000W4MYI?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=B0000W4MYI">Any</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=B0000W4MYI" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> <a href="http://www.amazon.com/gp/product/B000KA4UYC?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=B000KA4UYC">kitchen</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=B000KA4UYC" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> <a href="http://www.amazon.com/gp/product/B000I9LDXG?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=B000I9LDXG">timer</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=B000I9LDXG" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> ~$15<br />
Sometimes I long for a typewriter when I am writing a new manuscript.  Part of the allure is the romance &#8211; feeding the paper in and hearing the click-clack of the hammers striking the page.  The biggest advantage though?  THERE IS NO INTERNET ON A TYPEWRITER.  If you know someone who is as distractible as I am then drop the $15 and buy them a kitchen timer.  Tell them to set it for twenty minutes and to work for that whole stretch.  Then, when the time has elapsed, they get ten minutes to do whatever they want.  This &#8216;dash&#8217; method has saved my bacon, and it is well worth the small cost to give it a try.  Learn more <a href="http://www.43folders.com/2005/09/08/kick-procrastinations-ass-run-a-dash">here</a>.</p>
<p>- Presentation Tool: <a href="http://www.amazon.com/gp/product/B000FPIUAW?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=B000FPIUAW">Kensington Wireless Clicker</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=B000FPIUAW" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />.  ~$35<br />
From the audience it can be a bit humorous when the speaker can&#8217;t seem to get their Powerpoint slides to advance.  Conversely, it is hell when forty pairs of eyes are watching you fumble around at the podium.  If you are presenting in the near future, get a clicker that you can trust.  I have found this Kensington model to be worthy.  You can get this clicker with a <a href="http://www.amazon.com/gp/product/B000FPGP4U?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=B000FPGP4U">laser pointer</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=B000FPGP4U" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> built-in as well, but I prefer the standard model.  Also, put new batteries in every time you give a talk &#8211; it is worth the three dollars.</p>
<p>- Book: PhD Comics, <a href="http://www.amazon.com/gp/product/0972169504?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0972169504">first</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=0972169504" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />, <a href="http://www.amazon.com/gp/product/0972169520?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0972169520">second</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=0972169520" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />, <a href="http://www.amazon.com/gp/product/0972169539?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0972169539">third</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=0972169539" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />, or <a href="http://www.amazon.com/gp/product/0972169547?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0972169547">fourth</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=0972169547" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> releases.  ~$8-$14<br />
Let&#8217;s get something squared away right off the bat: Jorge Cham saves lives.  His creation, <a href="http://www.phdcomics.com/comics/aboutcomics.html">PhD comics</a>, details the everyday insanity that every grad student must deal with.  Take a few minutes and surf over to the <a href="http://www.phdcomics.com/comics.php">website</a> and read a few panels, just to get a feel for it.  If you know anyone who has ever struggled with the soul-crushing madness of grad school then any one of these books will be a cathartic experience.  Also check out the PhD Comics <a href="http://www.phdcomics.com/store/mojostore.php">online store</a>.</p>
<p>- Software: <a href="http://mekentosj.com/papers/">Papers</a>, the personal research library (Mac OS X). ~$42<br />
I have several thousand PDF files on my computer.  Now, suppose I need to find ONE of them.  In the bad old days I would have the PDFs organized by topic in a series of folders on my computer.  To find the right one I would have to remember what topic it might be under, or else face the time-sucking wrath of the Finder&#8217;s search tool.  Now, enter Papers, the iTunes of PDF articles.  It will properly store and organize all your academic PDF files.  Want to see all the articles for a specific author?  Done.  Want to see all articles you have from a specific journal?  Done.  Need to build a list of articles that will be useful for your next paper?  Done and done.  A simple and beautiful program.  Try it out for 30 days and decide if it works for you.  They even give an academic discount!</p>
<p>- Reading Gadget: <a href="http://www.amazon.com/gp/product/B0015TCML0?ie=UTF8&#038;tag=prefrontalorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=B0015TCML0">Amazon Kindle DX Reader</a><img src="http://www.assoc-amazon.com/e/ir?t=prefrontalorg-20&#038;l=as2&#038;o=1&#038;a=B0015TCML0" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" />.  ~$500<br />
When the Kindle first came out I quickly dismissed it as a device with a lot of promise, but limited by various hardware and software shortcomings.  No longer.  With the Kindle DX things start getting really interesting for academics.  The device natively supports the PDF file format, which means that all of the journal articles we have downloaded can be opened.  Further, the screen is large enough to read those articles pretty comfortably.  The Kindle might be a unique solution if you are looking to go all-digital.</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2009/12/holiday-presents-for-a-neurogeek/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Quote of the Week &#8211; Logothetis</title>
		<link>http://prefrontal.org/blog/2009/12/quote-of-the-week-logothetis/</link>
		<comments>http://prefrontal.org/blog/2009/12/quote-of-the-week-logothetis/#comments</comments>
		<pubDate>Wed, 09 Dec 2009 22:03:00 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[MRI]]></category>
		<category><![CDATA[Quotes]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=855</guid>
		<description><![CDATA[“fMRI is a measure of mass action. You almost have to be a professional moron to think you’re saying something profound about the neural mechanisms. You’re nowhere close to explaining what’s happening, but you have a nice framework, an excellent starting point.&#8221; ~ Nikos Logothetis (As seen in Science News)]]></description>
			<content:encoded><![CDATA[<p>“fMRI is a measure of mass action.  You almost have to be a professional moron to think you’re saying something profound about the neural mechanisms. You’re nowhere close to explaining what’s happening, but you have a nice framework, an excellent starting point.&#8221;  ~ <a href="http://www.kyb.mpg.de/~nikos">Nikos Logothetis</a> (As seen in <a href="http://www.sciencenews.org/view/feature/id/50295/title/Trawling_the_brain">Science News</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2009/12/quote-of-the-week-logothetis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Live Sectioning of HM&#8217;s Brain</title>
		<link>http://prefrontal.org/blog/2009/12/live-sectioning-of-hms-brain/</link>
		<comments>http://prefrontal.org/blog/2009/12/live-sectioning-of-hms-brain/#comments</comments>
		<pubDate>Wed, 02 Dec 2009 22:07:32 +0000</pubDate>
		<dc:creator>prefrontal</dc:creator>
				<category><![CDATA[CogNeuro]]></category>
		<category><![CDATA[Psychology]]></category>

		<guid isPermaLink="false">http://prefrontal.org/blog/?p=846</guid>
		<description><![CDATA[The Brain Observatory at UCSD is doing a live feed of the histological sectioning of patient HM&#8217;s brain today. The feed will continue for the next two days while they slice through HM&#8217;s brain by fractions of a millimeter at a time. You can view the feed yourself at the following link: http://thebrainobservatory.ucsd.edu/hm_live.php. The studies [...]]]></description>
			<content:encoded><![CDATA[<p>The Brain Observatory at UCSD is doing a live feed of the histological sectioning of patient HM&#8217;s brain today.  The feed will continue for the next two days while they slice through HM&#8217;s brain by fractions of a millimeter at a time.  You can view the feed yourself at the following link:<br />
<a href="http://thebrainobservatory.ucsd.edu/hm_live.php">http://thebrainobservatory.ucsd.edu/hm_live.php</a>.</p>
<p>The studies done with HM revolutionized our understanding of human memory.  His case remains one of the most important in the history of psychology and cognitive science.  If you aren&#8217;t familiar with patient HM then take a few minutes and go through the Wikipedia article on him:<br />
<a href="http://en.wikipedia.org/wiki/HM_(patient)">http://en.wikipedia.org/wiki/HM_(patient)</a></p>
<p><img src="http://prefrontal.org/blog/wp-content/uploads/2009/12/MicrotomeHM.jpg" alt="MicrotomeHM" title="MicrotomeHM" width="600" height="251" class="alignnone size-full wp-image-847" /></p>
]]></content:encoded>
			<wfw:commentRss>http://prefrontal.org/blog/2009/12/live-sectioning-of-hms-brain/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

