Editorial
My focus is on the second paper, because (1) I can't say anything about 2-arachidonoylglycerol, and (2) getting a contaminated drug from a vendor is not the authors' fault.
Nature Neuroscience - 10, 1 (2007)
Setting the record straight
The discovery of serious errors in two recent papers in the journal leads to lessons for authors, referees and editors.
. . .
The first correction involves a Brief Communication (Makara et al., 2005) reporting that inhibition of the enzyme that breaks down the endocannabinoid 2-arachidonoylglycerol enhances retrograde signaling in the hippocampus. The authors concluded that 2-arachidonoylglycerol is important for synaptic plasticity and that the enzyme is a possible drug target, in part because one of the putative inhibitors tested appeared to be specific for the enzyme. They subsequently discovered that the commercial preparation of this drug was contaminated. When the contaminant was eliminated, the effect disappeared. . . .
The second correction is more complex. The original article (Grill-Spector et al., 2006) reported high-resolution fMRI measurements in the fusiform face area (FFA), a region of the visual cortex that responds more to faces than to other visual stimuli. The authors drew two conclusions: that the FFA is heterogeneous, in that the degree of selectivity varies over the region, and—more remarkably—that the FFA contains some voxels that are highly selective for object categories other than faces. After the paper was published, two groups wrote to point out flaws in the analysis. One letter (Simmons et al., 2007) noted that the authors used a formula for selectivity that erroneously assigns high selectivity values to voxels with negative responses to nonpreferred categories, causing a substantial overestimate in selectivity for all object categories.
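To see how a selectivity formula can misbehave in the way Simmons et al. describe, here is a minimal sketch. The editorial does not give the formula used in the paper, so the contrast-style index below, (P - N) / (P + N), and the function name `contrast_index` are assumptions chosen purely for illustration. The response values are made up, not taken from the study.

```python
def contrast_index(preferred, nonpreferred):
    """Hypothetical selectivity index (an assumption, not the paper's
    confirmed formula): (P - N) / (P + N)."""
    return (preferred - nonpreferred) / (preferred + nonpreferred)


# When both responses are positive, the index stays between -1 and 1.
weak = contrast_index(1.0, 0.8)     # small difference, low selectivity
strong = contrast_index(1.0, 0.1)   # large difference, high selectivity

# A negative response to the nonpreferred category shrinks the denominator
# and enlarges the numerator, so the index leaves the usual range.
inflated = contrast_index(1.0, -0.5)

print(f"weak={weak:.3f} strong={strong:.3f} inflated={inflated:.3f}")
```

Under this assumed index, a voxel with a modest preferred response and a slightly negative nonpreferred response scores as more "selective" than any voxel with two positive responses could.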
Another group (Baker et al., 2007) spotted a more subtle flaw: the analysis used to demonstrate selectivity for particular categories did not distinguish between random variation and replicable effects reflecting neural tuning. Random variation can cause some voxels to respond more to some categories than to others. To demonstrate that such differences reflect neural selectivity requires an appropriate statistical analysis, for instance cross-validation across independent datasets. The original paper seemed to report the results of such an analysis—that voxel selectivity was highly correlated between even and odd scans. However, communication with the authors revealed that this analysis had excluded voxels whose responses were negatively correlated across the two sets of scans, a detail that was omitted from the paper. This restriction could falsely increase consistency across scans. Indeed, when the authors redid their analysis without it, the selectivity for nonface objects was not replicated from one set of scans to the next.
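The effect of that exclusion is easy to reproduce with pure noise. The sketch below is my own illustration, not the authors' analysis: it assumes each simulated voxel has a response to four categories in the "even" scans and four in the "odd" scans, all drawn independently from a standard normal distribution, so there is no real selectivity anywhere. The voxel count, category count, and function names are arbitrary choices.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_voxels=2000, n_categories=4, seed=0):
    """Even/odd correlation of category responses for noise-only voxels."""
    rng = random.Random(seed)
    correlations = []
    for _ in range(n_voxels):
        even = [rng.gauss(0, 1) for _ in range(n_categories)]
        odd = [rng.gauss(0, 1) for _ in range(n_categories)]
        correlations.append(pearson(even, odd))
    return correlations


correlations = simulate()
mean_all = sum(correlations) / len(correlations)

# The restriction described above: drop voxels with negative correlation.
kept = [r for r in correlations if r > 0]
mean_kept = sum(kept) / len(kept)

print(f"all voxels: {mean_all:.2f}, after exclusion: {mean_kept:.2f}")
```

With every voxel included, the average even/odd correlation hovers near zero, as it should for noise. After dropping the negatively correlated voxels, the average is clearly positive even though nothing in the data is replicable.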
The authors of the article acknowledge both errors in their correction (Grill-Spector et al., 2007). When these errors are fixed, the most interesting conclusion of the paper—that the FFA contains voxels highly selective for nonface objects—is no longer supported.
However, the editors refuse to retract the paper:
In both cases, after considerable discussion with colleagues, we have decided to publish a correction to the original paper rather than a retraction, even though it seems likely that neither paper would have been published in Nature Neuroscience had the errors been identified and corrected during the review process. Retractions were deemed inappropriate because they would have removed from the record some valid data and conclusions that are likely to be useful to specialists in the field, and it seemed unlikely that the authors would be able to publish these data elsewhere.
Well, boo hoo!! The authors can't publish their compromised data anywhere except in Nature Neuroscience. Let's all send our unpublishable data to Nature Neuroscience! We can also appeal to the archenemy journal, Science, to publish results that contradict articles in the Nature family of journals...
...the ultimate responsibility for recruiting referees with appropriate expertise lies with the editors, and in this case we clearly should have consulted referees with stronger mathematical expertise.
How many other prominent articles are of dubious accuracy? The saga continues...
Nonetheless, it is common practice in functional imaging (and indeed in other areas of neuroscience) to analyze experiments by selecting data according to some criteria and then plotting the average response, without testing an independent data set to ensure that the selection criteria have not merely picked out random variation in a particular direction.
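The practice the editorial describes (select by a criterion, then plot the average of the same data) can be illustrated with a few lines of simulation. This is a sketch under my own assumptions: 5000 hypothetical voxels whose responses in two halves of the data are independent standard normal noise, with the top 10% selected by their response in the first half.

```python
import random

rng = random.Random(1)
n_voxels = 5000

# Two independent halves of the data; every value is pure noise.
half_a = [rng.gauss(0, 1) for _ in range(n_voxels)]
half_b = [rng.gauss(0, 1) for _ in range(n_voxels)]

# Selection criterion: the 10% of voxels with the largest response in half A.
ranked = sorted(range(n_voxels), key=lambda i: half_a[i], reverse=True)
selected = ranked[: n_voxels // 10]

# Averaging the same data used for selection looks like a strong effect.
mean_same = sum(half_a[i] for i in selected) / len(selected)

# Averaging the independent half for the same voxels shows nothing.
mean_independent = sum(half_b[i] for i in selected) / len(selected)

print(f"same data: {mean_same:.2f}, independent data: {mean_independent:.2f}")
```

The selected voxels show a large average response in the data that chose them and essentially none in the held-out half, which is the check the editorial says is often skipped.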