For several years now I have heard fellow scientists worry that the dialogue around open and reproducible science could be used against science – to discredit results that people find inconvenient and even to de-fund science. And this has not just been fretting around the periphery. I have heard these concerns raised by scientists who hold policymaking positions in societies and journals.
A recent article by Ed Yong talks about this concern in the present political climate.
In this environment, many are concerned that attempts to improve science could be judo-flipped into ways of decrying or defunding it. “It’s been on our minds since the first week of November,” says Stuart Buck, Vice President of Research Integrity at the Laura and John Arnold Foundation, which funds attempts to improve reproducibility.
The worry is that policy-makers might ask why so much money should be poured into science if so many studies are weak or wrong? Or why should studies be allowed into the policy-making process if they’re inaccessible to public scrutiny? At a recent conference on reproducibility run by the National Academies of Sciences, clinical epidemiologist Hilda Bastian says that she and other speakers were told to consider these dangers when preparing their talks.
One possible conclusion is that we should slow down science’s movement toward greater openness and reproducibility. As Yong writes, “Everyone I spoke to felt that this is the wrong approach.”
Joe Simmons, Leif Nelson, and Uri Simonsohn have written a 5-years-later retrospective on their “false-positive psychology” paper. It is for an upcoming issue of Perspectives on Psychological Science dedicated to the most-cited articles from APS publications. A preprint is now available.
It’s a short and snappy read with some surprises and gems. For example, footnote 2 notes that the Journal of Consumer Research declined to adopt their disclosure recommendations because they might “dull … some of the joy scholars may find in their craft.” No, really.
For the youngsters out there, they do a good job of capturing in a sentence a common view of what we now call p-hacking: “Everyone knew it was wrong, but they thought it was wrong the way it’s wrong to jaywalk. We decided to write ‘False-Positive Psychology’ when simulations revealed it was wrong the way it’s wrong to rob a bank.”
The retrospective also contains a review of how the paper has been cited in 3 top psychology journals. About half of the citations come from researchers who follow the original paper’s recommendations, though typically only a subset of them. The most common use of the citation is to justify having barely more than 20 subjects per cell, a threshold the authors now describe as “comically low” and on which they now take a more nuanced view.
PSY 607: Everything is Fucked
Prof. Sanjay Srivastava
Class meetings: Mondays 9:00 – 10:50 in 257 Straub
Office hours: Held on Twitter at your convenience (@hardsci)
In a much-discussed article at Slate, social psychologist Michael Inzlicht told a reporter, “Meta-analyses are fucked” (Engber, 2016). What does it mean, in science, for something to be fucked? Fucked needs to mean more than that something is complicated or must be undertaken with thought and care, as that would be trivially true of everything in science. In this class we will go a step further and say that something is fucked if it presents hard conceptual challenges to which implementable, real-world solutions for working scientists are either not available or routinely ignored in practice.
The format of this seminar is as follows: Each week we will read and discuss 1-2 papers that raise the question of whether something is fucked. Our focus will be on things that may be fucked in research methods, scientific practice, and philosophy of science. The potential fuckedness of specific theories, research topics, etc. will not be the focus of this class per se, but rather will be used to illustrate these important topics. To that end, each week a different student will be assigned to find a paper that illustrates the fuckedness (or lack thereof) of that week’s topic, and give a 15-minute presentation about whether it is indeed fucked.
20% Attendance and participation
30% In-class presentation
50% Final exam
If you are an academic and on social media, then over the last weekend your feed was probably full of mentions of an article by economist Justin Wolfers in the New York Times titled “A Family-Friendly Policy That’s Friendliest to Male Professors.”
It describes a study by three economists of the effects of parental tenure extension policies, which give an extra year on the tenure clock when people become new parents. The conclusion is that tenure extension policies do make it easier for men to get tenure, but they unexpectedly make it harder for women. The finding has a counterintuitive flavor – a policy couched in gender-neutral terms and designed to help families actually widens a gender gap.
Except there are a bunch of odd things that start to stick out when you look more closely at the details, and especially at the original study.
Let’s start with the numbers in the NYT writeup:
The policies led to a 19 percentage-point rise in the probability that a male economist would earn tenure at his first job. In contrast, women’s chances of gaining tenure fell by 22 percentage points. Before the arrival of tenure extension, a little less than 30 percent of both women and men at these institutions gained tenure at their first jobs.
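To see what those percentage-point changes imply, here is a quick back-of-the-envelope sketch. It assumes a baseline of exactly 30 percent for both groups, which is a rounding of the “a little less than 30 percent” in the quote, so the outputs are rough illustrations and not figures from the study. The function names are made up for this example.

```python
def apply_point_change(baseline_pct, change_points):
    """New rate in percent after adding a percentage-point change."""
    return baseline_pct + change_points


def relative_change_pct(baseline_pct, change_points):
    """The same change expressed as a percent of the baseline rate."""
    return 100.0 * change_points / baseline_pct


# Assumption: round "a little less than 30 percent" up to 30.
BASELINE_PCT = 30.0

men_after = apply_point_change(BASELINE_PCT, 19.0)
women_after = apply_point_change(BASELINE_PCT, -22.0)

print(f"Men:   {BASELINE_PCT:.0f}% -> {men_after:.0f}% "
      f"({relative_change_pct(BASELINE_PCT, 19.0):+.1f}% relative)")
print(f"Women: {BASELINE_PCT:.0f}% -> {women_after:.0f}% "
      f"({relative_change_pct(BASELINE_PCT, -22.0):+.1f}% relative)")
```

Under that assumed baseline, the implied tenure rates are roughly 49 percent for men and roughly 8 percent for women. A percentage-point change is added to the baseline rate, which is why a 22-point drop from about 30 percent is such a large relative change.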
Over the last five years psychologists have been paying more and more attention to issues that could be diminishing the quality of our published research — things like low power, p-hacking, and publication bias. We know these things can affect reproducibility, but it can be hard to gauge their practical impact. The Reproducibility Project: Psychology (RPP), published last year in Science, was a massive, coordinated effort to produce an estimate of where several of the field’s top journals stood in 2008 before all the attention and concerted improvement began.
The RPP is not perfect, and the paper is refreshingly frank about its limitations and nuanced about its conclusions. But all science proceeds on fallible evidence (there isn’t any other kind), and it has been welcomed by many psychologists as an informative examination of the reliability of our published findings.
Welcomed by many, but not welcomed by all.
In a technical commentary released today in Science, Dan Gilbert, Gary King, Stephen Pettigrew, and Tim Wilson take exception to the conclusions that the RPP authors and many scientists who read it have reached. They offer re-analyses of the RPP, some incorporating outside data. They maintain that the RPP authors’ conclusions are wrong, and on re-examination the data tell us that “the reproducibility of psychological science is quite high.” (The RPP authors published a reply.)
What should we make of it?
Yesterday I put up a post about David Peterson’s The Baby Factory, an ethnography of 3 baby labs based on Peterson’s experience as a participant observer. My post was mostly excerpts, with a short introduction at the beginning and a little discussion at the end. That was mostly to encourage people to go read it. (It’s open-access!)
Today I’d like to say a little more.
How you approach the article probably depends a lot on what background and context you come to it with. It would be a mistake to look to an ethnography for a generalizable estimate of something about a population, in this case about how common various problematic practices are. That’s not what ethnography is for. But at this point in history, we are not lacking for information about the ways we need to improve psychological science. There have been surveys and theoretical analyses and statistical analyses and single-lab replications and coordinated many-lab replications and all the rest. It’s getting harder and harder to claim that the evidence is cherry-picked without seriously considering the possibility that you’re in the middle of a cherry orchard. As Simine put it so well:
I don’t know how else to put it. David Peterson, a sociologist, recently published an ethnographic study of 3 infant cognition labs. Titled “The Baby Factory: Difficult Research Objects, Disciplinary Standards, and the Production of Statistical Significance,” it recounts his time spent as a participant observer in those labs, attending lab meetings and running subjects.
In his own words, Peterson “shows how psychologists produce statistically significant results under challenging circumstances by using strategies that enable them to bridge the distance between an uncontrollable research object and a professional culture that prizes methodological rigor.” The account of how the labs try to “bridge the distance” reveals one problematic practice after another, in a way that sometimes makes them seem like normal practice and no big deal to the people in the labs. Here are a few examples.
Protocol violations that break blinding and independence:
…As a routine part of the experiments, parents are asked to close their eyes to prevent any unconscious influence on their children. Although this was explicitly stated in the instructions given to parents, during the actual experiment, it was often overlooked; the parents’ eyes would remain open. Moreover, on several occasions, experimenters downplayed the importance of having one’s eyes closed. One psychologist told a mother, “During the trial, we ask you to close your eyes. That’s just for the journals so we can say you weren’t directing her attention. But you can peek if you want to.