PSY 607: Everything is Fucked
Prof. Sanjay Srivastava
Class meetings: Mondays 9:00 – 10:50 in 257 Straub
Office hours: Held on Twitter at your convenience (@hardsci)
In a much-discussed article at Slate, social psychologist Michael Inzlicht told a reporter, “Meta-analyses are fucked” (Engber, 2016). What does it mean, in science, for something to be fucked? Fucked needs to mean more than that something is complicated or must be undertaken with thought and care, as that would be trivially true of everything in science. In this class we will go a step further and say that something is fucked if it presents hard conceptual challenges to which implementable, real-world solutions for working scientists are either not available or routinely ignored in practice.
The format of this seminar is as follows: Each week we will read and discuss 1-2 papers that raise the question of whether something is fucked. Our focus will be on things that may be fucked in research methods, scientific practice, and philosophy of science. The potential fuckedness of specific theories, research topics, etc. will not be the focus of this class per se, but rather will be used to illustrate these important topics. To that end, each week a different student will be assigned to find a paper that illustrates the fuckedness (or lack thereof) of that week’s topic, and give a 15-minute presentation about whether it is indeed fucked.
20% Attendance and participation
30% In-class presentation
50% Final exam
If you are an academic and on social media, then over the last weekend your feed was probably full of mentions of an article by economist Justin Wolfers in the New York Times titled “A Family-Friendly Policy That’s Friendliest to Male Professors.”
It describes a study by three economists of the effects of parental tenure extension policies, which give an extra year on the tenure clock when people become new parents. The conclusion is that tenure extension policies do make it easier for men to get tenure, but they unexpectedly make it harder for women. The finding has a counterintuitive flavor – a policy couched in gender-neutral terms and designed to help families actually widens a gender gap.
Except there are a bunch of odd things that start to stick out when you look more closely at the details, and especially at the original study.
Let’s start with the numbers in the NYT writeup:
The policies led to a 19 percentage-point rise in the probability that a male economist would earn tenure at his first job. In contrast, women’s chances of gaining tenure fell by 22 percentage points. Before the arrival of tenure extension, a little less than 30 percent of both women and men at these institutions gained tenure at their first jobs.
Over the last five years psychologists have been paying more and more attention to issues that could be diminishing the quality of our published research — things like low power, p-hacking, and publication bias. We know these things can affect reproducibility, but it can be hard to gauge their practical impact. The Reproducibility Project: Psychology (RPP), published last year in Science, was a massive, coordinated effort to produce an estimate of where several of the field’s top journals stood in 2008 before all the attention and concerted improvement began.
The RPP is not perfect, and the paper is refreshingly frank about its limitations and nuanced about its conclusions. But all science proceeds on fallible evidence (there isn’t any other kind), and it has been welcomed by many psychologists as an informative examination of the reliability of our published findings.
Welcomed by many, but not welcomed by all.
In a technical commentary released today in Science, Dan Gilbert, Gary King, Stephen Pettigrew, and Tim Wilson take exception to the conclusions that the RPP authors and many scientists who read it have reached. They offer re-analyses of the RPP, some incorporating outside data. They maintain that the RPP authors’ conclusions are wrong, and on re-examination the data tell us that “the reproducibility of psychological science is quite high.” (The RPP authors published a reply.)
What should we make of it? Continue reading →
Yesterday I put up a post about David Peterson’s ethnography The Baby Factory, an ethnography of 3 baby labs that discusses Peterson’s experience as a participant observer. My post was mostly excerpts, with a short introduction at the beginning and a little discussion at the end. That was mostly to encourage people to go read it. (It’s open-access!)
Today I’d like to say a little more.
How you approach the article probably depends a lot on what background and context you come to it with. It would be a mistake to look to an ethnography for a generalizable estimate of something about a population, in this case about how common various problematic practices are. That’s not what ethnography is for. But at this point in history, we are not lacking for information about the ways we need to improve psychological science. There have been surveys and theoretical analyses and statistical analyses and single-lab replications and coordinated many-lab replications and all the rest. It’s getting harder and harder to claim that the evidence is cherry-picked without seriously considering the possibility that you’re in the middle of a cherry orchard. As Simine put it so well:
Continue reading →
I don’t know how else to put it. David Peterson, a sociologist, recently published an ethnographic study of 3 infant cognition labs. Titled “The Baby Factory: Difficult Research Objects, Disciplinary Standards, and the Production of Statistical Significance,” it recounts his time spend as a participant observer in those labs, attending lab meetings and running subjects.
In his own words, Peterson “shows how psychologists produce statistically significant results under challenging circumstances by using strategies that enable them to bridge the distance between an uncontrollable research object and a professional culture that prizes methodological rigor.” The account of how the labs try to “bridge the distance” reveals one problematic practice after another, in a way that sometimes makes them seem like normal practice and no big deal to the people in the labs. Here are a few examples.
Protocol violations that break blinding and independence:
…As a routine part of the experiments, parents are asked to close their eyes to prevent any unconscious influence on their children. Although this was explicitly stated in the instructions given to parents, during the actual experiment, it was often overlooked; the parents’ eyes would remain open. Moreover, on several occasions, experimenters downplayed the importance of having one’s eyes closed. One psychologist told a mother, “During the trial, we ask you to close your eyes. That’s just for the journals so we can say you weren’t directing her attention. But you can peek if you want to. Continue reading →
There are 3 ways to approach the replicability discussion/debate in science.
#1 is as a logic problem. There are correct answers, and the challenge is to work them out. The goal is to be right.
#2 is as a culture war. There are different sides with different motives, values, or ideologies. Some are better than others. So the goal is win out over the other side.
#3 is as a social movement. Scientific progress is a shared value. Recently accumulated knowledge and technology have given us better ways to achieve it, but institutions and practices are slow to change. So the goal is to get everybody on board to make things better. Continue reading →
Style manuals sound like they ought to be boring things, full of arcane points about commas and whatnot. But Wikipedia’s style manual has an interesting admonition: Be bold. The idea is that if you see something that could be improved, you should dive in and start making it better. Don’t wait until you are ready to be comprehensive, don’t fret about getting every detail perfect. That’s the path to paralysis. Wikipedia is an ongoing work in progress, your changes won’t be the last word but you can make things better.
In a new editorial at Psychological Science, interim editor Stephen Lindsay is clearly following the be bold philosophy. He lays out a clear and progressive set of principles for evaluating research. Beware the “troubling trio” of low power, surprising results, and just-barely-significant results. Look for signs of p-hacking. Care about power and precision. Don’t confuse nonsignificant for null.
To people who have been paying attention to the science reform discussion of the last few years (and its longstanding precursors), none of this is new. Continue reading →
This site aggregates blogs and popular press articles about personality psychology. If you are an ARP member who writes a blog, or whose research has been featured in a recent popular press article, email us at firstname.lastname@example.org to have your work added to the meta-blog.