Category Archives: sometimes i’m wrong

fifty million frenchmen can eat it

the struggle is real.

this blog post is an attempt to lay out my reasoning about why i think it's safe to conclude that p-hacking is a big problem, and false positives are a big problem, as clearly and bluntly as i can.

there have been grumblings online about a new Registered Replication Report (RRR) about to come out showing that the meta-analytic result of 20 pre-registered replications of an ego depletion study is pretty much zero. it might seem like jumping the gun to write a blog post about it before it’s come out.  that's because it is jumping the gun. but i’m doing it anyway, because i think the most important conclusion is not about ego depletion. the most important conclusion is that we need to accept that 50 million frenchmen can be wrong.

throughout the last few years, when i have talked to people,* one of the most strongly and frequently expressed reasons i’ve heard for not panicking is that it seems impossible that p-hacking is so rampant that even a phenomenon shown in 50 or 100 studies (e.g., ego depletion) could be a false positive. if a paradigm has been used over and over again, and dozens of papers have shown the effect, then it can’t all be a house of cards. Continue reading
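to make that intuition concrete, here is a minimal simulation sketch (mine, not from any of the papers or posts discussed here) of how a literature can fill up with significant findings for an effect that is exactly zero. it assumes, hypothetically, that each study gets a few analytic shots at significance and that only the significant version gets written up.

```python
# sketch: a true-null effect still produces a pile of publishable "positives"
# when each study has several chances at p < .05 and only hits get published.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_labs = 1000        # hypothetical number of attempts across the field
n_per_group = 40     # a typical small-study cell size
n_outcomes = 4       # flexible analysis: several tries per study
                     # (simulated here as independent draws, for simplicity)

published = 0
for _ in range(n_labs):
    for _ in range(n_outcomes):
        control = rng.normal(0, 1, n_per_group)    # the true effect is exactly zero
        treatment = rng.normal(0, 1, n_per_group)
        _, p = stats.ttest_ind(treatment, control)
        if p < .05:
            published += 1   # only the "significant" version gets written up
            break

print(f"published 'positive' studies of a null effect: {published} / {n_labs}")
# with four shots per study, roughly 1 - .95**4 (about 19%) of attempts succeed,
# so a thousand attempts yield on the order of 150-200 publishable false positives.
```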

give the gift of data


beautiful things happen when people make their data publicly available. from 1895 to 1903, one anonymous man kept track of his nocturnal emissions every day.*  in 1904, he published an analysis of seasonal effects in his eight years of data. because few statistical techniques had been invented for analyzing these types of data, the author based his conclusions mainly on a visual examination of the data (see top panel of the figure below). he concluded that his nocturnal emissions were higher in the spring and summer than in fall and winter. the author also did something else impressive: he shared the raw data with the world (in numerical form, not the actual biological specimens).

in 2012, widaman and helm** decided to reanalyze the data with six different modern quantitative techniques.  i won't get into the details of their analyses,*** but even just rescaling the y-axis and adding 95% confidence intervals to the monthly averages (bottom panel below****) shows just how weak the evidence was for the author's conclusion that there were seasonal patterns in the data.  indeed, widaman and helm's re-analyses did not show much evidence of any monthly or seasonal patterns at all.

[Figure 3 from Widaman & Helm (2012)]

the authors describe their results as anticlimactic,***** because their main conclusion is a null effect for seasonality. Continue reading
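to make the re-analysis idea concrete, here is a minimal sketch of computing monthly means with 95% confidence intervals. the counts below are simulated placeholders, not the anonymous author's (or widaman and helm's) actual data, and the code is mine, not theirs.

```python
# sketch: monthly means with 95% confidence intervals, on simulated data
# with no real seasonal effect.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# eight years x twelve months of simulated monthly counts
counts = rng.poisson(lam=6, size=(8, 12))

for month in range(12):
    x = counts[:, month]
    mean = x.mean()
    sem = x.std(ddof=1) / np.sqrt(len(x))
    half_width = stats.t.ppf(0.975, df=len(x) - 1) * sem   # 95% CI half-width
    print(f"month {month + 1:2d}: mean = {mean:4.1f}, "
          f"95% CI = [{mean - half_width:4.1f}, {mean + half_width:4.1f}]")
# plotted on a y-axis that starts at zero, the overlapping intervals make it
# obvious how little evidence there is for month-to-month differences.
```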

most damning result


reviewer 2 is not buying it.

i've had a blog post stuck in my head* for a few months now, and the new post on datacolada.org is finally spurring me to write it. i'll admit it.  i'm reviewer 2.  i'm the one who doesn't quite believe that you're telling the whole story, or that things are as simple and neat as you make them out to be.  that's because we've all been taught to pretty things up (pp. 8-10).  it sucks for authors that reviewers are super skeptical, but it's the price we are now paying for all those years of following bem's advice.  there are going to be more and more reviewer 2s out there.  i'm pretty sure most people have already run into the skeptical reviewer who suspects you had other measures, other data, or that you didn't plan those covariates a priori.  if you haven't yet, it's just a matter of time. Continue reading

Guest Post: A Tale of Two Papers

A Tale of Two Papers

By Michael Inzlicht

Change is afoot in psychology. After years of bickering on social media and handwringing about whether our field is or is not in serious trouble, some consensus is emerging. Although we might not agree on the severity of our problems, almost no one doubts that our field needs improvement. And we’re now seeing the field take real steps toward that, with new editors stepping in with mandates for genuine and powerful change. As an Associate Editor at a journal I’m very proud of, I have lived through some of this change. While the standards at the Journal of Experimental Psychology: General have always been high, the standards are more stringent now than when I began. Some people interpret this change as a turn toward conservatism, toward valuing safe over creative work. While I appreciate this perspective, I disagree. Instead, I see this as a turn toward transparency, as a turn toward robustness. I still value creative research, and I demand it of myself and of the authors with whom I work, but now I also value transparency, which allows for robustness. Continue reading

super power

lazy dog.
hi there.  i'm here to lecture you about power again.  it's what i do for fun.
collecting data is hard.  large samples take time and resources.  i am sympathetic to the view that it's sometimes ok to have small samples.
but if you're doing a typical social/personality lab experiment, or a correlational study using questionnaire measures, then it's probably not ok.  for those types of studies, adequate power should be a basic requirement for publishing your work in a good journal.*
when i hear people push back against the call for larger samples because they are sticking up for people who use hard-to-collect data, i scratch my head.  those people are exactly why i think we absolutely need to increase the sample size of typical social/personality studies.  if some of our colleagues are busting their asses measuring cortisol four times a day for weeks, or coding couples' behavior as they discuss marital problems, and even they can get samples of 100 or 200, then the least the rest of us can do is get a couple hundred undergrads to come to our labs for an hour.
when i see a simple lab/questionnaire study that has a smaller sample size than many super-hard-to-collect studies, it makes me sad.
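to put rough numbers on "a couple hundred undergrads," here is a small simulation sketch (my own illustrative numbers, not anyone's actual study) estimating power for a simple two-group comparison with an assumed effect size of d = 0.4 at a few sample sizes.

```python
# sketch: simulated power of a two-sample t-test at d = 0.4
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
d = 0.4          # assumed standardized effect size
n_sims = 5000

for n_per_group in (20, 50, 100, 150):
    hits = 0
    for _ in range(n_sims):
        control = rng.normal(0, 1, n_per_group)
        treatment = rng.normal(d, 1, n_per_group)
        _, p = stats.ttest_ind(treatment, control)
        hits += p < .05
    print(f"n = {n_per_group:3d} per group: estimated power ≈ {hits / n_sims:.2f}")
# roughly .25, .51, .81, and .93: you need around 200 participants total
# to reach 80% power for a two-group comparison at d = 0.4.
```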
today, i want to introduce you to two of my favorite researchers who do some of the hardest research i know of.  i picked them because they study super important questions with incredibly rigorous methods. Continue reading

Guest Post by Laura Scherer

the post below was written by laura scherer after a brief interaction we had on the ISCON facebook page and a few facebook messages back and forth.  i think this is a great example of the kind of thoughtful contribution we could be seeing more of if we could find a way to have productive and pleasant discussions online.  i realize pleasantness is not the most important factor in intellectual discussions, but the problem with unpleasantness is that it drives people away,* and then we miss out on some potentially fruitful discussions.  i don't know what the solution is,** but here's some food for thought.

-simine

* also there are other problems with unpleasantness.

** blogs, obviously.

 
-----------

Much is being said about the Reproducibility Project’s failure to replicate the majority of 100 studies. Judging from the ensuing debate, there appears to be disagreement about virtually every aspect of the project, from whether it was properly designed and conducted to whether it has any implications for our science at all. The debate that has emerged has been heated, especially in online forums. Continue reading

does psychology have a public relations problem?*

some people worry that having a loud and public debate about the reproducibility of psychology findings may be detrimental to our public image. in this blog post, i make the bold argument that not only is this unlikely to happen, but that if we have a public relations problem, it's the opposite one: we sometimes come across as too naive, not skeptical enough of our own preliminary results.
why do i believe the replicability discussion is not going to cause harm to our reputation?
i'm not generally known for my deep respect for the average person, but i do think people understand the basic concept of science - that we are getting closer and closer to the truth, but that all current knowledge is incomplete and subject to revision. in his essay The Relativity of Wrong, Asimov makes the point that science is all about becoming less and less wrong. undergoing the kind of critical self-examination psychology is currently in the midst of is a normal part of science.
[for a fascinating example, see kuo's march 2014 discovery, at a 'five sigma' level of confidence (i.e., p < .0000003), that the universe expanded rapidly immediately after the big bang ('chaotic inflation theory').  when the discovery was made, the guy who had come up with the theory thirty years before (andrei linde) said "If this is true, ..." which seemed super modest to me at the time.  fast forward to january 2015, and it turns out the evidence they thought they had was wrong. Continue reading

submit your papers to SPPS!


remember how i like to tell journal editors how to do their job?  well..... i am super excited to be starting my stint as editor in chief of Social Psychological and Personality Science, a journal that belongs to a consortium of four organizations (SPSP, SESP, EASP, and ARP), is co-sponsored by two more (AASP and SASP), and is published by SAGE. as always, my blog posts reflect my own views and not those of SPPS, SAGE, or any other organization. i'm excited to take on this role for several reasons.  first, Allen McConnell, and before him Vincent Yzerbyt, built up this new journal into one of the top outlets for short reports in social/personality psychology.  the journal now receives almost 600 submissions in a typical year, publishes eight issues per year, has a circulation of over 7,700, and has its first impact factor (2.56*). not bad for a six-year-old journal. second, i kind of love this journal. Continue reading

why p = .048 should be rare (and why this feels counterintuitive)

sometimes i read a paper with three studies, and the key results have p-values of .04, .03 and .045.  and i feel like a jerk for not believing the results.  sometimes i am skeptical even when i see a single p-value of .04.*  is that fair? mickey inzlicht asked a similar question on twitter a few weeks ago, and daniel lakens wrote an excellent blog post in response.  i just sat on my ass.  this is the highly-non-technical result of all that ass-sitting. we are used to thinking about the null distribution.  and for good reason. Continue reading
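here is a minimal simulation sketch of that intuition (mine, not from mickey's question or daniel's post): under the null, p-values are uniform, so a p just under .05 is no more likely than any other value; under a real effect studied with decent power, p-values pile up far below .01, and a p of .048 becomes genuinely rare.

```python
# sketch: where p-values land under the null vs. under a true effect
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_sims, n_per_group = 10000, 100

def p_values(effect_size):
    ps = []
    for _ in range(n_sims):
        control = rng.normal(0, 1, n_per_group)
        treatment = rng.normal(effect_size, 1, n_per_group)
        ps.append(stats.ttest_ind(treatment, control)[1])
    return np.array(ps)

for label, d in (("null, d = 0.0", 0.0), ("true effect, d = 0.5", 0.5)):
    ps = p_values(d)
    print(f"{label}: P(.03 < p < .05) ≈ {np.mean((ps > .03) & (ps < .05)):.3f}, "
          f"P(p < .01) ≈ {np.mean(ps < .01):.3f}")
# under the null, p is uniform: about 2% of p-values land between .03 and .05
# and about 1% land below .01.  under d = 0.5 with n = 100 per group
# (power around .94), p < .01 happens roughly 83% of the time, while
# .03 < p < .05 happens only a few percent of the time.
```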

Guest Post: Excuses for Data Peeking

Excuses for Data Peeking

Guest Post by Don Moore and Liz Tenney

Good research practice says you pre-specify your sample size and you wait until the data are in before you analyze them.  Is it ever okay to peek early?  This question stimulated one of the more interesting discussions that we (Liz Tenney and Don Moore) had in our lab group at Berkeley.  Here’s the way the debate unfolded.

Don staked out what he thought was the methodological high ground and argued that we shouldn’t peek.  He pointed out the perils of peeking: If the early results look encouraging, you might be tempted to stop early and declare victory.  We all know why this is a mortal sin: selecting your sample size conditional on obtaining the hypothesized result increases the chance of a false positive result.  But if you don’t stop and the effect weakens in the full sample, you will be haunted by the counterfactual that you might have stopped.  The part of you that was tempted to use only part of the sample won’t be able to help wondering about some hidden moderator.  Maybe the students were more stressed out and distracted as we approached the end of the semester?  Maybe the waning sunlight affected participants’ circadian rhythms? Continue reading
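A minimal simulation sketch of the peril (an illustration added here with hypothetical numbers, not an analysis from the lab discussion): a researcher checks the data after every batch of participants and stops as soon as p < .05, even though the true effect is exactly zero.

```python
# sketch: optional stopping inflates the false positive rate
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n_sims = 5000
batch, max_n = 10, 100       # peek after every 10 participants per group, up to 100

false_positives = 0
for _ in range(n_sims):
    control, treatment = [], []
    for _ in range(max_n // batch):
        control.extend(rng.normal(0, 1, batch))      # the null is true: no effect at all
        treatment.extend(rng.normal(0, 1, batch))
        _, p = stats.ttest_ind(treatment, control)
        if p < .05:                                   # stop early and "declare victory"
            false_positives += 1
            break

print(f"false positive rate with optional stopping: {false_positives / n_sims:.2f}")
# the nominal rate is .05, but peeking ten times pushes it to roughly .15-.20,
# and the more often you peek, the worse it gets.
```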