Category Archives: Sherman’s Head

Adventures in Cheater Detection – Ryne Sherman (Sherman's Head)

Some months ago I came across this blog post by Jonathan Dushoff discussing some statistical procedures for detecting cheating (e.g., copying, working together) on multiple choice exams. The method is quite elegant and the post is fairly short so you may wish to read that post. After reading about this method, I considered applying it to my old exams. In my undergraduate courses I typically give 3 midterms (50 questions each; with either 3 or 4 response options) and 1 final exam (100 questions). The exams are done using scantrons, meaning that students indicate their answers on a standardized form which is then read by a computer to detect their responses. Their responses are then compared to an answer key, scoring their exams. The scantron folks here at FAU provide the professors with a .xlsx file containing the responses the computer recorded for each student on each question. With such a file in hand, it is fairly easy to apply the code provided by Jonathan with a bit of extra work (e.g., inputting the exam key). Despite the relative ease with which this could be done, I really wasn’t that motivated to do this sort of work. That all changed a few weeks ago. Continue reading

What Are Situations? – Ryne Sherman (Sherman's Head)

Reblogged from: What were you doing yesterday at 10am? 2pm? 8pm? Why were you doing those things? A moment’s reflection on our day’s activities makes it obvious that situations impact our behavior. But what are situations actually? I’ve spent the past 9 years doing research aimed at answering this very question. This post reflects what we currently know about situations. One obvious definition of a situation is that it constitutes everything that is outside the person. That is, a person is—psychologically speaking—made up of goals, motives, values, interests, skills, abilities, etc., and situations are everything else, including other people. Continue reading

Can a Computer Judge Your Personality Better than Your Friends? – Ryne Sherman (Sherman's Head)

Yesterday, as I was standing in line in my campus bookstore, I heard someone on the radio talk about a new study published in the Proceedings of the National Academy of Sciences (PNAS) showing that a computer algorithm, relying only on the things you “Like” on Facebook, makes more accurate judgments of your personality than your friends. If you also heard about this study, you probably did not react the way I did yesterday. Having been a reviewer on this study, I had already read the paper. So my reaction was, “Yeah, the study did show that, but it isn’t as simple as this report makes it sound.” So what does the study show? I personally was intrigued by three things. 1) Clearly there is a sexy news story in saying that computers make better judgments than humans. And that is precisely how this study has been discussed so far.[1] However, the data show that self-other agreement with human judges was about r = .49 (across all Big 5 traits) while self-other agreement with computer-based judgments was about r = .56. Yes, these differences are statistically significant and NO we shouldn’t care that they are statistically significant. What these effectively mean is that if you judge yourself to be above average (median) on a trait, your friends are likely to guess that you are above average 74. Continue reading

Developing Short Measures of Broad Constructs: Issues of Reliability and Validity – Ryne Sherman (Sherman's Head)

Consider the following problem. I have a 4-item measure of a psychological construct. Let’s call it Extraversion. Here are the four items:
  • I like to go to parties
  • I am a talkative person
  • I see myself as a good leader
  • I like to take charge
It might be obvious to some, but the first two items and the last two items are more related to each other than the other combinations of items. In fact, we could say the first two items measure the “Sociability” aspect of Extraversion while the last two items measure the “Assertiveness” aspect of Extraversion. Now let’s say I am in a real bind because, although I love my 4-item measure of Extraversion, in my next study I only have time for a 2-item measure. Which two items should I choose? Let’s further say that I have collected a lot of data using our 4-item measure and know that the correlation matrix among the items looks like this: Continue reading

(Mis)Interpreting Confidence Intervals – Ryne Sherman (Sherman's Head)

In a recent paper Hoekstra, Morey, Rouder, & Wagenmakers argued that confidence intervals are just as prone to misinterpretation as tradiational p-values (for a nice summary, see this blog post). They draw this conclusion based on responses to six questions from 442 bachelor students, 34 master students, and 120 researchers (PhD students and faculty). The six questions were of True / False format and are shown here (this is taken directly from their Appendix, please don’t sue me; if I am breaking the law I will remove this without hesitation): Bubledorf Hoekstra et al. note that all six statements are false and therefore the correct response to mark each as False. [1, 2] The results were quite disturbing. The average number of statements marked True, across all three groups, was 3.51 (58.5%). Particularly disturbing is the fact that statement #3 was endorsed by 73%, 68%, and 86% of bachelor students, master students, and researchers respectively. Such a finding demonstrates that people often use confidence intervals simply to revert back to NHST (i.e., if the CI does not contain zero, reject the null). Continue reading

When Are Direct Replications Necessary? – Ryne Sherman (Sherman's Head)

We are told that replication is the heart of all sciences. As such, psychology has recently seen numerous calls for direct replication. Sanjay Srivastava says that replication provides an opportunity to falsify an idea (an important concept in science, but rarely done in psychology). Brian Nosek and Jeffrey Spies suggest that replication would help identify “manufactured effects” rapidly. And Brent Roberts proposed a three step process, the last of which is a direct replication of any unique study reported in the package of studies. Not everyone thinks that direct replications are useful though. Andrew Wilson has argued that replication will not save psychology and better theories are needed. Jason Mitchell has gone so far as to say that failed replications offer nothing to science as they are largely the result of practical mistakes on the part of the experimenters. So are direct replications necessary? My answer is a definitive: sometimes. Let’s start by considering what I gather to be some of the main arguments for direct replications.

phack: An R Function for Examining the Effects of p-hacking – Ryne Sherman (Sherman's Head)

Imagine you have a two group between-S study with N=30 in each group. You compute a two-sample t-test and the result is p = .09, not statistically significant with an effect size r = .17. Unbeknownst to you there is really no relationship between the IV and the DV. But, because you believe there is a relationship (you decided to run the study after all!), you think maybe adding five more subjects to each condition will help clarify things. So now you have N=35 in each group and you compute your t-test again. Now p = .04 with r = .21. If you are reading this blog you might recognize what happened here as an instance of p-hacking. This particular form (testing periodically as you increase N) of p-hacking was one of the many data analytic flexibility issues exposed by Simmons, Nelson, and Simonshon (2011). But what are the real consequences of p-hacking? Continue reading