Wednesday, 31 July 2013

My (belated) thoughts on pre-registration



*** Warning: Newbie Blogger ***

A lot has been written about the new pre-registration format in Cortex. Although discussion of the topic has been drip-fed over the few months since its inception [1], a recent article in the Times Higher Education [2] voicing concerns over the format caused a bit of a twitter storm (at least in my feed, which is largely populated by science folk). Now that this controversy has calmed down relative to a few days ago, I thought I would chip in with my two cents. I feel this exercise is largely for my benefit, so I can formulate my own thoughts on the matter. I hope it is also useful to others. I am mindful of a recent article by Charlie Brooker on our internet word emissions [3], but given the number of words he has constructively vomited onto the internet over the years I think I can indulge just this once.

The problem

The current state of affairs in psychology is far from perfect and the issues have been well documented, from over-inflated false positive rates due to ‘p-hacking’ [4] to underpowered studies [5]. In my office we often discuss the ‘post-hoc’ nature of hypotheses in fMRI papers: writing a manuscript as if you designed the study to answer the question you have unexpectedly answered, as opposed to the question you wanted to ask originally. I don’t intend to deliberate on these issues. The fact that they are issues is relatively uncontroversial. The debate largely revolves around how much of a problem these issues actually are (e.g. how prevalent they are). Some would argue that the flexibility of the current system is what makes our field so dynamic and creative. Impose undue restrictions, such as pre-registration, and we would be conducting experiments with one hand behind our back. My personal take is that these issues are important and need to be addressed. Whether pre-registration is the cure for what ails research is open to debate, but I firmly believe more needs to be done in order to promote replicability across the field.

Whether pre-registration is the cure is an empirical question

I saw this point originally made by Rolf Zwaan on twitter: “Is pre-reg better? It's an empirical question so no wholesale adoption: we need a control group”. This is an important point. At present we know there is “a problem” (with the obvious caveat that the extent of the problem is a matter of debate). Although we can debate ad nauseam about whether pre-registration is the solution to the problem, it is clearly amenable to empirical testing. We can ask questions such as, are pre-registered studies more likely to replicate than studies that were not pre-registered? It seems to me there is no a priori reason to believe that pre-registration will not work, so why not wait and review the data when we have it? Of course, this means we shouldn’t adopt the pre-registration model wholesale, but to my knowledge no-one was proposing this in the first place. In sum, why not accept its introduction, wait for the data to come in, and then judge?

‘On average’ versus ‘paper-by-paper’

Fair enough, you might say, but Chris Chambers, the scientist behind the introduction of pre-registration in Cortex, has argued that pre-registered studies have “a substantially higher truth value than regular studies” [1]. This is certainly a provocative statement, and I am uncomfortable with the use of the word ‘truth’, as it is asking for philosophical types to hijack the debate and start arguing about whether science can ever reveal ‘truth’. Instead let us discuss ‘reliability’, or even better ‘replicability’. Replicability is perhaps a more useful word as it is something we can measure (see above) and measurement is inherently ‘good’. So does pre-registration increase replicability? The short answer is: probably. If issues such as p-hacking are real concerns, pre-registration should at least decrease the probability that a paper is p-hacked, decreasing false positive rates. This should mean that pre-registered studies are ‘on average’ more likely to be replicable than non-preregistered studies.
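To make the false-positive reasoning concrete, here is a minimal simulation sketch in Python. It is an illustration under assumptions of my own choosing, not an analysis from any of the cited papers: every simulated study has no real effect, each study measures five independent outcomes, a ‘pre-registered’ analyst tests only the first outcome, and a ‘flexible’ analyst reports a finding if any of the five reaches p < .05. The function names and the numbers (five outcomes, 20,000 studies) are hypothetical.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_studies=20000, n_outcomes=5, alpha=0.05, seed=1):
    """Return (pre-registered, flexible) false positive rates under the null.

    Assumption: the null hypothesis is true for every outcome, so each test
    statistic can be drawn directly from a standard normal distribution.
    """
    rng = random.Random(seed)
    prereg_hits = 0
    flexible_hits = 0
    for _ in range(n_studies):
        p_values = [two_sided_p(rng.gauss(0, 1)) for _ in range(n_outcomes)]
        # Pre-registered: only the outcome named in advance (the first) counts.
        if p_values[0] < alpha:
            prereg_hits += 1
        # Flexible: any significant outcome gets written up as the finding.
        if min(p_values) < alpha:
            flexible_hits += 1
    return prereg_hits / n_studies, flexible_hits / n_studies


if __name__ == "__main__":
    prereg_rate, flexible_rate = simulate()
    print(f"pre-registered false positive rate: {prereg_rate:.3f}")
    print(f"flexible false positive rate:       {flexible_rate:.3f}")
```

Under these assumptions the pre-registered rate should land near the nominal 0.05, while the flexible rate should land near 1 - 0.95^5, or about 0.23. Real studies have correlated outcomes and many other forms of flexibility, so the exact figures mean little; the sketch only shows the direction of the effect.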

This issue of ‘on average’ had been nagging me for some time, but it was crystallised for me by James Kilner [6]. Actually, I believe this issue might get to the heart of the disagreement between the pro- and anti-preregistration camps. Pro-preregistrationers (I’m not sure that is actually a word) are primarily making an ‘on average’ argument. They argue that pre-registered articles ‘on average’ will be more likely to replicate than non-preregistered articles. This, of course, does not mean that, given one article that has since been replicated and one that turned out to be a false positive, you can tell which was pre-registered and which was not. The two ‘replicability’ distributions will undoubtedly overlap. Any individual paper must, as always, be judged ultimately on its own merits. Just as I may be more inclined to ‘believe’ an fMRI study where the results are significant following correction for multiple comparisons than a study that reports uncorrected effects, I may be more inclined to ‘believe’ a pre-registered study than a non-preregistered one. I might still read the pre-registered study and decide it is terrible based on other criteria though! Anti-preregistrationers (definitely not a word) are, perhaps justifiably, worried that their non-preregistered studies will be automatically dismissed as ‘non-truthy’. In reality, I don’t think this is likely to happen. Just as the scientific method is messy and chaotic, the way we read and judge published studies is messy and chaotic. We all have different criteria by which we judge papers. The introduction of pre-registration seems unlikely to change these habits, so why worry?

The problem is societal

Ultimately, the issue we currently face within the fields of psychology and cognitive neuroscience is societal. On this issue I am in agreement with Micah Allen: “My position is that the "crisis" has more to do with our publish-or-perish culture”. The dubious practices used by specific individuals are primarily a product of the scientific society in which we find ourselves. Again, I do not want to go into specifics, but the phrase “publish-or-perish” sums up the problem succinctly enough. As a young researcher I have felt the undeniable pressure to publish in ‘high-impact’ journals and publish there often. This pressure has never come directly from a supervisor or colleague, but simply from the ‘mood-music’ of science – the constant conversations about who got what published where, who got what fellowship to go where, who got what prize and why. I have been lucky enough to have great supervisors from undergraduate to post-doc. Perhaps others are not quite so lucky, but the pressure is there regardless of your supervisor.

Will pre-registration address this issue? In short, no. It is primarily a tool, which could increase the replicability of a small subset of studies but is unlikely to be adopted across the board (indeed, I don’t think anyone would argue it should be adopted across the board). It may have a small effect on our scientific culture in that it could change the ‘mood-music’, increasing awareness of these issues across labs and departments. The more people give voice to the problems we face, the less likely people are to use particular questionable practices. However, I would argue that this effect is largely a by-product of the debate surrounding pre-registration as opposed to pre-registration itself.

Final words

The format of Cortex’s pre-registration model has a lot to like about it. In particular, the requirement to formally state whether specific effects were predicted or came from post hoc observational analyses is great. It is difficult to argue that such a distinction would stifle scientific creativity, and it could readily be adopted in more journal formats. Pre-registration would ensure people had actually predicted specific effects prior to conducting the study, but let us be optimistic (just this once) and trust that scientists, when asked to conform to such a format, would be truthful (without the need to pre-register). In other words, this distinction should probably already happen and we shouldn’t need a pre-registration model to force our hand.

In relation to replication, I was lucky enough to recently find an unexpected, and therefore exciting, result in a behavioural experiment. I’m sure I could have published this result without further investigation; however, I spent the next month replicating the effect using the same analysis methods that I developed in the first experiment. This isn’t to congratulate myself on being a good scientist (I’m sure I’ve committed just as many sins as the next psychologist), but it is to say replication should occur within labs. Across-lab replication is more reassuring, but if we aren’t bothering to replicate, particularly when we find something unexpected, then it is little wonder that false positives creep into the literature.

To summarise this overly long first blog post, I thought I was going to take the middle ground in this argument. In actual fact, I have persuaded myself that pre-registration is probably a good thing. What I definitely believe is that it isn’t a bad thing, and we shouldn’t fear it. I have been selective in what I have discussed, and there are more arguments both for and against. These debates are healthy and productive. That, in my mind, is perhaps the biggest boon of the introduction of the pre-registration format – the open debate that could potentially nudge our current scientific culture towards valuing reliability and replicability, without an adverse effect on the creativity within our field.

Acknowledgements: Thanks to Micah Allen and James Bisby for commenting on a first draft and therefore giving me the guts to publish.
