Wednesday, 18 March 2015

Some thoughts on the UCL “Is Science Broken” debate

[Disclaimer: This was written the day after the event, and I took no notes. If I have misrepresented the opinions of any of the panel members then it is a result of my poor attention during, and poor memory following, the debate. My apologies to those involved if this is the case.]

For those that weren't aware, UCL hosted a talk and debate last night entitled “Is Science Broken? If so, how can we fix it?” Chris Chambers (@chriscd77) gave a talk about the recent introduction of Registered Reports in the journal Cortex. This was followed by a broader panel discussion on the problems facing science (and psychology in particular) and how initiatives, such as pre-registration, might be able to improve things. Alongside Chris, Dorothy Bishop (@deevybee), Sam Schwarzkopf (@sampendu), Neuroskeptic (@Neuro_Skeptic) and Sophie Scott (@sophiescott) took part in the debate, and David Shanks chaired.
First, I found Chris’ talk very informative and measured. Words such as “evangelist” are often bandied about on social media. Personally, I found him to be passionate about pre-registration but very realistic and honest about how pre-registration fits into the broader movement of “improving science”. He spent at least half of his talk answering questions that he has received following similar presentations over the last few months. I would guess about 90% of these questions were essentially logistical – “will I be able to submit elsewhere once I've collected the results?”, “couldn't a reviewer scoop my idea and publish whilst I’m data collecting?” It is obviously incumbent upon Chris, given he has introduced a new journal format, to answer these legitimate logistical questions clearly. I think he did a great job in this regard. I can’t help feeling that some of these questions come from individuals who are actually ideologically opposed to the idea, trying to bring about death by a thousand cuts. Often these questions implicitly compare pre-registration to an “ideal” scenario, rather than to the current status quo. As a result, Chris has to point out that their concern applies equally to the current publishing model. I may just be misreading, but if people are ideologically opposed to pre-registration I’d rather they just come out and say it instead of raising a million and one small logistical concerns.
On to the debate. This worked really well. It is rare to get five well-informed individuals on the same stage talking openly about science. There was a lot of common ground. First, everyone agreed there should be more sharing of data between labs (though the specifics of this weren't discussed in detail, so there may have been disagreement on how to go about doing this). Dorothy also raised legitimate ethical concerns about how to anonymise patient data to allow for data sharing. There was also common ground in relation to replication, though Chris and Neuroskeptic both cautioned against only replicating within-lab and pushed harder than Sophie for between-lab replication efforts.
Where I think there was disagreement was in relation to the structures that we put in place to encourage good practice (or discourage bad practice). On several occasions Chris asked how we were going to ensure scientists do what they should be doing (replicating, data sharing, not p-hacking etc.). Essentially it boils down to how much scope we give individual scientists to do what they want to do. Pre-registration binds scientists (once the Stage 1 submission has been accepted) to perform an experiment in a very specific way and to run specific statistical analyses on the data collected. This should (though we need data, as pointed out by both Chris and Sam) decrease the prevalence of certain issues, such as p-hacking or the file drawer problem. You can’t get away from the fact that it is a way of controlling scientists, though. I think some people find that uncomfortable, and to a certain extent I can understand why. However, what is key to pre-registration is that it is the scientists themselves who are binding their own hands. It is masochistic rather than sadistic. Chris isn't telling any individual scientist how to run their experiment, he is simply asking scientists to clearly state what they are going to do before they do it. Given the huge multivariate datasets we collect in cognitive neuroscience, giving individuals a little less wiggle room is probably a good thing.
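To make the p-hacking point concrete, here is a minimal simulation sketch of my own (nothing like this was shown at the debate): even when there is no real effect, picking the “best” of several plausible analyses pushes the false-positive rate well above the nominal 5%, which is exactly the wiggle room that committing to a single primary analysis removes.

```python
# Minimal sketch (my own illustration, not from the debate): simulate null data,
# then "p-hack" by taking the smallest p-value across several outcome measures.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def best_p_under_null(n_per_group=30, n_outcomes=5):
    """One null experiment with several outcome measures; return the smallest
    p-value across them, i.e. the result after picking the 'best' analysis."""
    a = rng.normal(size=(n_per_group, n_outcomes))
    b = rng.normal(size=(n_per_group, n_outcomes))  # no true group difference
    p_values = [stats.ttest_ind(a[:, i], b[:, i]).pvalue for i in range(n_outcomes)]
    return min(p_values)

n_sims = 5000
false_positive_rate = np.mean([best_p_under_null() < 0.05 for _ in range(n_sims)])
print(f"False-positive rate with 5 analysis options: {false_positive_rate:.2f}")
# Prints roughly 0.23 rather than the nominal 0.05.
```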
Sophie pointed out at the beginning of the debate that science isn't measured in individual papers. In one hundred years no-one will remember what we did, let alone that specific paper we published in Nature or Neuron (or Cortex). This is a reasonable point, but I couldn't quite see how it undermined the introduction of formats such as pre-registration. I don’t think anyone would claim a pre-registered paper is “truth”. The success (or failure) of pre-registration will be measured across hundreds of papers. The “unit” of science doesn't change as a result of pre-registration.

Where I found common ground with Sophie was in her emphasis on individual people rather than structures (e.g., specific journal formats). Certainly, we need to get the correct structures in place to ensure we are producing reliable, replicable results. However, whilst discussing these structural changes we should never lose sight of the fact that science progresses because of the amazingly talented, enthusiastic, nerdy, focussed, well-intentioned, honest, funny, weird, clever people who design the experiments, collect the data, run the statistics and write the papers. The debate wonderfully underlined this point. We had five individuals (and a great audience) all arguing passionately about science. It is that raw enthusiasm that gives me hope for the future of science, more than any change in journal format.

13 comments:

  1. I really hate online media - lost the comment I wrote because I wasn't logged in :( Trying again...

    Thanks for this summary. I'd like to point out that most of the questions & answers from Chris' talk are already published in this paper: http://orca.cf.ac.uk/59475/1/AN2.pdf

    I had in fact taken some notes about them because I don't think all the answers are satisfactory. However, the questions were supposed to be asked by the audience (and they were great!) and also I think the event wasn't supposed to be about debating pre-registration but about how to improve science in general.

    I may write a blog post (as me this time) at some point in which I discuss these points. However, I agree with you that going on about logistical issues is just nitpicking. In the end we need to try out the preregistration concept (both the standard model and the Registered Reports model Chris talked about). You can't know what good it does without any data.

    You may be right that some concerns are driven by ideological opposition. But at the same time I have actually heard people suggesting that unless experiments are preregistered they are worthless. I *am* ideologically opposed to that idea. I think it was pretty clear from yesterday's discussion that nobody on the panel believes this of course (and I might add that I never thought any of them did!).

    I hope the main point of my opening talk (had I not screwed up part of it :P) came across: considering that we have discussions like this, the answer to the question "Is science broken?" is a clear No (I believe you tweeted something to that effect too...)

    Replies
    1. I agree, I am ideologically opposed to the idea that non-prereg papers are worthless. As you state though, I'm not sure anyone has or would argue this so it is a bit of a non-starter. I also agree with your statement about science being in (relatively) good health. Some things have probably become worse (and steps are being introduced to address those issues, e.g., prereg) and some things have improved (e.g., data sharing, open access). I think you are correct in that we need to be brave in confronting what isn't going right, whilst not losing sight of the big picture.

  2. As I said, I have met people who honestly state that non-prereg experiments are worthless. However, I regard this mainly as trolling. Clearly nobody who actively promotes pre-reg (like Chris or EJ) has been arguing this.

    As for the new problems or the problems that aren't new but got worse (like publish & perish) I think they have largely to do with the challenges of a generally changing world. As Marty has put it in the past, there are just too many scientists. I don't agree that there can ever be too many (and I don't think he thinks so either) but I think the number of people and rapidly changing technology (both in science itself and publishing) are causing these problems. But I believe that these things are also the means with which we can solve them.

  3. Good post!

    I've never known anyone who says that non-prereg experiments are all worthless. Certainly I don't believe that! Although I would say that the problem with non-prereg evidence is that its worth is hard to determine. A paper can present the most compelling results but you don't know how many other analyses were run before those results appeared. Maybe none, maybe a few, maybe a hundred.

  4. Great blogpost Aiden.

    From what I heard yesterday it seems there is no 'binding' into a design/analysis in the sense that non-planned analyses are encouraged, but must be logged as non-planned. I'd personally be very happy to read a pre-reg article that says - we did the planned analyses, found nothing (short section), but then we realised... (much longer non-planned results section). That would not look bad to me. It would look honest.

    I had to leave before the wine so I didn't see discussion of:
    What happens when 4 or 5 reviewers (the numbers on my last two fMRI papers) at stage 2 suggest many vast post-hoc analyses based on novel, surprising findings that were not predicted at stage 1?

    Presumably the authors have the opportunity to log which analyses were suggested by reviewers and which were their own.

    A third of the analyses in my lab's recent paper Howard et al. 2014 Current Biology were due to very helpful reviewer suggestions in response to particular results we had found. I doubt these analyses would have been suggested at stage 1 of a pre-reg without knowing the results (but I could be wrong).

    Replies
    1. Good question. Would be interested to hear from Chris about this. My guess is that reviewers are welcome to suggest further exploratory analyses, but authors have more power to simply say no as they already have prelim acceptance. It could make that exchange more collegiate as reviewers have less power to sink a paper if authors don't want to do those further analyses.

    2. I think Chris already answered this (or a similar question) by saying that Registered Reports will really require editors that are more proactive than is commonly the case now (which I think he called 'clicking buttons'). If that works I think that can only be a good thing. Editors should be able to make a judgement call. As much as I appreciated that the review process at F1000 was all about the reviewers, I think the lack of real editorial decision-making is holding their model back.

      But on the plus side, I think if all reviews were public this would also further encourage proactive editing. Since everyone can read the reviews they can tell if the editing was lazy or biased or whatever.

      Regarding Hugo's other point, personally I think that even with Registered Reports the balance of exploratory and confirmatory results in most good studies is likely going to be at least 50:50. I think this is one of the things that has put me off pre-reg in many discussions. I agree that it is a good idea to declare a priori what you are planning to do but I think you also need robustness tests, exploring the data from a skeptical perspective. A lot of those ideas you are almost inevitably going to get *after* you have collected and seen the data.

    3. Yes, that's right - reviewers are of course welcome to suggest additional exploratory analyses at Stage 2 but authors wouldn't be required to conduct them unless doing so was necessary to adhere to the Stage 2 review criteria (available here: http://cdn.elsevier.com/promis_misc/PROMIS%20pub_idt_CORTEX%20Guidelines_RR_29_04_2013.pdf). One of the only situations I can imagine this happening would be if the authors wanted to make a claim about the interpretation of the results that the reviewers and editors believe wasn't justified without an additional analysis (this could violate criterion 5: "Whether the authors' conclusions are justified given the data"). This is basically Sam's point about robustness tests, which I agree with completely. Under such circumstances, the authors may be required to either report the analysis or remove the claim.

      In general, however, the authors will retain control in this situation -- there is no power for a reviewer to impose a particular exploratory analysis for its own sake. As an editor I would be particularly sympathetic to the authors in any disagreement because the requirement for data sharing means that a reviewer can easily conduct such an analysis themselves if they wanted to, and if they feel it reveals something important they could submit the outcome of that analysis as a comment.

      My instinct is that the opposite situation is more likely: that reviewers may object to exploratory analyses included by authors, particularly if the authors based their conclusions on them at the expense of the pre-registered analyses. This relates to S2 criterion 4 ("Whether any unregistered post hoc analyses added by the authors are justified, methodologically sound, and informative"). Disagreements in such cases are no different from those for standard unregistered papers, but they require careful and proactive editing.

      (As a footnote I must say I really enjoy editing Registered Reports - it's cool being able to watch the dialogue unfold between authors and reviewers at the outset, and then to track this all the way through to the final outcomes -- and like I said on Tuesday night, the tone of the interactions has been very constructive so far, much more so than my experience editing unregistered papers. The whole process has helped build my faith in something Sophie said - that we do this job, above all, because we love finding out new things. Wouldn't it be great if, by accepting papers in advance, we allowed scientists to recapture this, freeing them from the need to get "good results" and tell stories?)

    4. I think this may be one of the key points where our views differ:

      "In general, however, the authors will retain control in this situation -- there is no power for a reviewer to impose a particular exploratory analysis for its own sake."

      I think there are many situations where they should have that power. Of course, in the ideal situation this would have been predicted in the Stage 1 review, but I don't think that is realistically the case. Very often you just won't think of a possible confound until you see the results next to the methods. I think the two-stage review may help not just the authors but also the reviewers to think ahead more. That would certainly be a good thing. But I think except for the simplest designs there will always be things you can think of only when you see the data.

      I don't think it's generally fair to say that reviewers should just conduct their own analysis on the data if they feel something should be done. For me the threshold to reanalyse someone's data is extremely high. It is time-intensive and I already spend a lot of time on reviews as it is. I have to be seriously skeptical of some results to go that far (as with that telepathy paper). I think the primary responsibility lies with the authors to support their conclusions.

      I agree that a good editor should be able to make that judgement call though. The reason J Neuroscience banned supplementary materials was in part because reviewers kept asking for too many tangential analyses and control experiments. So perhaps there can be less of that after all. Not every paper has to answer all questions.

    5. @Sam - I'm not sure we disagree. If the exploratory analysis that the reviewer asks for is needed to address a confound or support the author's conclusions then it would be required under criterion 5. What I'm referring to are cases where reviewers ask for additional analyses for other reasons, e.g. to address additional questions that are not central to the point of the paper. This often happens in standard reviewing because a reviewer has a particular bent or interest, or wants to ask a different question of the data. In such cases, where reviewers want this and authors don't, I don't think it's unreasonable to expect reviewers (as authors of comments) to perform such analyses themselves on the open data rather than expecting the authors to do it in order to get published.

    6. Yeah as I said this sort of thing is what prompted JN to change their policies. I think anything that makes reviewers think twice about whether what they're asking for is actually necessary is a good thing.

      I think there is probably a relatively wide grey area though. This may really be what has always made me wary about prereg. I feel it may encourage reviewers, editors and authors to be too conservative about post-hoc robustness exploration. But this needn't necessarily be the case. Again, this is where public reviews would come in very handy. At least if that happens it is then readily apparent, and someone can spot it, use the data, and do the analysis themselves if they feel so inclined.

    7. I fully agree - open peer review (even if anonymised) is always a good thing. It's something I considered trying to package along with RRs when they were launched but the reality was that it was already a massive push into the unknown for Cortex so there wasn't a lot of enthusiasm for even more unknowns (at least that was the view at the time -- these things are continually open to revision and debate).

      You're right of course about the wide grey area -- which brings me back to the role of the editor. Navigating successfully through swathes of grey is pretty much a (good) editor's job regardless of whether a paper is RR or not, and it's a sad reality of modern publishing that editors have become a bit zombified at many journals, overstretched between too many editorial boards and unable to read papers or offer any critical insight. Dorothy Bishop has a classic blog post about this: http://deevybee.blogspot.co.uk/2010/09/science-journal-editors-taxonomy.html

    8. Oh yes totally. You need to do one thing at a time. Changing the publication structure to RR and at the same time having public reviews (I am totally happy with keeping names anonymous personally) would have been too ambitious.
