ReScience: ensuring that the original research is reproducible
Reproducibility is a cornerstone of science: the results obtained by researcher A must be identical to the results obtained by researcher B provided they follow identical protocols and use identical reagents.
In reality, multiple factors can lead to irreproducible results. They include poor training of researchers in experimental design; increased emphasis on making provocative statements rather than presenting technical details; and publications that do not report basic elements of experimental design. Initiatives that address reproducibility issues are therefore indispensable for scientific progress.
We are happy to present this guest post by Nicolas Rougier from ReScience – a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research is reproducible.
The ReScience initiative
In March 2015, Nicolas Rougier and his colleagues published a commentary in the journal “Frontiers in Computational Neuroscience” that highlighted the difficulties they encountered when trying to replicate a model from the literature. The sources were not available in a public repository (they had to be requested from one of the authors), the code was not under version control, the description of the model contained some factual errors and ambiguities, and the sources were 6000 lines long. In the end, only direct contact with the original authors (two of whom were also authors of the commentary) allowed them to replicate the model faithfully. But the whole process took approximately three months, which is hardly acceptable. This commentary generated an unusually large number of comments from the community, most of which shared the same feeling about the state of replication in the field of computational neuroscience. Among the commenters, Konrad Hinsen extended the scope of the problem to replicability in computational science in general and also pointed out the issue of computational platforms and their stability. After exchanging some emails, we decided to team up to address the problem explicitly, and the idea of ReScience was born.
As explained on the journal website, ReScience is a peer-reviewed journal that targets computational research and encourages the explicit reproduction of already published research, promoting new and open-source implementations in order to ensure that the original research becomes reproducible. As stated by Thomas Arildsen (one of the associate editors) in his blog, ReScience is different from other journals for three reasons:
1. ReScience targets replication
Publication of the replication of some known result is rather rare in the scientific landscape, even though it recently received some attention in the context of experimental psychology with the reproducibility project by the Open Science Framework, which shows the limits of reproducibility. In computational science, the issues are somewhat different. There is no experimental error in computation, so replication should in principle be exact and verifiable. This is hardly the case though.
But this can be fixed. During the course of a PhD, students often try to replicate results from the literature as a kind of warm-up, possibly interacting with the original authors. Such a replication generally stays on the hard drive of the student’s computer, although it would actually be useful to the whole scientific community. ReScience, quite naturally, proposes to review and publish these results.
2. ReScience lives on GitHub
The ReScience editing chain is radically different from that of a traditional scientific journal. ReScience lives on GitHub, where each new implementation is made available together with comments, explanations and tests. Each submission takes the form of a pull request that is publicly and interactively reviewed and tested in order to guarantee that any researcher can re-use it.
3. ReScience is open by design:
– Editor and reviewers are known
– Anybody can interact with the review process
– Questions are asked in the open using the issue tracker
– Anything can be changed using pull requests (from website to template repository)
What is replication?
The idea behind replication is quite simple: read a paper describing a computational result and try to re-implement the method, model or analysis in order to get the same quantitative or qualitative results. If you fail, you may need to look at the original code, if available, and/or contact the original authors for further details. Then, if you manage to replicate the results with your new implementation, this means the original research is reproducible. But how long will this implementation last in the rapidly evolving hardware/software landscape? One year? Five years? Ten years? This is why we ask authors to write an accompanying article explaining their implementation: what they had to change, add, remove, etc. Then, ten years from now, the original article and this new article should be sufficient for anybody to re-implement the research. The strongest point of a ReScience article is not the code (even if it is incredibly useful for further understanding) but the prose.
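As a rough illustration of what "the same quantitative results" can mean in practice, here is a minimal Python sketch that compares values produced by a new implementation with values reported in an original paper, within a tolerance. The function name, the tolerances and the numbers are hypothetical and chosen only for this example; they do not come from ReScience or from any published article.

```python
import math


def matches_reference(replicated, reference, rel_tol=1e-6, abs_tol=0.0):
    """Return True if each replicated value is close to its reference value.

    Both arguments are sequences of floats in the same order.
    The tolerances are assumptions: pick them per study.
    """
    if len(replicated) != len(reference):
        return False
    return all(
        math.isclose(new, old, rel_tol=rel_tol, abs_tol=abs_tol)
        for new, old in zip(replicated, reference)
    )


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    reference_values = [0.125, 0.250, 0.500]
    replicated_values = [0.125, 0.250, 0.5000000001]
    print(matches_reference(replicated_values, reference_values))
```

A check like this covers only the quantitative case; qualitative agreement (for example, the same shape of a curve) still has to be argued in the accompanying article.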
Where do we go from here?
ReScience is a very young journal. It officially started on September 1st, 2015, and at the time of writing only one paper has been published. This paper served mainly as a test-bed for the whole review process. Thanks to the help of Tiziano Zito, Mehdi Khamassi and Benoit Girard, we’ve been able to smooth out the whole review process, even if it may be revised later with additional feedback from editors, reviewers and authors. We also received a lot of help from the community and, to date, 25 reviewers have already volunteered to review ReScience articles in various domains (and you can volunteer too).
Some people also corrected the website, while others fixed the submission template. Furthermore, a lot of interesting issues have been raised on the issue tracker, but not all the questions have been answered, because some are tricky ones that we did not think of when we started the journal. Each domain of computational science has its own specificities that we need to be aware of in order to address the various questions. Hopefully, with the help of the community, we will be able to serve all of computational science.
Finally, ReScience seems to fill a gap in the scientific landscape. We do not have long-term visibility, but we do hope the journal will offer a place for collaboration between people. We tried to make it as open as possible and we welcome any suggestions. And of course, you’re free to fork the whole journal if you want to start your own using the same model.