CODECHECK tackles one of the main challenges of computational research by supporting codecheckers with a workflow, guidelines and tools to evaluate computer programs underlying scientific papers. The independent time-stamped runs conducted by codecheckers will award a “certificate of executable computation” and increase availability, discovery and reproducibility of crucial artefacts for computational sciences. See the CODECHECK paper for a full description of problems, solutions, and goals and take a look at the GitHub organisation for examples of codechecks and the CODECHECK infrastructure and tools.
The CODECHECK principles
- Codecheckers record but don’t investigate or fix.
More about this principle...The codechecker follows the author’s instructions to run the code. If instructions are unclear, or if code does not run, the codechecker tells the author. We believe that the job of the codechecker is not to fix these problems but simply to report them to the author and await a fix. The level of documentation required for third parties to reproduce a workflow is hard to get right, and too often this uncertainty leads researchers to give up and not document it at all. The conversation with a codechecker fixes this problem. Codecheckers take the pictures at a crime scene, they do not hunt the criminal.
- Communication between humans is key.
More about this principle...Some code may work without any interaction but often there are hidden dependencies that need adjusting for a particular system. Allowing the codechecker to communicate directly and openly with the author make this process as constructive as possible; routing this conversation (possibly anonymously) through a publisher would introduce delays and inhibit community building.
- Credit is given to codecheckers.
More about this principle...The value of performing a CODECHECK is comparable to that of a peer review, and it may require a similar amount of time. Therefore, the codechecker’s activity should be recorded, ideally in the published paper. The public record can be realised by publishing the certificate in a citable form (i.e., with a DOI), by listing codecheckers on the journal’s website or, ideally, by publishing the checks alongside peer review activities in public databases. Codechecks are an excellen opportunity to involve early career researchers (ECRs) or research software engineers (RSEs) in peer review.
- Workflows must be auditable.
More about this principle...The codechecker should have sufficient material to validate the workflow outputs submitted by the authors. Stark calls this "preproducibility" and the ICERM report defines the level "Auditable Research" similarly. Communities can establish their own good practices or adapt generic concepts and practical tools, such as publishing all building blocks of science in a research compendium (cf. https://research-compendium.science/) or repro-pack. A completed check means that code could be executed at least once using the provided instructions, and, therefore, all code and data was given and could be investigated more deeply or extended in the future. Ideally, this is a “one click” step, but achieving this requires particular skills and a sufficient level of documentation for third parties. Furthermore, automation may lead to people gaming the system or reliance on technology, which can often hide important details. All such aspects can reduce the understandability of the material, so we estimate our approach to codechecking, done without automation and with open human communication, to be a simple way to ensure long-term transparency and usefulness. We acknowledge that others have argued in favour of bitwise reproducibility because, in the long run, it can be automated, but until then we need CODECHECK’s approach.
- Open by default and transitional by disposition.
More about this principle...Unless there are strong reasons to the contrary (e.g., sensitive data on human subjects), all code and data, both from author and codechecker, will be made freely available when the certificate is published. Openness is not required for the paper itself, to accommodate journals in their transition to Open Access models. The code and data publication should follow community good practices. Ultimately we may find that CODECHECK activities are subsumed within peer review.
These basic principles ensure they are feasible to add in a scholarly communication process but still have a huge positive impact on the transparency and usefulness of the published material. They strike a balance between the ideals of auditable high-quality research software and the reality of publication pressure and only slowly changing academic evaluation practices. Of course, numerous requirements on openness/transparency (e.g. depositing the CODECHECK report publicly with a DOI), about software quality (tests, releases, documentation), on copyright/licensing, and regarding best practices for computer-based analyses (e.g. workflow management, data/software citation) are thinkable, but intentionally remain to be defined by implementations of the principles in each community of practice. While the CODECHECK initiators strongly support of Open Science, a CODECHECK does not exclude research not falling into your definition of Open Science.
In the future we hope to update these principles and to work together with researchers, educators, editors, and publishers to raise the bar towards higher degrees of reproducibility and openness across all domains and communities of research.
The principles can be implemented in different ways. See the process page for details about the stakeholders and dimensions of variations in CODECHECKs within a scholarly peer review. The CODECHECK community process describes a concrete realisation, including practical requirements and steps.
If you want to get involved as a codechecker in the community, or if you want to apply the CODECHECK principles in your journal or conference, please take a look at the Get Involved page.
2021-03 | F1000Research preprint
A preprint about CODECHECK was published at F1000Research and is now subject to open peer review. It presents the codechecking workflow, describes involved roles and stakeholders, presents the 25 codechecks conducted up to today, and details the experiences and tools that underpin the CODECHECK initiative. We welcome your comments!
Nüst D and Eglen SJ. CODECHECK: an Open Science initiative for the independent execution of computations underlying research articles during peer review to improve reproducibility [version 1; peer review: awaiting peer review]. F1000Research 2021, 10:253 (https://doi.org/10.12688/f1000research.51738.1)
2020-06 | Nature News article
A Nature News article by Dalmeet Singh Chawla discussed the recent CODECHECK
#2020-010 of a simulation study, including some quotes by CODECHECK Co-PI Stephen J. Eglen and fellow Open Science and Open Software experts Neil Chue Hong (Software Sustainability Institute, UK) and Konrad Hinsen (CNRS, France).
Singh Chawla, D. (2020). Critiqued coronavirus simulation gets thumbs up from code-checking efforts. Nature. https://doi.org/10.1038/d41586-020-01685-y
2019-11 | MUNIN conference presentation
Stephen Eglen presented CODECHECK at The 14th Munin Conference on Scholarly Publishing 2019.
Follow, share, cite
To stay in touch with the project, follow the project team members on Twitter:
To give a quick overview of the project, feel free to use or extend the existing slide decks.
To cite CODECHECK in scientific publications, please use the following citation/reference:
Eglen, S., & Nüst, D. (2019). CODECHECK: An open-science initiative to facilitate the sharing of computer programs and results presented in scientific publications. Septentrio Conference Series, (1). https://doi.org/10.7557/5.4910