A process for independent reproduction of computations underlying research
CODE CHECK tackles one of the main challenges of computational research by supporting codecheckers with a workflow, guidelines and tools to evaluate computer programs underlying scientific papers. The independent, time-stamped runs conducted by codecheckers lead to the award of a “certificate of reproducible computation” and increase the availability, discoverability and reproducibility of crucial artefacts for computational sciences. See the project page for a full description of problems, solutions, and goals, and check out the GitHub organisation for code and examples.
The CODE CHECK principles
- Codecheckers record but don’t investigate or fix.
A codechecker is _not_ required to fix workflows, but to document the given state of documentation and executability. Of course, given sufficient interest and skills, a codechecker may go beyond simple small fixes and actively collaborate with an author to create a better research output. The codechecker's report provides helpful input to the scientific review, e.g., to help the reviewer's understanding. But a CODE CHECK does not evaluate scientific merit! A failed check does not imply the rejection of a submission. Codecheckers take the pictures at a crime scene; they do not hunt the murderer.
- Communication between humans is key.
The priority in all documentation and metadata is that a human codechecker can understand them. The codechecker is _not_ making a scientific judgement. It is also close to impossible to make a codecheck blind. Therefore a CODE CHECK must not be anonymised and must provide a two-way means of communication between author and codechecker. Codecheckers are supported by formal metadata, automation, and reproducibility infrastructure, yet the check shall not rely on them. Codechecks may be conducted by existing stakeholders in the submission process (e.g., a reviewer), but may also be handled with new roles and by people underrepresented in classic peer review, such as early career researchers (ECRs) or research software engineers (RSEs).
- Credit is given to codecheckers.
Software and its review are crucial for research in the age of digitisation, so the contribution to the scientific body of knowledge in the form of a check gets the credit it deserves. If a CODE CHECK was conducted as part of a review process, (a) the publisher ensures proper credit at the level given to scientific reviewers, e.g. by listing the codechecker on an article or journal page (with number of reviews) or by depositing metadata to public databases (e.g., CrossRef, Publons), and (b) a sentence is added to the methods section mentioning that a CODE CHECK occurred and naming the reviewer. The deposited metadata includes the codechecker's ORCID, time, journal, and (if published) the article DOI. This principle intentionally does not regulate if/how the output of the CODE CHECK is deposited and who does it. Ideally, though, the contribution made by the codechecker is openly published in the form of a DOI-able artifact, and the sentence in the methods section links to it with a simple hyperlink/URL.
- Workflows must be auditable.
Common sense and a collaborative process are the main drivers behind the level of documentation, the degree of openness, and the amount of data that is checked. But the minimal requirement is that the codechecker has enough material to validate the workflow submitted by the authors. This means the code can be executed once by following the provided instructions, and selected outputs, e.g. figures or data files, are created. Ideally the execution is fully scripted and can be triggered by running a single command. Being executed once means that a detailed investigation may occur at a later time. Being auditable includes that authors provide data and code for relevant analysis steps and visualisations to the codecheckers, but does not imply that all of the code associated with an article must be checked. The check is not automated on purpose: automation may just lead to people gaming the system, and may hide details, which eventually decreases the level of certainty that a codechecker has in their assessment.
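As an illustration of the minimal requirement in the last principle, the following sketch shows how a codechecker might trigger a scripted workflow with a single command and record, with a time stamp, which of the selected outputs were created. It is not part of the CODE CHECK tooling: the function name `record_check`, the report fields, and the file names `figure1.png` and `table1.csv` are assumptions made for this example. In line with the first principle, the sketch only records what happened and does not try to fix or investigate a failure.

```python
import json
import subprocess
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def record_check(command, expected_outputs, workdir):
    """Run the authors' command once and record what happened.

    Hypothetical helper: it documents the exit status and which of the
    selected outputs exist afterwards, and never raises on a failed run.
    """
    workdir = Path(workdir)
    started = datetime.now(timezone.utc).isoformat()
    result = subprocess.run(command, cwd=workdir, capture_output=True, text=True)
    return {
        "command": command,
        "started_utc": started,
        "exit_code": result.returncode,
        # True if the file was created, False if it is missing.
        "outputs": {name: (workdir / name).is_file() for name in expected_outputs},
    }


if __name__ == "__main__":
    # Stand-in for the authors' workflow: a command that writes one "figure".
    with tempfile.TemporaryDirectory() as tmp:
        fake_workflow = [sys.executable, "-c", "open('figure1.png', 'w').write('plot')"]
        report = record_check(fake_workflow, ["figure1.png", "table1.csv"], tmp)
        print(json.dumps(report, indent=2))
```

A report like this is only a supporting record for the human codechecker. It does not replace reading the instructions and inspecting the outputs, since the check is deliberately not automated.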
These basic principles are feasible to add to a scholarly communication process but still have a huge positive impact on the transparency and usefulness of the published material. They strike a balance between the ideals of auditable high-quality research software and the reality of publication pressure and only slowly changing academic evaluation practices. Of course, numerous requirements on openness/transparency (e.g. depositing the check result publicly with a DOI), about software quality (tests, releases, documentation), on copyright/licensing, and regarding best practices for computer-based analyses (e.g. workflow management, data/software citation) are conceivable, but intentionally remain to be defined by implementations of the principles in each community of practice. While the CODE CHECK initiators strongly support Open Science, a CODE CHECK does not exclude research that falls outside your definition of Open Science.
Check out our FAQ page for more information about the limitations of a codecheck.
In the future we hope to update these principles and to work together with researchers, educators, editors, and publishers to raise the bar towards very high reproducibility and openness across all domains and communities of research.
The principles can be implemented in different ways. See the process page for details about the stakeholders and the dimensions along which a codecheck can vary within scholarly peer review. The community process describes a concrete realisation including practical requirements and steps.
If you want to get involved as a codechecker in the community or want to apply the codecheck principles in your journal or conference, see the Get Involved page.
Stephen Eglen presented CODE CHECK at The 14th Munin Conference on Scholarly Publishing 2019. You can take a look at the poster and the slides, or watch the video recording.
To stay in touch with the project follow the project team members on Twitter:
To give a quick overview of the project, feel free to use or extend the existing slide decks.
To cite CODE CHECK in scientific publications please use the following citation/reference:
Eglen, S., & Nüst, D. (2019). CODECHECK: An open-science initiative to facilitate sharing of computer programs and results presented in scientific publications. Septentrio Conference Series, (1). https://doi.org/10.7557/5.4910