How to make software more robust
Scientific quality and reproducibility rely on the traceability of the experimental data, statistical methods and bioinformatics tools used to generate results. Unfortunately, failing to replicate and validate scientific results is very common. This reproducibility crisis, as described by Monya Baker, considerably slows research progress and affects all fields, including chemistry, biology and medicine.
Best practices are urgently needed to improve the reproducibility of data analysis, which in turn requires software robust enough to be run by any user.
Indeed, most of the software tools used to produce scientific results and publications are prototypes that lack robustness. Usually designed and run by a single person in a specific computing environment, such code can be very difficult for other people to use on their own data, and it is too often abandoned after publication. Last month, Morgan Taschuk and Greg Wilson published Ten simple rules for making research software more robust, a quick guide to mastering the key challenge of robustness in software engineering.
What is “robust” software?
The authors define robust software as “software that works for people other than the original author and on machines other than its creator’s.” This means that “it can be installed on more than one computer with relative ease, it works consistently as advertised, and it can be integrated with other tools.”
Increasing software robustness is a key concern for software developers and for all users who want to produce replicable and reproducible results and publish their work. Improving software robustness takes only the effort of following these ten simple rules, summarized in the list below:
1. Use version control
2. Document your code and usage
3. Make common operations easy to control
4. Version your releases
5. Reuse software (within reason)
6. Rely on build tools and package managers for installation
7. Do not require root or other special privileges to install or run
8. Eliminate hard-coded paths
9. Include a small test set that can be run to ensure the software is actually working
10. Produce identical results when given identical inputs
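Several of these rules can be illustrated in a few lines of code. The following is a minimal Python sketch, not taken from the paper, of a toy line-subsampling tool. All names in it (subsample, build_parser, self_test, the option names and the version string) are hypothetical and chosen only for illustration. It shows a version identifier (rule 4), settings exposed as command-line options instead of hard-coded paths (rules 3 and 8), a tiny built-in check (rule 9) and a seeded random generator so that identical inputs give identical results (rule 10).

```python
import argparse
import random

__version__ = "0.1.0"  # hypothetical release number (rule 4)


def build_parser():
    """Expose common settings as options; no paths are hard-coded (rules 3, 8)."""
    parser = argparse.ArgumentParser(description="Toy line-subsampling tool")
    parser.add_argument("--input", required=True, help="path to the input file")
    parser.add_argument("--output", required=True, help="path to the output file")
    parser.add_argument("--count", type=int, default=10, help="lines to keep")
    parser.add_argument("--seed", type=int, default=42, help="random seed")
    parser.add_argument("--version", action="version", version=__version__)
    return parser


def subsample(values, k, seed):
    """Pick k items with a seeded generator so runs are repeatable (rule 10)."""
    rng = random.Random(seed)
    return rng.sample(values, k)


def self_test():
    """Small check that the tool is actually working (rule 9)."""
    data = list(range(100))
    first = subsample(data, 5, seed=1)
    second = subsample(data, 5, seed=1)
    return first == second and len(first) == 5


def main(argv=None):
    """Read lines from --input and write a reproducible subsample to --output."""
    args = build_parser().parse_args(argv)
    with open(args.input) as handle:
        lines = handle.read().splitlines()
    chosen = subsample(lines, min(args.count, len(lines)), args.seed)
    with open(args.output, "w") as handle:
        handle.write("\n".join(chosen) + "\n")
```

Calling main(["--input", "reads.txt", "--output", "subset.txt"]) with the same input file and seed should write the same subset on any machine running the same Python version, and self_test() gives users a quick way to confirm the installation works. The file names here are placeholders.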
How OMICtools promotes software quality and traceability
We have developed several strategies to promote better quality of bioinformatics resources and reproducibility of computational analysis.
First, we promote the citation of bioinformatics resources and the identification of exact code versions, for the reproducibility and traceability of biological data analysis. We bring together thousands of software tools in a single place where any user can find all the relevant information needed to choose and use the right program. Our search engine offers an easy way to list the tools dedicated to a specific question and analysis function. Moreover, citations and references are specified for each tool, along with its successive versions and obsolete links, to make it easier to survey bioinformatics tools.
Secondly, OMICtools is a collaborative repository platform that facilitates the development, maintenance and follow-up of bioinformatics tools by programmers themselves. Software developers can upload their source code directly to the OMICtools server so that the community can easily locate it. In addition to the research resource identifier (RRID) attributed to each OMICtools resource, each published source code version gets a unique digital object identifier (DOI). A DOI provides interoperable exchange with other digital resources and persistent identification, even if the material is moved or rearranged. Software developers indicate the version of the source code, the operating system and architecture, as well as the publication, to link the code and program access to DataCite’s API, which automatically generates the corresponding DOI. They can modify and update their own project by providing new code versions. Moreover, OMICtools is implementing a dedicated GitLab service. On their GitLab page, programmers will be able to modify and update their own projects and work together to test, build, consolidate and deploy their code.
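As an illustration of the details a developer reports with each code version (version, operating system, architecture and publication), here is a minimal Python sketch that gathers them into one record. It is not the OMICtools submission format or the DataCite metadata schema: the function describe_release and the field names are hypothetical, and the version and publication values are placeholders.

```python
import json
import platform


def describe_release(version, publication):
    """Gather the details reported alongside a code version.

    The field names are hypothetical; they do not follow any official schema.
    """
    return {
        "version": version,
        "operating_system": platform.system(),  # e.g. "Linux" or "Darwin"
        "architecture": platform.machine(),     # e.g. "x86_64" or "arm64"
        "python": platform.python_version(),
        "publication": publication,
    }


# Placeholder values, for illustration only.
record = describe_release("1.2.0", "citation placeholder")
print(json.dumps(record, indent=2, sort_keys=True))
```

Storing such a record next to each release makes it easier to state exactly which code version and environment produced a given result.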
3 good reasons to upload your code versions on OMICtools
Based on the following recent papers:
(Taschuk and Wilson, 2017) Ten simple rules for making research software more robust. PLoS Computational Biology.
(Baker, 2016) 1,500 scientists lift the lid on reproducibility. Nature.