Course for reproducible research

Computational Reproducibility

Structure:

  1. 1-hour presentation about why and how: principles, background, introduction to best practices, etc.
  2. QnA, researchers' own stories, opinions, questions and discussion (20 min.)
  3. Hands-on practices to implement the principles: a “Scientific Software Management Plan” for your project/software
  4. Follow-up: final comments and discussion (20 min.)

Structure of 1h presentation:

  1. Why are we here? As researchers? Public goods. Knowledge.
  2. Reproducible vs. replicable: definitions
  3. Reproducibility in the computational sciences: some examples for good and bad: Reinhart and Rogoff
  4. Why: research ethics; derived benefits (e.g. Donoho)
     • Improved work and work habits
     • Improved teamwork
     • Greater impact
     • Greater continuity and cumulative impact
  5. Levels of reproducibility
  6. Trustworthiness, e.g. Hook and Kelly, 2009
     • Developing (methods, peer review, bug/issue tracking, version control)
     • Testing (what can be tested? manual vs. automatic)
     • Validation (comparing with other methods, theoretical results, bounds, etc.)
     • Sharing (platform, documentation, readme, reproduce figures and data)
  7. Reproducibility and the Danish Code-of-Conduct
  8. Life-cycle of code
  9. Additional thoughts on best practices: Ten simple rules for reproducible computational research, Best Practices for Scientific Computing, Good enough practices in scientific computing
  10. Scientific Software Management Plan, based on the idea of a Data Management Plan, Software Management Plan, or Software Management Plan v1.0

Computational Reproducibility

Reproducibility is a centerpiece of the scientific method. Its importance can hardly be overstated. After all, what is the value of a described method if it cannot be reproduced? On a scientific level it creates uncertainty about the truth of the claimed results. As a society we may make poor political, organizational and individual decisions based on false claims, and spend additional research resources to verify or falsify the original scientific claim; both consequences are a poor use of resources.

The same goes for reproducibility in the computational sciences, where the discussion was reignited more than 20 years ago [Buckheit & Donoho, 1995]. The problem is as old as computing itself: humans make errors, and these, along with misconceptions, creep into our computer programs as well. At every level people are greatly concerned, and billions of dollars are spent every year on removing bugs, improving software development, consulting and risk minimization. Have you ever experienced a computer doing something weird? So when a researcher makes a claim about a certain method and results obtained by a computer, is it correct or false? Sometimes results are falsified, with dire consequences [Editorial, 2011]. But in general, an answer might be difficult, if not impossible, to give. However, we may assess another quality: are the results trustworthy? [Hook and Kelly, 2009]

Level of trustworthiness

Instead of discussing true or false as a binary/Boolean decision, trustworthiness lies on a scale. A number of aspects may increase or decrease the trustworthiness of particular results, among them: 1) is the software development practised systematically and rigorously? 2) do the authors share the underlying software?

Regarding 1): Consider the following: say that a researcher follows a systematic and rigorous data collection scheme, e.g. via lab experiments. Is the researcher equally systematic and rigorous in developing the software used to process the obtained data? Errors could slip in at both stages. This is also a fundamental problem of training, as a researcher in almost any field is likely far better equipped and educated in the best practices of his/her own field of research than in software development. Furthermore, how and to what extent was the software tested?
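To make the testing question concrete, below is a minimal sketch of an automatic test in the pytest style; the function standardize is a hypothetical example of a small analysis step, not part of the course material.

```python
import numpy as np

def standardize(x):
    """Return x shifted and scaled to zero mean and unit standard deviation."""
    return (x - x.mean()) / x.std()

def test_standardize():
    rng = np.random.default_rng(seed=0)            # fixed seed keeps the test deterministic
    x = rng.normal(loc=5.0, scale=2.0, size=1000)  # synthetic "measurements"
    z = standardize(x)
    assert np.isclose(z.mean(), 0.0)               # mean removed
    assert np.isclose(z.std(), 1.0)                # unit standard deviation
```

Running pytest on such a file executes the test automatically, and it can be re-run after every change to the analysis code, which is one way to document how and to what extent the software was tested.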

Regarding 2): This may be new to many and can be explained with another question: what do you think of a researcher who shares the software used to obtain the results and claims associated with a particular publication? What do you think of a researcher who does not share? In whom do you put your trust?

To expand on point 2) and openly sharing software: a computer is perhaps the most standardized piece of equipment humans have ever produced. With the correct information and a little care from the programmer, there is no reason why another researcher should not be able to exactly reproduce the results you had on the screen when writing the publication. Some even argue that “an article about a computational result is advertising, not scholarship. The actual scholarship is the full software environment, code and data, that produced the result”, paraphrasing John Claerbout, an earth scientist at Stanford [Buckheit & Donoho, 1995].
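One practical way to provide “the correct information” is to record the exact software environment together with the results. Below is a minimal sketch, assuming a Python-based analysis; the output file name environment.json and the listed packages (numpy, matplotlib) are placeholders for whatever your analysis actually uses.

```python
import json
import platform
from importlib import metadata

# Record the interpreter, operating system and exact package versions that
# produced the results, so another researcher can recreate the same setup.
env = {
    "python": platform.python_version(),
    "platform": platform.platform(),
    "packages": {pkg: metadata.version(pkg) for pkg in ("numpy", "matplotlib")},
}

with open("environment.json", "w") as f:
    json.dump(env, f, indent=2)
```

The same information can also be captured with tools such as pip freeze or a conda environment file; the point is that it is stored and shared together with the code and data.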

There are several reasons why sharing the software supporting the results in a publication is beneficial, even from an individual perspective.

  • The quality of the code increases when you know other people will see it. This is extra work [Donoho 2010], but it also improves work habits and may actually pay off over time.
  • It increases the impact of the publication [Vandewalle et al. (2009)]. The motivation and argument are clear: if you had the choice between downloading a piece of software from the internet for a research purpose and being up and running in maybe an hour, or spending days and nights implementing a poorly described method from a publication, what would you do? What do you think other researchers do?

As a senior level researcher, you may also find improved teamwork among staff and greater continuity of work as likely consequences of advocating reproducible research [Donoho 2010].

Levels of reproducibility

Similar to trustworthiness, we may also describe reproducibility on a scale [Vandewalle et al. (2009)]:

  1. The results cannot be reproduced by an independent researcher.
  2. The results cannot seem to be reproduced by an independent researcher.
  3. The results could be reproduced by an independent researcher, requiring extreme effort.
  4. The results can be reproduced by an independent researcher, requiring considerable effort.
  5. The results can be easily reproduced by an independent researcher with at most 15 minutes of user effort, requiring some proprietary source packages (MATLAB, etc.).
  6. The results can be easily reproduced by an independent researcher with at most 15 min of user effort, requiring only standard, freely available tools (C compiler, etc.).

So, to aim for the highest level of reproducibility we should: make data and code available, provide usable documentation such that an independent researcher can make efficient use of the code, and provide the code that generates all data and figures supporting the scientific claim. Furthermore, to the extent possible, researchers should share using freely available tools.
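To make “code generating all data and figures” concrete, here is a minimal sketch of a single entry-point script that regenerates a figure from scratch; the function make_figure_1, the figures/ output folder and the synthetic data are hypothetical placeholders for your own analysis.

```python
import os
import numpy as np
import matplotlib
matplotlib.use("Agg")                       # render to file, no display required
import matplotlib.pyplot as plt

def make_figure_1(outdir="figures"):
    """Regenerate figure 1 of the (hypothetical) publication from scratch."""
    rng = np.random.default_rng(seed=42)    # fixed seed: the figure is identical on every run
    x = np.linspace(0.0, 1.0, 100)
    y = x**2 + 0.05 * rng.normal(size=x.size)
    fig, ax = plt.subplots()
    ax.plot(x, y, ".")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    fig.savefig(os.path.join(outdir, "figure_1.pdf"))

if __name__ == "__main__":
    os.makedirs("figures", exist_ok=True)
    make_figure_1()
```

If such a script is shipped with the publication, an independent researcher only needs a single command to rebuild every figure, which is exactly what the 15-minute levels of the scale above ask for.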

Danish Code of Conduct for Research Integrity

Each researcher must decide which level of the reproducibility scale he/she would like to be on. However, the Danish Code of Conduct for Research Integrity [Code of Conduct], which guides research practices, also gives guidelines on reproducibility. The most relevant statements to this end are the responsibilities in connection with data management. Data are defined as

“Data are detailed records of the primary materials that comprise the basis for the analysis that generates the results”

The responsibility is

“Primary materials and data should be retained, stored and managed in a clear and accurate form that allows the result to be assessed, the procedures to be retraced and - when relevant and applicable - the research to be reproduced. The extent to which primary materials and data are retained and the recommended retaining period should always be determined by the current practices applicable to the specific field of research. However, data should in general be kept for a period of at least five years from the date of publication.”

and

“Researchers are responsible for storing their primary materials and data.”

Depending on access requirements, licensing, size, etc., there are many options for storing primary materials and data. If possible, a simple approach is to store data along with the publication itself when the publication is registered at AAU's research portal VBN (vbn.aau.dk). More elaborate data management control can be obtained by registering data at CLAAUDIA, see [?Karsten?], and possibly linking the publication registered at VBN with the data hosted at a different location. Some publishers also offer archiving of data along with the publication. Consider your options and make a well-informed choice.

References

J. B. Buckheit and D. L. Donoho (1995). “WaveLab and Reproducible Research”. In: Lecture Notes in Statistics: Wavelets and Statistics. Ed. by A. Antoniadis and G. Oppenheim. Springer-Verlag, pp. 55-81.

D. Hook and D. Kelly (2009). “Testing for Trustworthiness in Scientific Software”. In: ICSE Workshop on Software Engineering for Computational Science and Engineering (SECSE’09). Vancouver, Canada, pp. 59-64. doi: 10.1109/SECSE.2009.5069163.

P. Vandewalle, J. Kovacevic, and M. Vetterli (2009). “Reproducible Research in Signal Processing [What, why, and how]”. In: IEEE Signal Processing Magazine, pp. 37-47.

D. L. Donoho (2010). An invitation to reproducible computational research, Biostatistics, 11(3), pp. 385-388, https://doi.org/10.1093/biostatistics/kxq028

Editorial (2011). “Devil in the Details”. In: Nature 470, pp. 305-306. doi:10.1038/470305b.

Ministry for Higher Education and Science (2014), “Danish Code of Conduct for Research Integrity”, ISBN: 978-87-93151-36-9

Additional materials

Data One Webinar

Coderefinery

Carpentries inspired

Machine Learning and Reproducibility Checklist