04 December 2024

“Pay me now or pay me later” in reproducibility

“Reproducibility debt” is an interesting and useful take on the matter of reproducibility and replication. I stumbled across a discussion on a recent podcast.

What I like about this phrasing is the contrast it offers. A lot of discussion around reproducibility focuses on bad practices, such as p-hacking and HARKing. Framing reproducibility issues as debt makes it more obvious that what we are talking about are trade-offs.

You might have a more reproducible result if you had a bigger sample size and wrote a perfect paper. But that takes time (opportunity costs), and often takes money (financial costs). And there are benefits to getting papers out, both personal (another thing to add to your annual evaluation) and to the community (it puts new ideas out and generates leads for others).

In the short term, it can make sense to take on debt. But you will have to pay it back later.

The paper develops a preliminary list of the kinds of trade-offs that cause reproducibility debt. 

  • Data-centric issues (e.g., data storage)
  • Code-centric issues (e.g., code development)
  • Documentation issues (i.e., incomplete or unclear documentation)
  • Tools-centric issues (e.g., software infrastructure)
  • Versioning issues (e.g., code unavailable)
  • Human-centric issues (e.g., lack of funding)
  • Legal issues (e.g., intellectual property conflicts)

It’s very software-focused, so I don’t think the list is comprehensive. For example, in biology, reproducibility might become an issue because a species becomes rare or extinct.

If we have reproducibility debt, maybe we can also conceive of reproducibility bankruptcy: a point where the accumulated shortcuts add up to a complete inability to move forward on knowledge.

References

Hassan Z, Treude C, Norrish M, Williams G, Potanin A. 2024. Characterising reproducibility debt in scientific software: A systematic literature review. https://doi.org/10.2139/ssrn.4801433

Hassan Z, Treude C, Norrish M, Williams G, Potanin A. 2024. Reproducibility debt: Challenges and future pathways. Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering: 462-466. https://doi.org/10.1145/3663529.3663778

External links

To Be Reproducible or Not To Be Reproducible? That is so Not the Question
