Date: Thursday, 5/16
Time: 3:30pm
Location: Room 1005
Committee:
- Dr. Jane Greenberg, chair
- Dr. Erjia Yan
- Dr. Xia Lin
- Dr. Gail Rosen, ECE
- Dr. Karthik Ram, UC Berkeley
Title: Reproducible Computational Research in Bioinformatics: A Study of Tools and Metadata to Bind the Analytic Stack
Abstract: Reproducible computational research (RCR) and the “reproducibility crisis” continue to attract attention in a number of scientific disciplines. In this proposal, I represent reproducibility in terms of cohesiveness in the “analytic stack” comprising raw input data, tools, workflows, analyses, and publications. I review the major existing types of case study: reproduction, replication, refactor, robustness test, survey, census, and case narrative. Of particular interest are refactors, in which an existing analysis with abstract methods is reimplemented by a third party. This proposal identifies three studies to be refactored, the state-of-the-art tools and standards to be applied, and the way in which external reviewers will evaluate these attempts. The refactoring process can be used to evaluate the limitations of reproducibility using conventional tools. From the refactors and the survey, I will identify persistent gaps in the “analytic stack” and describe features of metadata solutions that can be used to address these deficiencies.