Jarrod Millman

Verifiable, reproducible research and computational science

A mini-symposium at the SIAM Conference on Computational Science & Engineering in Reno, NV on March 4, 2011.

Summary

As research grows increasingly dependent on computing, it becomes critical for our computational resources to be developed with the same rigor, open review and access as the results they support. In this minisymposium, we will discuss:

  • sharing of scientific software, data and knowledge necessary for reproducible research
  • unrestricted access to research outcomes and educational tools
  • open source software developed by collaborative, meritocratic communities
  • openly tested, validated and documented software as the basis for reliable scientific outcomes
  • high standards of computational literacy in the education of mathematicians, scientists and engineers

MS148 - Part I of II

The Challenge of Reproducible Research in the Computer Age (slides)

Computing is increasingly central to the practice of mathematical and scientific research. This has provided many new opportunities as well as new challenges. In particular, modern scientific computing has strained the ability of researchers to reproduce their own (as well as their colleagues’) work. In this talk, I will outline some of the obstacles to reproducible research as well as some potential solutions and opportunities.

Reproducible Research, Lessons from the Madagascar Project (slides)

Sergey Fomel, University of Texas at Austin, USA

The Madagascar open-source project is a community effort, which implements reproducible research practices, as envisioned by Jon Claerbout. More than 100 geophysical papers have been published, together with open software code and data, and are maintained by the community. We have learned that continuous maintenance and repeated testing are necessary for enabling long-term reproducibility. As noted by Claerbout and others, the main beneficiary of the reproducible research discipline is the author.

Top 10 Reasons to NOT Share your Code and Why you Should Anyway (slides)

Randall J. LeVeque, University of Washington, USA

The research codes used to produce results (tables, plots, etc.) in publications are rarely made available, limiting the readers’ ability to understand the algorithms that are actually implemented. Many objections are typically raised to doing so. Although there are some valid concerns, my view is that there are good counter-arguments or ways to address most of these issues. In this talk I will discuss what may be the top 10 reasons.

Intellectual Contributions to Digitized Science: Implementing the Scientific Method (slides)

Victoria Stodden, Columbia University, USA

Our stock of scientific knowledge is now accumulating in digital form, and the underlying reasoning is often in the code that generated the findings, which is often never published. The case for open data is being made but open code must be recognized as equally important in a principled approach, that of reproducibility of computational results. Issues involved with code and data disclosure are presented, along with possible solutions.

MS155 - Part II of II

Reproducible Research: Lessons from the Open Source World (slides)

Fernando Pérez, University of California, Berkeley, USA

Why are the practices of open source software development often more consistent with our ideas of openness and reproducibility in science than science itself? Today’s scientific praxis falls short of our ideals of reproducibility, and these problems are particularly acute in computational domains where they should be less prevalent. I will explore the reasons for this and will draw some ideas from software development that provide technical means to address some of these issues.

FEMhub, a Free Distribution of Open Source Finite Element Codes (slides)

Pavel Solin, Ondrej Certik, Aayush Poudel, and Sameer Regmi, University of Nevada, Reno, USA; Mateusz Paprock, Technical University of Wroclaw, Poland

FEMhub is an open source distribution of finite element codes with a unified Python interface. The goal of the project is to reduce heterogeneity in installation and usage of open source finite element codes, facilitate their interoperability and comparisons, and improve reproducibility of results. FEMhub is available for download as desktop application, but all codes are also automatically available in the Online Numerical Methods Laboratory.

Reproducible Models and Reliable Simulations: Current Trends in Computational Neuroscience (slides)

Hans E. Plesser, Norwegian University of Life Sciences, Norway; Sharon M. Crook, Arizona State University, USA; Andrew P. Davison, CNRS, France

Computational neuroscientists simulate models of neuronal networks to further our understanding of brain dynamics. Unfortunately, the validity of models of neuronal dynamics and of the simulation software implementing the models is difficult to ascertain, challenging the validity of computational neuroscience. I will describe how the computational neuroscience community is addressing validity through software reviews, best practices, increasing use of established software packages, meta-simulators, systematic testing, and simulator-independent model-specification languages.

Publishing Reproducible Results with VisTrails (slides)

Juliana Freire and Claudio Silva, University of Utah, USA

VisTrails is an open-source provenance management and scientific workflow system designed to support scientific discovery. It combines and substantially extends useful features of visualization and scientific workflow systems. Similar to visualization systems, VisTrails makes advanced scientific visualization techniques available to users allowing them to explore and compare different visual representations of their data; and similar to scientific workflow systems, VisTrails enables the composition of workflows that combine specialized libraries, distributed computing infrastructure, and Web services.