Methodicate: reproducible data science

IT Innovation Challenges: Staff project based in the Ludwig Institute for Cancer Research.

Spring 2016 round

For more information, please contact innovations@it.ox.ac.uk

ABSTRACT

Reproducibility is a fundamental requirement for any research, and reproducible science starts with good record keeping. Traditional paper notebooks are increasingly giving way to sophisticated computer programs that record the details of an experiment. But the nature of experiments is changing too.

In disciplines from physics and biology to the social sciences and the humanities, an increasing number of researchers are doing Data Science: analysing large volumes of data using computer programs. To date, no comprehensive, user-friendly systems exist for keeping track of this kind of research. The recommended approach currently involves tools from the computer science world that have a steep learning curve and require the user to remember to manually record their changes every time they produce a new result. This is a prohibitive hurdle for most researchers, which is why research reproducibility has become such a serious problem: it is hard to keep track of the steps needed to reproduce a significant result.

Methodicate is an online logbook designed for today’s data scientists that combines powerful existing version control systems with an intuitive user interface. Methodicate automatically keeps track of new results as they are produced and records them together with the data and computer programs that produced them. It allows researchers to easily share results and reports with supervisors and collaborators. Crucially, Methodicate does not interrupt the user’s workflow, but integrates seamlessly with his or her preferred tools. By taking care of the bookkeeping, Methodicate frees data scientists to focus their efforts on efficient, reproducible science.
