Abstract

Reproducibility has long been a cornerstone of science, and is now becoming a key research area for e-Science. This is because it provides a way to validate, and build on, previous results. Underpinning reproducibility in e-Science is provenance, which has the potential to provide scientists with a complete understanding of data generated in e-experiments, including the services that produced and consumed it. This thesis explores the issues in exploiting provenance for reproducibility. Based on this, a reproducibility framework is designed and implemented to allow past experiments to be reproduced. Seven aspects of reproducibility are considered: 1) experiments, 2) reproducibility, 3) provenance, 4) provenance models, 5) provenance and versioning, 6) automatic transformation of provenance to support reproduction, and 7) a reproducibility taxonomy. A key to reproducibility is the provenance model: a data model that structures information about an e-experiment. A review of existing provenance systems shows that the problem caused by services being updated has been neglected. This can have a severe impact on the ability to reproduce experiments, and it is therefore argued that the issue of service versioning must be addressed. Even after information on the provenance of an execution, and the versioning of services, is captured, there is a need for a method to transform this knowledge into a form that allows past experiments to be reproduced; providing such a method is another output of this thesis. The thesis focuses on the use of workflows as a means to represent the composition of experiments and to execute them. This work explores how workflows can be automatically generated to re-execute past experiments. In order to do this, a transformation algorithm is described that maps a past experiment's execution log data into a workflow format that can be read and processed by the workflow system.
The thesis also introduces a Reproducibility Taxonomy that captures and structures the information required for reproducibility in the presence of versions and provenance.

Details

Title
The exploitation of provenance and versioning in the reproduction of e-experiments
Author
Abang Ibrahim, Dayang Hanani
Year
2016
Publisher
ProQuest Dissertations & Theses
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
1913432747
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.