Heredity (2011) 106, 709
© 2011 Macmillan Publishers Limited. All rights reserved 0018-067X/11
http://www.nature.com/hdy
EDITORIAL
Data archiving
Heredity (2011) 106, 709; doi:10.1038/hdy.2010.43
Data are the building blocks of science, the basic observations around which we construct our theories. When a paper is published, we may doubt its interpretations, but a fundamental principle of the scientific method is that we should be able to re-examine the data and form our own opinion of their meaning. We should be able to extend existing data sets in the hope of making more powerful tests of our ideas. We might like to gather the data from many studies in order to test hypotheses or search for patterns. All of these rely on the assumption that the data underlying scientific publications are freely available, and remain available for a long time after the original project is completed.
In practice, data are not always easily accessible. A striking exception is DNA sequence data. Since the early days of sequencing, the whole international community has recognized the value of depositing sequence data in publicly available databases. The result is a fantastic resource, which is growing exponentially both in size and in value. One of the reasons for the success of EMBL/GenBank/DDBJ has been the near-universal insistence by journals that deposition of sequence data is a condition of publication. This principle has been extended to some other data types in genetics, especially microarray data, but until now has not been applied...