
Abstract

Making data open and reusable is a central but difficult goal of open data initiatives. Data reuse is often regarded as an important, value-realizing stage of the research data life cycle, and reusability is a core element of the FAIR data principles, a widely adopted framework for data management and stewardship. However, it is essential to understand researchers’ actual data reuse practices in order to promote effective data sharing and reuse across research communities. Although prior studies have examined these practices in fields with well-established data sharing cultures, including astronomy, earth and environmental sciences, and parts of the social sciences, many other disciplines remain unexplored. There is a growing need for research that explores data reuse practices more systematically and in greater depth across a broader range of fields, especially those that also have longstanding traditions of data sharing and reuse.

This dissertation examines the landscape of data reuse practices within the Information Retrieval (IR) research community. IR has a long history of reusing shared data and providing system-based experimental data for reuse, but its data practices have never been systematically studied. It also presents an interesting case because it is a research domain that involves researchers trained in a wide range of disciplines. Drawing on two rounds of semi-structured interviews with 36 participants, this study explores the purposes of data reuse, the ways researchers discover and access data, the incentives and disincentives influencing sharing and reuse, the decision-making processes involved, and the broader practices surrounding data reuse. It identifies three primary purposes for which IR researchers reuse data: exploratory, verificatory, and preparatory. In doing so, it broadens the typological framework for understanding why researchers reuse others' data. Regarding data discovery and access, the study finds that IR researchers primarily operate at the individual level, relying on heterogeneous practices shaped by their research areas, institutional affiliations, and disciplinary backgrounds. It further demonstrates that data reuse decisions are not made in a single step but instead unfold through a multi-stage process. Five key stages of decision-making are identified: methodology appropriateness evaluation, trustworthiness evaluation, reusability screening, further reusability evaluation, and compliance evaluation. Moreover, this study highlights how disciplinary context influences researchers’ approaches to data reuse. Through this analysis, it contributes to the literature on data sharing and reuse by demonstrating that data reuse behaviors are shaped not only by individual preferences and risk assessments, but also by collective consensus, incentives, and the epistemic norms of researchers’ communities.

This work aims to encourage scholars, practitioners, and infrastructure designers to engage with efforts to foster a sustainable culture of data reuse. Continued research in this area is critical for developing the protocols, standards, and knowledge infrastructures necessary to support seamless and meaningful data sharing and reuse, not only within IR but across diverse research communities. The dissertation ends by offering four directions for future research: studying how community-level factors influence data reuse practices at the individual level; examining how researchers search for and discover data; expanding the focus beyond research data to consider other shareable resources such as code, experimental designs, and AI models; and exploring how AI technologies are changing data practices.

Details

1010268
Title
The Landscape of Data Reuse in Information Retrieval: Purposes, Practices, and Decisions
Number of pages
265
Publication year
2025
Degree date
2025
School code
0031
Source
DAI-A 87/6(E), Dissertation Abstracts International
ISBN
9798270213084
Committee member
Borgman, Christine L.; Leazer, Gregory H.; Furner, Jonathan; Milojević, Staša
University/institution
University of California, Los Angeles
Department
Information Studies 045A
University location
United States -- California
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32399318
ProQuest document ID
3282895942
Document URL
https://www.proquest.com/dissertations-theses/landscape-data-reuse-information-retrieval/docview/3282895942/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic