Abstract
How are historical texts selected for digitization in mass digital archives? It can be surprisingly challenging to know what is in the databases that now implicitly structure nearly all literary research. These resources, therefore, may operate as “black boxes,” complex equipment whose operations are mysterious to users. Examining the specific institutional histories of four literary databases, I explore how the scholarly practice of historical recovery might adapt to the site of the database. I measure and historicize uneven digitization as it might affect women writers of the late eighteenth century, to shed light on broader processes of textual selection. Specifically, I examine the English Short Title Catalogue (ESTC), Eighteenth Century Collections Online (ECCO), the Text Creation Partnership (TCP), and HathiTrust. I describe the development of these databases from the 1970s to the present. I then compare each database’s holdings of works published in England in 1789–99, manually identifying how many works are attributed to women. I find that these databases in fact marginalize “authorless” works, namely, those that are unsigned, attributed to unidentified pseudonyms, or written by corporate authors. To bridge literary and archival understandings of textual value, I present a definition of value as a principle of selection, arguing that a site of textual selection is also a site of textual valuation.