1. Introduction
The availability of digitised or born-digital text sources provides new opportunities for both humanities scholars and computer scientists: humanities scholars can access an enormous volume of digitised texts, and computer scientists are developing innovative solutions to help with their analysis.
This shift from reading a single printed book to browsing many digital texts is at the heart of the digital humanities (DH) domain, which helps to develop solutions for handling vast amounts of data. New research questions arise from work in DH, since computational methods can support scholars in posing innovative questions in the humanities.
Emerging in the late 1980s, the DH discipline initially focused on designing standards for cultural heritage data, such as the Text Encoding Initiative (TEI) [1], and on aggregating, digitising and delivering data. Visualisation techniques were subsequently integrated to support better analysis of the data provided (Saito et al., 2010). While the close reading of texts was developed in the middle of the twentieth century as a method of literary criticism (Gold, 2012), distant reading is a relatively new approach, introduced by Franco Moretti at the beginning of the twenty-first century (Moretti, 2005). More recently, the research trend has been to combine close and distant reading approaches (Beals, 2014; Bradley and Meister, 2012; Coles and Lein, 2013; Shneiderman, 1996). The rather specific field of textbook studies followed a similar development, complementing qualitative analysis with quantitative methods (Lachmann and Mitchell, 2014). Meanwhile, Svensson differentiates between minimalist reading, the “application of computer technology to traditional scholarly work”, and maximalist reading, “changing the substance of humanistic matter” (Svensson, 2016).
A recent paper (Joo et al., 2022) explores the DH research agenda, outlining the knowledge structure and research trends of the domain over the past decade. Data mining in DH usually involves extracting information from a body of texts and/or their metadata to answer research questions, which may be quantitative or qualitative, and to detect patterns across large text collections or corpora. While text analysis is part of qualitative research, the algorithms used by the tools apply quantitative methods as well as search/match procedures to identify the elements and features in any text.
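To make the distinction between quantitative methods and search/match procedures concrete, the following minimal Python sketch may help. It is not taken from any of the tools or studies cited here: the two-document corpus, the function names (tokenize, term_frequencies, keyword_in_context) and the choice of the keyword "nation" are all assumptions made purely for illustration. The first function counts token occurrences (a quantitative, distant-reading style measure); the second retrieves each match together with its surrounding words, which is closer to what a scholar would inspect in a close reading.

```python
import re
from collections import Counter

# Hypothetical two-document corpus, invented purely for illustration.
corpus = {
    "doc_a": "The textbook describes the nation and the history of the nation.",
    "doc_b": "A history textbook may frame the nation differently.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(text):
    """Quantitative step: count how often each token occurs."""
    return Counter(tokenize(text))


def keyword_in_context(text, keyword, window=2):
    """Search/match step: return each hit with up to `window` tokens on either side."""
    tokens = tokenize(text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append(" ".join(left + [tok] + right))
    return hits


# Frequency of one term per document (a simple distant-reading style measure).
counts = {name: term_frequencies(text)["nation"] for name, text in corpus.items()}

# Matches with context for one document (material for closer inspection).
contexts = keyword_in_context(corpus["doc_a"], "nation")
```

In this sketch, counts summarises how often the term appears in each document, while contexts lists the passages in which it occurs. A real analysis would typically add steps omitted here, such as lemmatisation, stop-word handling and the use of document metadata.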
This article explores how quantitative and qualitative approaches of text analysis...





