Computational literary study offers correctives to problems that literary scholarship was never confused about in the first place.
Quantitative methods are ascendant in literary studies, abetted by disproportionate funding, the absence of strict evaluative protocols, and a scarcity of knowledgeable and disinterested peer reviewers. It is time for the profession to take a closer look. Computational literary studies (CLS for short) — the most prominent strand of the digital humanities — applies computational methods to literary interpretation, from a single book to tens of thousands of texts. This usually entails feeding bodies of text into computer programs to yield quantitative results, which are then used to make arguments about literary form, style, content, or history. We are told, for instance, that digital analysis of 50,000 texts proves that there are "six or sometimes seven" basic literary plot types.
Not only has this branch of the digital humanities generated bad literary criticism, but it tends to lack quantitative rigor. Its findings are either banal or, if interesting, not statistically robust. The problem appears to be structural. In order to produce nuanced and sophisticated literary criticism, CLS must interpret statistical analysis against its true purpose; conversely, to stay true to the capacities of quantitative analysis, practitioners of CLS must treat literary data in vastly reductive ways, ignoring everything we know about interpretation, culture, and history. Literary objects are too few, and too complex, to respond interestingly to computational interpretation — not mathematically complex, but complex with respect to meaning, which is in turn activated by the quality of thought, experience, and writing that attends it.
Computational textual analysis itself is an imprecise science prone to errors. The degree to which this imprecision is acceptable depends on the size of your corpus and on the nature of your goals. In many sectors, though not in literary studies, machine-assisted textual analysis works. In such areas as social-media monitoring, biomedical research, legal discovery, and ad placement, unimaginably large sets of textual data are generated every second. Processing this data through computational and statistical tools is uncannily efficient at discovering useful and practical insights. CLS’s methods are similar to those used in professional sectors, but its practitioners can offer no plausible justification for the methods' imprecision and drastic reduction of argumentative...