Abstract

This project investigates the efficacy and limitations of providing linguistic and morphosyntactic annotations to multilingual texts. I use the French, Latin and English versions of Jean Bodin’s Six Bookes of the Commonweale as a case study, despite the fact that the different language editions are as much different versions of the original text as they are translations. Employing a range of natural language processing tools, I match each sentence with a sentence from the other languages and then add annotations to each word in that sentence. Word-level alignment is attempted, and the hurdles for this task, both on the Bodin corpus and at large, are explained. Each word is also paired with a section in a reference grammar, which elucidates the annotations attached to it. While common for some well-resourced languages, these morphosyntactic annotations are just beginning to be explored for under-resourced languages like Latin, and so this project also seeks to explore the differences between these two groups and how they manifest themselves in practice.

Details

Title
The Bodin Corpus: A Multilingual Parallel Text Case-Study
Author
Nadel, Peter
Publication year
2022
Publisher
ProQuest Dissertations & Theses
ISBN
9798352912393
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
2731716393
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.