Abstract
This project investigates the efficacy and limitations of adding linguistic and morphosyntactic annotations to multilingual texts. I use the French, Latin and English versions of Jean Bodin’s Six Bookes of the Commonweale as a case study, even though the different language editions are as much distinct versions of the original text as they are translations. Employing a range of natural language processing tools, I match each sentence with a corresponding sentence from the other languages and then add annotations to each word in that sentence. Word-level alignment is attempted, and the hurdles this task faces, both on the Bodin corpus and in general, are explained. Each word is also paired with a section in a reference grammar that elucidates the annotations attached to it. While common for some well-resourced languages, these morphosyntactic annotations are just beginning to be explored for under-resourced languages like Latin, and so this project also seeks to explore the differences between these two groups and how they manifest themselves in practice.