Abstract

We present a simple computational approach to assigning a measure of complexity and information/entropy to families of natural languages, based on syntactic parameters and the theory of error-correcting codes. To each language we associate a binary string of syntactic parameters, and to a language family a binary code, whose code words are the binary strings associated to the individual languages. We then evaluate the code parameters (rate and relative minimum distance) and the position of these parameters with respect to the asymptotic bound of error-correcting codes and the Gilbert-Varshamov bound. These bounds are related, respectively, to the Kolmogorov complexity and the Shannon entropy of the code, and this gives us a computationally simple way to obtain estimates of the complexity and information, not of individual languages, but of language families. From the linguistic point of view, this notion of complexity is related to the degree of variability of syntactic parameters across languages belonging to the same (historical) family.
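As a concrete illustration of the construction described above, here is a minimal sketch in Python, assuming each language is represented by a fixed-length binary string of syntactic parameter values (the four strings below are placeholder data, not parameter values of actual languages). For a binary code C of block length n, it computes the rate R = log2(#C)/n, the relative minimum distance delta = d/n (with d the minimum Hamming distance between distinct code words), and the Gilbert-Varshamov curve R = 1 - H2(delta), then reports on which side of that curve the pair (delta, R) falls.

from itertools import combinations
from math import log2

def binary_entropy(p):
    # Shannon binary entropy H2(p), with H2(0) = H2(1) = 0.
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def code_parameters(codewords):
    # Rate R = log2(#C)/n and relative minimum distance delta = d/n
    # for an (unstructured) binary code given as equal-length strings.
    n = len(next(iter(codewords)))
    d = min(sum(a != b for a, b in zip(u, v))
            for u, v in combinations(codewords, 2))
    return log2(len(codewords)) / n, d / n

# Hypothetical syntactic-parameter vectors for a toy "family" of
# four languages (placeholder data, not drawn from a real database).
family = {"10110", "10010", "11110", "00111"}

R, delta = code_parameters(family)
gv = 1 - binary_entropy(delta)  # Gilbert-Varshamov curve at delta
print(f"R = {R:.3f}, delta = {delta:.3f}, GV value = {gv:.3f}")
print("above GV curve" if R > gv else "on or below GV curve")

Applying this to a real family would mean replacing the placeholder set with binary vectors read off from a syntactic parameter database. Note that the comparison with the asymptotic bound itself is less direct: that curve has no known explicit expression, and computable estimates such as the Gilbert-Varshamov lower bound stand in for it.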

Details

Title
Syntactic Parameters and a Coding Theory Perspective on Entropy and Complexity of Language Families
Author
Marcolli, Matilde
Pages
110
Publication year
2016
Publication date
2016
Publisher
MDPI AG
e-ISSN
1099-4300
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
1780818392
Copyright
Copyright MDPI AG 2016