Syntactic Parameters and a Coding Theory Perspective on Entropy and Complexity of Language Families
- Creators
- Marcolli, Matilde
Abstract
We present a simple computational approach to assigning a measure of complexity and information/entropy to families of natural languages, based on syntactic parameters and the theory of error-correcting codes. We associate to each language a binary string of syntactic parameters and to each language family a binary code whose code words are the binary strings associated to the languages in that family. We then evaluate the code parameters (rate and relative minimum distance) and their position with respect to the asymptotic bound of error-correcting codes and the Gilbert–Varshamov bound. These bounds are related, respectively, to the Kolmogorov complexity and the Shannon entropy of the code, which gives us a computationally simple way to obtain estimates of the complexity and information, not of individual languages but of language families. From the linguistic point of view, this notion of complexity is related to the degree of variability of syntactic parameters across languages belonging to the same (historical) family.
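The computation described in the abstract can be illustrated with a minimal sketch. It assumes the standard coding-theory definitions: rate R = log2(#C)/n, relative minimum distance δ = d/n with d the minimum Hamming distance, and the binary Gilbert–Varshamov curve R = 1 - H2(δ). The function names and the toy code words below are hypothetical and are not taken from the article or from any real syntactic-parameter data set.

```python
import math
from itertools import combinations


def code_parameters(codewords):
    """Return (rate, relative minimum distance) of a binary code.

    `codewords` is an iterable of equal-length strings over {'0', '1'},
    one string per language (hypothetical encoding, for illustration).
    """
    words = sorted(set(codewords))
    if len(words) < 2:
        raise ValueError("need at least two distinct code words")
    n = len(words[0])
    if any(len(w) != n for w in words):
        raise ValueError("all code words must have the same length")
    # Minimum Hamming distance over all pairs of distinct code words.
    d = min(sum(a != b for a, b in zip(u, v))
            for u, v in combinations(words, 2))
    rate = math.log2(len(words)) / n
    return rate, d / n


def binary_entropy(x):
    """Binary Shannon entropy H2(x) in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def gv_rate(delta):
    """Binary Gilbert-Varshamov curve R = 1 - H2(delta) for delta in [0, 1/2]."""
    if delta >= 0.5:
        return 0.0
    return 1.0 - binary_entropy(delta)


# Toy "language family": four made-up parameter strings of length 6.
toy_family = ["000000", "000011", "001100", "110000"]
R, delta = code_parameters(toy_family)
above_gv = R > gv_rate(delta)
```

Comparing `R` with `gv_rate(delta)` shows on which side of the Gilbert–Varshamov curve the code point falls. The asymptotic bound mentioned in the abstract has no comparable closed form, so it is not part of this sketch.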
Additional Information
© 2016 by the author; licensee MDPI, Basel, Switzerland. This is an open access article distributed under the Creative Commons Attribution License (CC BY), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Received: 14 January 2016; Accepted: 18 March 2016; Published: 7 April 2016. Academic Editors: Frédéric Barbaresco, Frank Nielsen and Kevin H. Knuth. This article belongs to the Special Issue Differential Geometrical Theory of Statistics.
The author's research is supported by NSF grants DMS-1201512 and PHY-1205440, and by the Perimeter Institute for Theoretical Physics. The author thanks the referees for their useful comments. The author declares no conflict of interest.
Attached Files
Published - entropy-18-00110.pdf
Files
Name | Size |
---|---|
md5:76680676f036c1c47db8eddd9b1ec5d8 | 297.7 kB |
Additional details
- Eprint ID
- 66344
- Resolver ID
- CaltechAUTHORS:20160420-165543640
- NSF
- DMS-1201512
- NSF
- PHY-1205440
- Perimeter Institute for Theoretical Physics
- Created
- 2016-04-21 (created from EPrint's datestamp field)
- Updated
- 2021-11-10 (created from EPrint's last_modified field)