Learning long-distance phonotactics as tier-based strictly local languages
Collaborator: Kevin McMullin
We hypothesize that all phonotactic patterns can be expressed as a conjunction of tier-based strictly 2-local (TSL-2) languages. These languages resemble ones whose grammar is a set of (il)legal segmental bigrams, but they also include grammaticality statements for bigrams defined over the subsequence of a word that consists only of segments belonging to a specific set (a "tier"), such as a tier of all and only the vowels. To formalize and test this idea, we implement a learning algorithm for TSL-2 languages that makes minimal assumptions about its input, e.g. no phonological features and no pre-specified tiers. Our empirical focus is on consonant and vowel harmony patterns, but local patterns are, by hypothesis, also expressible as TSL-2 languages. We also extend TSL-2 languages to a probabilistic framework better suited to real-world language data, which lets us apply learning techniques from machine learning.
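As a rough illustration of what a conjunction of TSL-2 grammars checks, here is a minimal Python sketch of an acceptor. It is not the learning algorithm from this project, and it hard-codes the tiers that the learner is meant to discover. All names (TierGrammar, project, accepts, accepts_all), the "#" boundary symbol, and the toy segment inventory are assumptions made for the example.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Tuple

BOUNDARY = "#"  # hypothetical word-edge symbol, assumed not to be a segment


class TierGrammar(NamedTuple):
    """One TSL-2 grammar: a tier plus the bigrams banned on that tier."""
    tier: FrozenSet[str]
    banned: FrozenSet[Tuple[str, str]]


def project(word: Iterable[str], tier: FrozenSet[str]) -> List[str]:
    """Keep only the segments that belong to the tier, in their original order."""
    return [seg for seg in word if seg in tier]


def accepts(word: Iterable[str], grammar: TierGrammar) -> bool:
    """True if no banned bigram occurs in the word's tier projection."""
    projected = [BOUNDARY] + project(word, grammar.tier) + [BOUNDARY]
    return all(
        (left, right) not in grammar.banned
        for left, right in zip(projected, projected[1:])
    )


def accepts_all(word: Iterable[str], grammars: Iterable[TierGrammar]) -> bool:
    """Conjunction: the word must satisfy every tier grammar."""
    return all(accepts(word, g) for g in grammars)


# Toy inventory; "S" is an ASCII stand-in for a postalveolar sibilant.
ALPHABET = frozenset("sSaitnp")

# Long-distance pattern: sibilants must agree, however far apart they are.
sibilant_harmony = TierGrammar(
    tier=frozenset("sS"),
    banned=frozenset({("s", "S"), ("S", "s")}),
)

# Local pattern: with every segment on the tier, TSL-2 reduces to
# ordinary bigram restrictions over adjacent segments.
no_np = TierGrammar(tier=ALPHABET, banned=frozenset({("n", "p")}))

grammars = [sibilant_harmony, no_np]

print(accepts_all("satasa", grammars))  # sibilants agree, no "np" sequence
print(accepts_all("sataSa", grammars))  # "s ... S" is adjacent on the sibilant tier
print(accepts_all("anpa", grammars))    # contains the banned local bigram "np"
```

The second grammar shows why local patterns fall out as a special case: putting the whole inventory on the tier makes the projection identical to the word itself.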
My role: learning algorithm design, all coding (Python)
Papers and presentations
McMullin, Kevin and Allen, Blake. Phonotactic learning and the conjunction of Tier-based Strictly Local languages. Paper presentation. LSA 2015 Annual Meeting. Portland, Oregon: January 10, 2015.
For other references, please see Kevin McMullin's website.