TR-2014-15

Combining successor and predecessor frequencies to model truncation in Brazilian Portuguese

Mike Pham; Jackson L. Lee. 27 October, 2014.
Communicated by John Goldsmith.

Abstract

Brazilian Portuguese exhibits word truncation, e.g. vagabunda "slut" > vagaba, where the theme vowel -a is added to the truncated stem vagab. Goncalves 2011 claims that truncated words preserve the onset of the rightmost syllable of the first binary foot. Alternatively, Scher 2012 proposes a Distributed Morphology account involving reanalysis of internal morphological structure without actual truncation: cerveja is reanalyzed as CERV-ej-a, with the new root CERV serving as the truncated stem from which cerva is derived. We argue instead that derivation of the truncated stem is better modeled by successor frequencies and predecessor frequencies (SF and PF, respectively; Harris 1955, Hafer and Weiss 1974), which together optimize phonological truncation and recovery of the original word. More specifically, a model incorporating both SF and PF outperforms models that use only SF or only PF, as well as a binary foot model, in predicting truncated stems in Brazilian Portuguese. The best SF-PF trade-off point can be viewed as the best morpheme boundary of a given word, which can in turn serve as the basis of a potential morpheme segmentation model: a fully unsupervised strategy that does not assume a priori (i) directionality of affixation or (ii) consistency among morphemes.
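To make the two quantities concrete, here is a minimal Python sketch of SF and PF in the classical sense: the number of distinct segments that can follow a word-initial string, or precede a word-final string, in a lexicon. It rests on several assumptions. The four-word lexicon is a toy built from the examples above, letters stand in for phonological segments, and the function names are hypothetical. The abstract does not say how the report defines or combines SF and PF to find the trade-off point, so the sketch only tabulates both values at each possible cut.

```python
from collections import defaultdict


def successor_frequencies(lexicon):
    """Map each proper word-initial string to its count of distinct following letters."""
    followers = defaultdict(set)
    for word in lexicon:
        for i in range(1, len(word)):
            followers[word[:i]].add(word[i])
    return {prefix: len(letters) for prefix, letters in followers.items()}


def predecessor_frequencies(lexicon):
    """Map each proper word-final string to its count of distinct preceding letters."""
    preceders = defaultdict(set)
    for word in lexicon:
        for i in range(1, len(word)):
            preceders[word[i:]].add(word[i - 1])
    return {suffix: len(letters) for suffix, letters in preceders.items()}


def cut_profile(word, sf, pf):
    """List (left, right, SF of left, PF of right) for every internal cut of word."""
    return [
        (word[:i], word[i:], sf.get(word[:i], 0), pf.get(word[i:], 0))
        for i in range(1, len(word))
    ]


# Toy lexicon (an assumption for illustration, not the report's data).
TOY_LEXICON = ["cerveja", "cerva", "vagabunda", "vagaba"]

sf = successor_frequencies(TOY_LEXICON)
pf = predecessor_frequencies(TOY_LEXICON)

if __name__ == "__main__":
    for left, right, s, p in cut_profile("cerveja", sf, pf):
        print(f"{left}|{right}  SF={s}  PF={p}")
```

In this toy lexicon SF rises at cerv and vagab, the truncated stems cited above, because those are the only points where the listed words diverge. A realistic lexicon would give much noisier profiles, which is presumably where combining SF with PF becomes useful.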

Original Document

The original document is available in PDF (uploaded 28 October, 2014 by John Goldsmith).