TR-2016-02

Working with CHAT transcripts in Python

Jackson L. Lee; Ross Burkholder; Gallagher B. Flinn; Emily R. Coppess. 29 January, 2016.
Communicated by John Goldsmith.

Abstract

This report introduces PyLangAcq, a Python library for working with CHAT transcription data. The library interfaces with speech data transcribed in the CHAT format, which the CHILDES database uses for child language development research. Because it is built on Python, PyLangAcq has direct access to the many computational and statistical tools in the Python ecosystem that are relevant to language acquisition research. As the CHAT format is also used by other speech transcription databases, PyLangAcq should also be useful to researchers in related fields such as conversation analysis, corpus linguistics, and clinical linguistics.
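
To give a sense of the kind of data involved, the following is a minimal sketch of reading CHAT-style lines using only the Python standard library. It is not PyLangAcq's API; the function names (`join_continuations`, `parse_chat`) and the sample transcript are hypothetical and invented for illustration. It relies only on the basic CHAT line conventions: header lines begin with `@`, main speaker tiers with `*`, dependent tiers with `%`, and continuation lines with a tab.

```python
from collections import Counter

# Invented sample transcript, for illustration only (not from CHILDES).
SAMPLE = (
    "@Begin\n"
    "@Participants:\tCHI Target_Child, MOT Mother\n"
    "*MOT:\twhat is that ?\n"
    "%mor:\tpro:int|what cop|be&3S pro:dem|that ?\n"
    "*CHI:\tdoggie .\n"
    "*CHI:\tbig doggie\n"
    "\trun .\n"
    "@End\n"
)


def join_continuations(text):
    """Merge tab-initial continuation lines into the line they continue."""
    merged = []
    for line in text.splitlines():
        if line.startswith("\t") and merged:
            merged[-1] += " " + line.strip()
        elif line.strip():
            merged.append(line)
    return merged


def parse_chat(text):
    """Split CHAT-style text into header lines and utterance records."""
    headers, utterances = [], []
    for line in join_continuations(text):
        if line.startswith("@"):
            headers.append(line)
        elif line.startswith("*"):
            # Main tier: "*SPEAKER:<tab>utterance"
            speaker, _, words = line[1:].partition(":")
            utterances.append(
                {"speaker": speaker, "text": words.strip(), "tiers": {}}
            )
        elif line.startswith("%") and utterances:
            # Dependent tier: attach to the most recent utterance.
            name, _, content = line[1:].partition(":")
            utterances[-1]["tiers"][name] = content.strip()
    return headers, utterances


headers, utterances = parse_chat(SAMPLE)
utterance_counts = Counter(u["speaker"] for u in utterances)
```

Once the utterances are ordinary Python dictionaries, standard tools such as `collections.Counter` apply directly, which is the general benefit the abstract attributes to working in Python. A real transcript has many more conventions than this sketch handles, which is the motivation for a dedicated library.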

Original Document

The original document is available as a PDF (uploaded 29 January, 2016 by John Goldsmith).