Claire Bowern is a historical linguist whose research centers on language change and language documentation in Indigenous Australia. While her work touches many areas, the overarching question is how to characterize the nature of language change. Language change involves a complex interplay between universal properties of language acquisition and production and community-specific social factors; her research program looks at how to study this interplay so that we understand both the micro-level changes in progress and the macro-level change that leads to language families. She works with speakers of endangered languages and with archival sound and print materials, and she uses computational and phylogenetic methods. She is currently the editor of the journal Diachronica and a member of the GSAS executive committee. She has secondary appointments in Anthropology and Cognitive Science and is a member of the Executive Committee for Women, Gender, and Sexuality Studies.
What do you do with data science?
Most of my work in data science centers on the statistical properties of natural language and how such information can be used to make inferences about human history. This includes techniques for studying language evolution, such as identifying macro- and micro-changes across datasets. A second line of work concerns language identification and separating enciphered language from non-language (specifically in relation to the Voynich manuscript), using TF-IDF and semantic vectors. My work in speech looks at the properties of speaker-based and language-based variation, and at ASR and forced alignment for low-resource languages. My research is particularly focused on low-resource and endangered languages, so I am interested in the parts of data science that don’t use “big data” solutions. Finally, given my background in fieldwork, I have a strong interest in ethics and the ethical implications of AI and data science.