People
Claire Bowern
Professor of Linguistics; Director of Undergraduate Studies
What do you do with Data Science?
Most of my work in data science centers on the statistical properties of natural language and how that information can be used to make inferences about human history. This includes techniques for studying language evolution, such as identifying macro- and micro-changes across datasets. A second line of work concerns language identification and distinguishing enciphered language from non-language, specifically in relation to the Voynich manuscript; this work uses TF-IDF and semantic vectors. A third line, in speech, looks at the properties of speaker-based and language-based variation, and at automatic speech recognition (ASR) and forced alignment for low-resource languages. My research is particularly focused on low-resource and endangered languages, so I am interested in the parts of data science that don't rely on "big data" solutions. Finally, given my background in fieldwork, I have a strong interest in ethics and the ethical implications of AI and data science.