C.M. Downey

PhD Student, Computational Linguistics, University of Washington

About Me

I am currently a PhD student in Computational Linguistics in the University of Washington Department of Linguistics, advised by Dr. Gina-Anne Levow. I also belong to the CLMBR lab group within UW NLP, headed by Shane Steinert-Threlkeld. Before coming to UW, I earned a B.S. in Linguistics and Russian at Tulane University, where I also completed extensive coursework in Computer Science, including algorithmic analysis, machine learning, and computational linguistics.

The main focus of my research is NLP for low-resource and endangered languages, as well as improving cross-lingual and multilingual NLP more generally. I work most often with North American Indigenous languages such as Lakota and Inuktitut. I also often work with highly inflecting Slavic and Uralic languages such as Russian, Hungarian, and Saami.

One specific topic of interest for me is the unsupervised segmentation of linguistic units. Many Indigenous and otherwise understudied languages pack complex structure into a single “word” (a string delimited by spaces on either side). This requires NLP methods that diverge significantly from models built for English and other European languages, in which single “words” tend to have much less internal structure.

My aim is to direct my research not just toward the acquisition of linguistic knowledge, but toward the more concrete task of Language Revitalization among groups who have experienced serious attrition of their native language due to colonialism and other global influences. Many of these communities have experienced outside pressure to give up their traditional language, and many of these languages now exist only in archives that are by and large inaccessible to Indigenous groups. I hope that NLP can help restore such languages to some level of vibrancy within Indigenous communities by making technology-based language learning easy and accessible for both students and teachers, even if the language in question has been dormant for generations.


  • University of Washington FLAS Fellowship
    September 2019 - June 2020

    Fellow: Supervised by Nadine Fabbi

  • Low Resource Languages for Emergent Incidents (LORELEI)
    August - October 2019

    Research Assistant: Supervised by Gina-Anne Levow

  • Newberry Research Library Summer Institute
    July - August 2019

    Fellow: Supervised by Jenny Davis


  • University of Washington Department of Linguistics
    September 2018 - Present

    Teaching/Research Assistant: Supervised by Richard Wright

  • Apple: Siri Web Answers
    June - September 2020

    AI|ML Intern: Supervised by Chris DuBois



  • A Masked Segmental Language Model for Natural Language Segmentation
    April 2021

    C.M. Downey, Fei Xia, Gina-Anne Levow, and Shane Steinert-Threlkeld

    arXiv | Code

Invited Talks

  • Segmental Language Modeling
    April 20 2021

    UW CLMBR Lab Group; Seattle, Washington

  • Archival Work and Language Revitalization at the NCAIS Summer Institute
    December 13 2019

    UW Linguistics Field Reports; Seattle, Washington

  • Dependency vs Phrase-Structure Trees
    October 4 2019

    UW Linguistics Syntax Roundtable; Seattle, Washington

  • Subword Segmentation for Morphologically Complex Languages
    September 28 2019

    UW NLP Retreat; Lake Chelan, Washington


  • LING 575: Deep Learning for Natural Language Processing
    Spring 2021

    Teaching Assistant: Supervised by Shane Steinert-Threlkeld

  • LING 566: Introduction to Syntax for Computational Linguistics
    Fall 2020

    Teaching Assistant: Supervised by Emily M. Bender

  • LING 200: Introduction to Linguistic Thought
    Fall 2020

    Teaching Assistant: Supervised by Richard Wright

  • ASL 305: Introduction to American Deaf Culture
    Spring 2019

    Grader: Supervised by Lance Forshay

  • LING 406: Introduction to Syntax
    Fall 2018

    Grader: Supervised by Kirby Conrod

Guest Lectures

  • Computational Linguistics and Language Revitalization
    May 4 2020

    as part of LING 234: Language and Diversity; University of Washington; Instructed by Lorna Rozelle

  • Computational Linguistics and Language Revitalization
    January 27 2020

    as part of ENGL 4717: NAIS Capstone Seminar; University of Colorado; Instructed by Penelope Kelsey


  • ACL Reviewer-Paper Assignment System
    December 2020 - February 2021

    Research Assistant/Developer: Supervised by Fei Xia