Stieff Silver

Just a brief note.

I’ve been quiet lately because I’m neck-deep in a rather interesting research problem: speech-to-text translation for low-resource languages.

I’m up at the Human Language Technology Center of Excellence, in Baltimore, Maryland.

Here’s a summary of what we are researching, from hltcoe.jhu.edu/research/scale-workshops/:

Automated translation of human speech has been a long-term research goal; it was mentioned by President Clinton in his 2000 State of the Union address, and more recently, prototypes were announced by Skype and Google in January 2015. Spoken language differs significantly from the high-quality, grammatical inputs typically provided to Machine Translation (MT) systems; difficulties include pauses, disfluencies, corrections, interruptions, repetition, and restarts. Speech To Text Translation (STTT) thus suffers from problems intrinsic to speech recognition, difficulties inherent in translation, and problems that arise from the composition of the two.

This workshop will investigate methods to develop and improve an integrated STTT pipeline. More than one language pair will be studied to avoid language-specific solutions, and to explore the effect of differently resourced languages (i.e., languages with higher speech error rates). Multiple approaches to the problem will be considered, and candidate research topics include: domain adaptation for training MT systems on speech output; supplementing translation training data; leveraging ASR lattices; and evaluation of downstream metrics that model different usage scenarios. We expect these efforts to produce important baselines and a comprehensive quantitative and qualitative understanding of the problems of low-resource speech-to-text translation.

Wish me luck!