Translating One Word at a Time
The Transonics Spoken Dialog Translator turns a doctor’s spoken English questions into spoken Persian and translates patients’ spoken Persian replies into English.
Shrikanth Narayanan led the USC Viterbi School of Engineering group that developed Transonics. A member of the team recently presented a report on the system at the Association for Computational Linguistics conference in Ann Arbor, Mich.
“Fluent two-way machine voice translation is one of the holy grails of engineering,” said Narayanan, an associate professor of electrical engineering, computer science and linguistics who directs the Signal Analysis and Interpretation Laboratory in the Integrated Media Systems Center.
“We are years away from perfecting it, but we think the choices we have made about how to go about creating such a system are working. We hope to have something that will be useful in emergency rooms or ambulances within two years or so.”
The existing system, funded by two grants from the Defense Advanced Research Projects Agency totaling $3.8 million, is a result of research in information technology, supplemented by observation of patient-doctor dynamics in numerous bilingual interaction sessions staged for the project.
“Two-way voice translation involves combining at least three highly imperfect existing disciplines, with the errors multiplying at every stage,” Narayanan said.
The disciplines are:
• Text translation: Taking a written text in one language and translating it into another. Machine translation systems developed by researchers Kevin Knight and Daniel Marcu in the Viterbi School’s Information Sciences Institute consistently rank among the world’s best – but still make frequent grammatical and other errors. Marcu and Knight developed a specialized system specifically for use in Transonics.
• Spoken-word recognition: Narayanan’s specialty. Being able to reliably recognize a large number of different single words, in a variety of regional or foreign accents, is a difficult problem. Recognizing a wide variety of words informally spoken in a noisy, chaotic environment (emergency room, ambulance) adds another level of difficulty.
• Extra-verbal communication: Humans speak with words and intonations. A rising tone at the end of a sentence to express a question is difficult for a machine to assess. Nonsense syllables (um, uh, ah, er), catchphrases (you know, like) and exclamations (Wow! Hey!) in utterances are easy for humans to decode or ignore, but major stumbling blocks for machines. The insights of David Traum of the USC Institute for Creative Technologies in dialog management are aiding in this area by narrowing the range of possibilities and bringing context and previous exchanges into the computer’s decision-making.
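Narayanan’s point about errors multiplying can be made concrete with a rough sketch. The per-stage accuracies below are invented for illustration and are not measurements from Transonics, and the calculation assumes each stage fails independently of the others, which real systems only approximate.

```python
from math import prod


def end_to_end_accuracy(stage_accuracies):
    """Chance that every stage succeeds, assuming independent errors."""
    return prod(stage_accuracies)


# Hypothetical per-stage accuracies (illustrative only, not Transonics data).
stages = {
    "speech recognition": 0.85,
    "text translation": 0.80,
    "dialog handling": 0.90,
}

overall = end_to_end_accuracy(stages.values())
print(f"{overall:.3f}")  # prints 0.612
```

Under these assumed figures, three stages that each look respectable on their own deliver a correct result only about 61 percent of the time when chained together.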
Teaching computers to detect human emotions in speech is a major focus of researchers at the USC Signal Analysis and Interpretation Laboratory under the direction of Narayanan and his colleague, USC research assistant professor Panos Georgiou.
“We can take advantage of using essentially pre-fabricated sentences in many cases by trying to understand and paraphrase what is being communicated instead of doing exact word-for-word translation,” Narayanan said.
The system also uses the human ability to read text as a bridge over some of the worst problems of speech recognition and machine translation, by letting users choose among alternative candidate messages displayed on screen.
The Transonics interface runs on a laptop computer using the Linux operating system. Doctor and patient both wear headphones with attached microphones. A small keypad connected to the computer speeds and simplifies certain routine commands – switching from doctor to patient mode, for example.
When a doctor asks a question, the speech recognition software captures it – but hedges its bets by displaying its best guess about what was said plus a range of options.
When the doctor chooses the most appropriate option (some of the most frequently used phrases can be stored in a quick-access “ready menu”), the system plays the question in spoken Persian through the patient’s earphones.
The same process then takes place in reverse.
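The confirm-before-translating flow described above can be sketched roughly as follows. This is an illustration only, not the Transonics implementation: the `Hypothesis` class, the `top_options` and `confirm_and_translate` functions, the scores and the phrase table are all hypothetical, and the Persian output is shown as a placeholder string rather than real translated audio.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    score: float  # recognizer confidence; higher is better (assumed scale)


def top_options(hypotheses, k=3):
    """Return the k highest-scoring hypotheses, best guess first."""
    return sorted(hypotheses, key=lambda h: h.score, reverse=True)[:k]


# Hypothetical table of pre-fabricated sentences standing in for translation.
PHRASES = {
    "Where does it hurt?": "[Persian: where does it hurt]",
    "Where did it hurt?": "[Persian: where did it hurt]",
}


def confirm_and_translate(options, choice_index):
    """Translate the option the user picked; None if no canned phrase exists."""
    return PHRASES.get(options[choice_index].text)


# Invented recognizer output for one spoken question.
recognized = [
    Hypothesis("Where did it hurt?", 0.31),
    Hypothesis("Where does it hurt?", 0.52),
    Hypothesis("Wear does it hurt?", 0.09),
]

options = top_options(recognized)
for i, option in enumerate(options):
    print(i, option.text)

# The doctor reads the list and confirms the first entry.
print(confirm_and_translate(options, 0))
```

Putting a human reader between recognition and translation means a misheard sentence can be caught on screen before it is ever spoken to the patient.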
Narayanan said much of the success of the interface grows directly out of analysis of a large database of some 300 English-speaking-doctor/Persian-speaking-patient dialogs created by USC medical students and Iranian-heritage USC students and Los Angeles residents.
“Rather than imagining what people might say, we analyzed what people did say,” he explained, adding that recordings of the encounters were used to train and tune the system.
USC linguistics Ph.D. candidate Shadi Ganjavi played a key role in setting up these encounters. “We are grateful to her and to the large Persian-speaking community in Los Angeles,” Narayanan said.
The system contains about 23,000 English and 9,000 Persian words, a disproportion that exists because relatively little has so far been done in machine translation of Persian (a language also called Farsi), either written or spoken.
For Narayanan, one of the striking things that emerged during the process was the dependence of the system, in its current state, on the ability of users to recognize its limits and weaknesses, and work within them.
The team has created an elaborate user manual, and as with any system, reading the manual improves performance a great deal.
In addition to the previously mentioned researchers and institutions, Malibu-based HRL Laboratories has worked with USC on the project. HRL personnel included USC alumni Robert Belvin and Howard Neely.
Scott Millward, a postdoctoral scientist at IMSC, contributed usability testing and interface design. USC electrical engineering graduate students Emil Ettellaie, Dagen Wang, Ananthakrishnan Shankar and Murtaza Bulut also contributed, as did Sudeep Ghande, who presented the paper.
For more information on the system, visit http://sail.usc.edu/transonics.