Scientists translate brain signals into speech sounds

Scientists used brain signals recorded from epilepsy patients to program a computer to mimic natural speech, an advancement that could one day have a profound effect on the ability of certain patients to communicate. The study was supported by the National Institutes of Health’s Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative.

“Speech is an amazing form of communication that has evolved over thousands of years to be very efficient,” said Edward F. Chang, M.D., professor of neurological surgery at the University of California, San Francisco (UCSF) and senior author of this study published in Nature. “Many of us take for granted how easy it is to speak, which is why losing that ability can be so devastating. It is our hope that this approach will be helpful to people whose muscles enabling audible speech are paralyzed.”

In this study, speech scientists and neurologists from UCSF recreated many vocal sounds with varying accuracy using brain signals recorded from epilepsy patients with normal speaking abilities. The patients were asked to speak full sentences, and the recorded brain signals were then used to drive computer-generated speech. Furthermore, simply miming the act of speaking provided sufficient information to the computer for it to recreate several of the same sounds.

The loss of the ability to speak can have devastating effects on patients whose facial, tongue, and larynx muscles have been paralyzed due to stroke or other neurological conditions. Technology has helped these patients to communicate through devices that translate head or eye movements into speech. Because these systems involve the selection of individual letters or whole words to build sentences, the speed at which they can operate is very limited. Instead of recreating sounds based on individual letters or words, the goal of this project was to synthesize the specific sounds used in natural speech.

“Current technology limits users to, at best, 10 words per minute, while natural human speech occurs at roughly 150 words/minute,” said Gopala K. Anumanchipalli, Ph.D., speech scientist, UCSF and first author of the study. “This discrepancy is what motivated us to test whether we could record speech directly from the human brain.”

The researchers took a two-step approach to solving this problem. First, by recording signals from patients’ brains while they were asked to speak or mime sentences, they built maps of how the brain directs the vocal tract, including the lips, tongue, jaw, and vocal cords, to make different sounds. Second, the researchers applied those maps to a computer program that produces synthetic speech.

Volunteers were then asked to listen to the synthesized sentences and to transcribe what they heard. More than half the time, the listeners were able to correctly determine the sentences being spoken by the computer.

By breaking down the problem of speech synthesis into two parts, the researchers appear to have made it easier to apply their findings to multiple individuals. The second step specifically, which translates vocal tract maps into synthetic sounds, appears to be generalizable across patients.

“It is much more challenging to gather data from paralyzed patients, so being able to train part of our system using data from non-paralyzed individuals would be a significant advantage,” said Dr. Chang.

The researchers plan to design a clinical trial involving paralyzed, speech-impaired patients to determine how best to gather brain signal data, which can then be applied to the previously trained computer algorithm.

“This study combines state-of-the-art technologies and knowledge about how the brain produces speech to tackle an important challenge facing many patients,” said Jim Gnadt, Ph.D., program director at the NIH’s National Institute of Neurological Disorders and Stroke. “This is precisely the type of problem that the NIH BRAIN Initiative is set up to address: to use investigative human neuroscience to impact care and treatment in the clinic.”

###

This research was funded by the NIH BRAIN Initiative (DP2 OD008627 and U01 NS098971-01), the New York Stem Cell Foundation, the Howard Hughes Medical Institute, the McKnight Foundation, the Shurl and Kay Curci Foundation, and the William K. Bowes Foundation.

For more information:

National Institute of Neurological Disorders and Stroke – //www.ninds.nih.gov/

NIH Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative – https://www.braininitiative.nih.gov/

The NIH’s Brain Research through Advancing Innovative Neurotechnologies® (BRAIN) Initiative is aimed at revolutionizing our understanding of the human brain. It is managed by 10 institutes whose missions and current research portfolios complement the goals of the BRAIN Initiative: NCCIH, NEI, NIA, NIAAA, NIBIB, NICHD, NIDA, NIDCD, NIMH, and NINDS.

NINDS is the nation’s leading funder of research on the brain and nervous system. The mission of NINDS is to seek fundamental knowledge about the brain and nervous system and to use that knowledge to reduce the burden of neurological disease.

About the National Institutes of Health (NIH):

NIH, the nation’s medical research agency, includes 27 Institutes and Centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases. For more information about NIH and its programs, visit //www.nih.gov.

