Edinburgh-based Speech Graphics and researchers at UC San Francisco and UC Berkeley create the world’s first brain-computer interface that synthesises speech and facial expression from brain signals, opening a way to restore natural communication for those who cannot speak.
The same software that’s used to drive facial animation in games such as The Last of Us Part II and Hogwarts Legacy turns brain waves into a talking digital avatar.
[Edinburgh, 23rd August 2023] In a groundbreaking research study, Speech Graphics, a pioneer of AI-driven facial animation, has collaborated with researchers at UC San Francisco and UC Berkeley to help a paralysed woman in the US communicate using a digital avatar controlled via a brain-computer interface (BCI).
The researchers decoded the woman’s brain signals into three forms of communication: text, synthetic voice, and facial animation on a digital avatar, including lip sync and emotional expressions. This is the first time that facial animation has been synthesised from brain signals, and a paper detailing the research is due to appear in the August edition of the science journal Nature.
The team was led by Edward Chang MD, chair of neurological surgery at UCSF, who has spent a decade working on brain-computer interfaces. Chang’s team implanted a paper-thin rectangle of 253 electrodes onto the surface of the woman’s brain, over areas the team has found to be critical for speech. The electrodes intercepted the brain signals that, if not for her stroke, would have gone to muscles in her tongue, jaw, larynx, and face.
A cable, plugged into a port fixed to her head, connected the electrodes to a bank of computers, allowing AI algorithms to be trained over several weeks to recognise the brain activity associated with a vocabulary of over 1,000 words. Thanks to the AI, the woman could ‘write’ text, as well as ‘speak’ using a synthesised voice based on recordings of her real voice before she was paralysed.
The researchers also worked with Michael Berger, the CTO and co-founder of Speech Graphics, to decode this brain activity into facial movements. Speech Graphics’ AI-based facial animation technology – more commonly used to create realistic facial animation in video games including Halo Infinite, Hogwarts Legacy and The Last of Us Part II – simulates muscle contractions over time, including speech articulations and nonverbal activity.
This process is normally driven by audio input: the software analyses the audio and reverse-engineers the complex muscle movements of the face, tongue and jaw that should have occurred while producing that sound. In one approach, the team used the subject’s synthesised voice, in place of her actual voice, as input to the Speech Graphics system to drive the muscles. The company’s real-time software then converted the muscle actions into 3D animation in a video game engine. The result was a realistic avatar of the subject that accurately pronounced words in sync with the synthesised voice as she attempted to communicate.
However, in a second, even more groundbreaking approach, the signals from the subject’s brain were meshed directly with the simulated muscles, allowing the simulated muscles to serve as an analogue of her own non-functioning muscles. She could also cause the avatar to express specific emotions and move individual muscles.
“Creating a digital avatar that can speak, emote and articulate in real-time, connected directly to the subject’s brain, shows the potential for AI-driven faces well beyond video games,” said Michael Berger, CTO and co-founder of Speech Graphics. “When we speak, it’s a complex combination of audio and visual cues that helps us express how we feel and what we have to say. Restoring voice alone is impressive, but facial communication is so intrinsic to being human, and it restores a sense of embodiment and control to the patient who has lost that. I hope that the work we’ve done in conjunction with Professor Chang can go on to help many more people.”
“We’re making up for the connections between the brain and vocal tract that have been severed by the stroke,” said Kaylo Littlejohn, a graduate student working with Chang and Gopala Anumanchipalli, PhD, a professor of electrical engineering and computer sciences at UC Berkeley. “When the subject first used this system to speak and move the avatar’s face in tandem, I knew that this was going to be something that would have a real impact.”
“Our goal is to restore a full, embodied way of communicating, which is really the most natural way for us to talk with others,” said Professor Chang, who is chair of neurological surgery at UCSF and a member of the UCSF Weill Institute for Neuroscience. “These advancements bring us much closer to making this a real solution for patients.”
The team hopes the work will lead, in the near future, to an FDA-approved system that enables speech from brain signals.