From a technological point of view, decoding human thoughts is becoming increasingly realistic. For the first time, neuroscientists have been able to "decode" noninvasive imaging data from the brains of three participants, reconstructing sequences of words and the overall meaning of stories that the participants had heard, watched or imagined.
In this new study, published in Nature Neuroscience, Alexander Huth and his colleagues at the University of Texas succeeded in extracting the overall meaning of stories, as well as some sentences, from images of brain activity obtained by functional magnetic resonance imaging (functional MRI).
Decoding language
Synthesizing words from brain signals could be very useful for people who have lost the ability to speak because of illnesses such as motor neuron diseases, which affect the neurons controlling the body's voluntary movements. This research also raises questions about the privacy of the most intimate part of our lives: our thoughts.
Language-decoding models, or "speech decoders", use recordings of brain activity to deduce the words that subjects hear, say or imagine.
Until now, language decoders have mostly relied on data from devices implanted in the brain, which limits their usefulness. Decoders using noninvasive recordings of brain activity have so far decoded only single words or very short sentences, and had not been applied to extracting meaning from continuous speech.
This new study exploits a very specific functional MRI signal, which depends on blood flow in the brain and on the level of oxygenation of the blood.
By focusing on brain activity in regions and neural networks that are known to process language, the researchers showed that their decoder could be trained to reconstruct continuous speech, including specific words but also the overall meaning of more complete sentences.
The decoder used the brain responses of three participants recorded as they listened to stories, and it generated sequences of words that could have produced the recorded brain activity. These word sequences reproduced the general idea of the story quite well, and in some cases even included exact words or phrases.
Inside the functional MRI scanner, participants were also asked to watch silent movies and to imagine telling stories. In both cases, the decoder was able to capture the gist of the stories.
For example, one participant thought "I don't have my driver's license yet", and the decoder predicted "She has not even started to learn to drive yet".
Additionally, when participants had to actively listen to one story while ignoring a second story played at the same time, the decoder identified only the story they were paying attention to.
How does it work?
First, the scientists asked the participants to spend 16 hours in a functional MRI scanner, listening to stories read aloud while their brain activity was recorded.
These brain responses were used to train an “encoder,” a computer model that predicts how the brain responds to words heard by the participant. After this training, the encoder can predict with good accuracy how each participant's brain would respond when listening to a particular sequence of words.
But going in the other direction, that is, extracting a sequence of words from brain activity, is much more difficult.
Indeed, the encoder model is designed to relate brain activity to “semantic elements”, that is, the overall meaning of words or sentences. To achieve this, the system uses the “GPT” (generative pre-trained transformer) language model, a precursor to today's GPT-4. The decoder then generates sequences of words that could have produced the observed brain response.
The accuracy of each decoder prediction is checked by using it to calculate the corresponding brain activity. This predicted brain activity is then compared with the activity that was actually recorded.
During this process, which consumes a lot of computing resources, many predictions are generated one by one and ranked according to their accuracy: inappropriate predictions are eliminated and the most accurate are kept. The next word in the sequence is then predicted, and so on until the best-matching sequence is determined.
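The generate, score and prune loop described above can be illustrated with a toy sketch. This is not the researchers' code: the four-word vocabulary, the two-number "brain response", the additive `toy_encoder` and the beam width are all hypothetical stand-ins chosen for illustration, and every vocabulary word is proposed at each step where the real system would rely on a language model.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, float]

# Hypothetical toy "semantic" features; a real system would learn these.
WORD_FEATURES: Dict[str, Vector] = {
    "she": (1.0, 0.0),
    "drives": (0.0, 1.0),
    "home": (1.0, 1.0),
    "slowly": (-1.0, 0.5),
}


def toy_encoder(words: Sequence[str]) -> Vector:
    """Stand-in for the trained encoder: predicts a 2-value 'brain response'."""
    x = sum(WORD_FEATURES[w][0] for w in words)
    y = sum(WORD_FEATURES[w][1] for w in words)
    return (x, y)


def mismatch(predicted: Vector, recorded: Vector) -> float:
    """Squared distance between the predicted and the recorded response."""
    return (predicted[0] - recorded[0]) ** 2 + (predicted[1] - recorded[1]) ** 2


def decode(recorded: Sequence[Vector], beam_width: int = 2) -> List[str]:
    """Extend candidate sequences word by word, keeping only the best matches."""
    beams: List[Tuple[float, List[str]]] = [(0.0, [])]
    for response in recorded:
        extended: List[Tuple[float, List[str]]] = []
        for error, words in beams:
            # Assumption: propose every word; a language model would do this.
            for word in WORD_FEATURES:
                candidate = words + [word]
                new_error = error + mismatch(toy_encoder(candidate), response)
                extended.append((new_error, candidate))
        # Rank candidates by accumulated mismatch and drop the worst ones.
        extended.sort(key=lambda item: item[0])
        beams = extended[:beam_width]
    return beams[0][1]


# Simulate "recorded" responses for a known story, then try to recover it.
story = ["she", "drives", "home", "slowly"]
recorded = [toy_encoder(story[: i + 1]) for i in range(len(story))]
decoded = decode(recorded)
print(decoded)
```

In this noise-free toy setting the search recovers the four original words. With real, noisy recordings, one would expect at best a sequence with a similar meaning, which is consistent with the paraphrases reported in the study.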
Words and meaning
The new study shows that, to carry out the prediction process, data from multiple regions of the brain was needed. These regions are diverse but very specific: they are the speech network, the parietal/temporal/occipital association region and the prefrontal cortex.
A major difference between this work and previous studies is the type of data used. Indeed, most decoders rely on data from brain regions involved in the final stage of speech production, namely mouth and tongue movements. This decoder works at another level, that of ideas and the meaning of thoughts.
One of the limitations of functional MRI data is their low “temporal resolution”. Indeed, the blood oxygenation signal rises and falls in about 10 seconds, a period during which we hear about twenty words or more. Therefore, this technique does not detect individual words but the likely meaning of sequences of words.
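The arithmetic behind this limitation can be made explicit with a small sketch. It uses the article's round figures (a response lasting about 10 seconds and about twenty words heard in that time, so roughly two words per second) and makes the simplifying assumption that each word influences the signal for a fixed window after it is heard.

```python
def overlapping_words(word_onsets, time, response_duration):
    """Count words whose (assumed) response window covers the given time."""
    return sum(
        1 for onset in word_onsets if onset <= time < onset + response_duration
    )


RESPONSE_DURATION_S = 10.0  # approximate rise and fall of the signal
WORDS_PER_SECOND = 2.0      # about twenty words per 10 seconds

# One minute of continuous speech, one word every half second.
word_onsets = [i / WORDS_PER_SECOND for i in range(120)]

# How many words contribute to the signal measured 30 seconds into the story?
count = overlapping_words(word_onsets, time=30.0, response_duration=RESPONSE_DURATION_S)
print(count)  # 20
```

Under these assumptions, every measurement mixes the contributions of about twenty words, which is why the decoder targets the meaning of word sequences instead of individual words.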
No need to panic (not yet)
The idea that we can read minds naturally raises concerns about the privacy of the most intimate part of our lives: what goes on in our heads. The researchers performed additional experiments to clarify the capabilities of the technique.
These experiments show that there is still no need to worry about our thoughts being read when we are walking down the street, or if we are unwilling to cooperate.
Indeed, a decoder trained on one person's brain data poorly predicts semantic elements from another person's brain data. Additionally, participants can complicate the decoding task by turning their attention to another task, such as naming animals or telling another story.
The decoder also malfunctions if participants move in the functional MRI scanner, as this is a very motion-sensitive imaging technique. The cooperation of the participants is essential here.
Given these technical constraints, in addition to the need for very powerful computers to run the decoder, it is very unlikely at this stage that anyone's thoughts can be decoded against their will.
Finally, the decoder only works for the moment with data obtained by functional MRI, which is a costly and often tricky technique to implement. The research group intends to test this method with data from other non-invasive brain imaging technologies.
Image credit: Shutterstock / ORION PRODUCTION