Prediction across sensory modalities: A neurocomputational model of the McGurk effect

Olasagasti Rodriguez, Miren Itsaso; Bouton, Sophie; Giraud Mamessier, Anne-Lise

doi:10.1016/j.cortex.2015.04.008

Scientific article

English

Contributors: Olasagasti Rodriguez, Miren Itsaso; Bouton, Sophie; Giraud Mamessier, Anne-Lise

Published in: Cortex, vol. 68, p. 61-75

Publication date: 2015

Abstract

The McGurk effect is a textbook illustration of the automaticity with which the human brain integrates audiovisual speech. It shows that even incongruent audiovisual (AV) speech stimuli can be combined into percepts that correspond neither to the auditory nor to the visual input, but to a mix of both. Typically, when presented with, e.g., visual /aga/ and acoustic /aba/ we perceive an illusory /ada/. In the inverse situation, however, when acoustic /aga/ is paired with visual /aba/, we perceive a combination of both stimuli, i.e., /abga/ or /agba/. Here we assessed the role of dynamic cross-modal predictions in the outcome of AV speech integration using a computational model that processes continuous audiovisual speech sensory inputs in a predictive coding framework. The model involves three processing levels: sensory units, units that encode the dynamics of stimuli, and multimodal recognition/identity units. The model exhibits a dynamic prediction behavior because evidence about speech tokens can be asynchronous across sensory modalities, allowing for updating the activity of the recognition units from one modality while sending top-down predictions to the other modality. We explored the model's response to congruent and incongruent AV stimuli and found that, in the two-dimensional feature space spanned by the speech second formant and lip aperture, fusion stimuli are located in the neighborhood of congruent /ada/, which therefore provides a valid match. Conversely, stimuli that lead to combination percepts do not have a unique valid neighbor. In that case, acoustic and visual cues are both highly salient and generate conflicting predictions in the other modality that cannot be fused, forcing the elaboration of a combinatorial solution. We propose that dynamic predictive mechanisms play a decisive role in the dichotomous perception of incongruent audiovisual inputs.

Affiliation entities

Faculté de médecine / Section de médecine fondamentale / Département de neurosciences fondamentales

Research groups

Groupe Anne-Lise Giraud (939)

Citation (ISO format)

OLASAGASTI RODRIGUEZ, Miren Itsaso, BOUTON, Sophie, GIRAUD MAMESSIER, Anne-Lise. Prediction across sensory modalities: A neurocomputational model of the McGurk effect. In: Cortex, 2015, vol. 68, p. 61–75. doi: 10.1016/j.cortex.2015.04.008

Article (Published version)

Identifiers

PID: unige:75900
DOI: 10.1016/j.cortex.2015.04.008
PMID: 26009260

Journal ISSN: 0010-9452

Creation: 01/09/2015 12:41:00

First validation: 01/09/2015 12:41:00

Update time: 14/03/2023 23:41:15

Status update: 14/03/2023 23:41:15

Last indexation: 31/10/2024 01:29:39

Archive ouverte UNIGE
