Modélisation bayésienne du développement conjoint de la perception, l'action et la phonologie

by Marie-Lou Barnaud

Doctoral thesis in Engineering of cognition, interaction, learning and creation

Supervised by Jean-Luc Schwartz, Pierre Bessière and Julien Diard.

The jury was chaired by Laurent Besacier.

The jury was composed of Jean-Luc Schwartz, Pierre Bessière, Julien Diard and Janet Breckenridge Pierrehumbert.

The reviewers were Sharon Andrea Peperkamp and Francis Colas.


  • Abstract

    Through perception and production tasks, humans can manipulate not only words and sentences but also lower-level units such as syllables and phonemes. Studies in phonetics mainly focus on the latter type of unit. One of the major goals in this field is to understand how humans acquire and manipulate these units.

    In this thesis, we address this question through computational modeling, running computer simulations with a Bayesian model of communication named COSMO (“Communicating Objects using Sensory-Motor Operations”). Our studies cover three aspects.

    In the first part, we study the cognitive representations of phonetic units. It is now well established that these units are characterized by auditory and motor representations. By examining their respective roles during development, we establish their complementarity through what we call the “narrow-band/wide-band” property.

    In the second part, we turn to the variability of phonetic units, in particular through the study of the correlation between idiosyncrasies in perception and in production. By comparing several developmental conditions, we establish that idiosyncrasies are acquired through a process of reproducing categories rather than repeating sounds.

    In the third part, we analyze the nature of phonetic categories. In phonetics, there is a debate about the status of syllables vs. phonemes in speech communication. In our simulations, we examine their respective acquisition through unsupervised learning and show the particularities necessary for communication.

  • Translated title

    Bayesian modeling of the joint development of perception, action and phonology


  • Abstract

    Through perception and production tasks, humans are able to manipulate not only high-level units like words or sentences but also low-level units like syllables and phonemes. Studies in phonetics mainly focus on the second type of unit. One of the main goals in this field is to understand how humans acquire and manipulate these units and how they are stored in the brain.

    In this PhD thesis, we address this set of issues through computational modeling, running computer simulations with a Bayesian model of communication named COSMO (“Communicating Objects using Sensory-Motor Operations”). Our studies address three topics.

    In the first part, we investigate the cognitive content of phonetic units. It is well established that phonetic units are characterized by both auditory and motor representations. It also seems that both representations are used during speech processing. We question the functional role of this double representation of phonetic units in the human brain, specifically in a perception task. By examining their respective development, we show that the two representations play complementary roles during perception: the auditory representation is tuned to recognize nominal stimuli, whereas the motor representation has generalization properties and can deal with stimuli typical of adverse conditions. We call this the “auditory-narrow/motor-wide” property.

    In the second part, we investigate the variability of phonetic units. Despite the universality of phonetic units, their characterization varies from one person to another, in both their articulatory/motor and their acoustic content. These individual differences are called idiosyncrasies. In our study, we aim to understand how they appear during speech development. We specifically compare two learning algorithms, both based on an imitation process. The first relies on sound imitation, while the second exploits phoneme imitation. We show that idiosyncrasies appear only in the course of a phoneme imitation process. We conclude that motor learning seems to be driven by a linguistic/communicative goal rather than by the reproduction of the acoustic properties of the stimulus.

    In the third part, we investigate the nature of phonetic units. In phonetics, there is a debate about the specific status of the syllable vs. the phoneme in speech communication. In adult studies, a consensus has now emerged: both units are thought to be stored in the brain. In infant studies, however, syllabic units seem to be primary. In our simulation study, we investigate the acquisition of both units and try to understand how our model could “discover” phonemes starting from purely syllabic representations. We show that, contrary to syllables and vowels, consonants are poorly characterized in the auditory representation because the categories overlap. This is due to the influence of one phoneme on its neighbors, the well-known phenomenon of “coarticulation”. However, we also show that the representation of consonants in the motor space is much more efficient, with a very low level of overlap between categories. This is in line with classical theories about motor/articulatory invariance for plosives. Consequently, phonemes, i.e. vowels and consonants, seem well separated and likely to emerge clearly in a sensory-motor developmental approach such as ours.

    Across these three lines of research, we implemented different versions of our model. Drawing on data from the literature, we paid particular attention to the cognitive plausibility of its variables, its distributions and its learning phases. In this work, computational modeling has been used in two kinds of studies: comparative and explanatory. In the former, we compared the results of two models differing in one aspect and selected the one that agreed with experimental results. In the latter, we used our model to interpret a phenomenon reported in the literature. In both cases, our simulations aim to improve the understanding of data from the literature and to provide new predictions for future studies.


The thesis is available at the library of the institution where it was defended.

Consult in a library

The defense version exists

Where can this thesis be found?

  • Library: Université Savoie Mont Blanc (Chambéry-Annecy). Service commun de la documentation et des bibliothèques universitaires. Bibliothèque électronique.
  • Library: Université Grenoble Alpes. Bibliothèque et Appui à la Science Ouverte. Bibliothèque électronique.
See in Sudoc, the union catalog of higher education and research libraries.