Automatic analysis of facial macro- and micro-expressions: detection and recognition by deep learning

by Dawood Al Chanti

Thesis project in signal, image, speech and telecommunications

Under the supervision of Alice Caplier.

Thesis in preparation at Université Grenoble Alpes, within the École doctorale électronique, électrotechnique, automatique, traitement du signal (Grenoble), in partnership with the Grenoble Images Parole Signal Automatique laboratory and the Architecture, Géométrie, Perception Images Gestes (AGPIG) research team, since 01-10-2016.


  • Abstract

    The face is a formidable vector of our emotions. It serves as a way of showing others our state of mind through pronounced facial deformations, which we call macro-expressions. Conversely, in some situations we wish to hide our emotional reactions by trying to keep an impassive face. In such cases, despite all our efforts, very subtle and very fast micro-deformations of the face often remain: these are micro-expressions. In this thesis, algorithms are developed to detect and recognize macro-expressions on the one hand and micro-expressions on the other. For the automatic recognition of macro-expressions, we employ hand-crafted spatio-temporal descriptors (LBP, LBP-TOP, SIFT, 3D-SIFT, HOG, 3D-HOG) combined with a sparse-representation method on the one hand and a bag-of-visual-words method on the other. We propose to address the problem of modelling the dynamics of spontaneous facial expressions by taking into account two categories of spatio-temporal features: local and global spatio-temporal information learned through deep learning. For the detection of micro-expressions in a video sequence, we propose a new deep learning algorithm that specifically models their spatial and temporal information. The proposed algorithms are tested on state-of-the-art macro-expression and micro-expression databases, and the results obtained exceed the state of the art.
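
    As an illustration of the handcrafted pipeline mentioned above, the sketch below (in Python, not the code of the thesis) shows one such combination: per-frame uniform-LBP descriptors pooled into a bag-of-visual-words histogram and classified with a linear SVM. The patch grid, the 64-word codebook and the linear SVM are illustrative assumptions, and the dummy data stands in for real expression videos.

```python
# Minimal sketch of an LBP + bag-of-visual-words expression classifier.
# Illustrative only: grid size, codebook size and classifier are assumptions.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def lbp_histograms(frames, P=8, R=1, grid=4):
    """Uniform-LBP histograms over a grid of patches, one descriptor per patch."""
    descs = []
    for frame in frames:                                   # frame: 2-D grayscale array
        lbp = local_binary_pattern(frame, P, R, method="uniform")
        h, w = lbp.shape
        for i in range(grid):
            for j in range(grid):
                patch = lbp[i*h//grid:(i+1)*h//grid, j*w//grid:(j+1)*w//grid]
                hist, _ = np.histogram(patch, bins=P + 2, range=(0, P + 2), density=True)
                descs.append(hist)
    return np.asarray(descs)

def bovw_features(sequences, codebook):
    """Encode each video as a normalized histogram of visual-word assignments."""
    feats = []
    for frames in sequences:
        words = codebook.predict(lbp_histograms(frames))
        hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
        feats.append(hist / max(hist.sum(), 1.0))
    return np.vstack(feats)

# Dummy data standing in for real expression videos: 6 clips of 10 frames each.
rng = np.random.default_rng(0)
train_seqs = [(rng.random((10, 64, 64)) * 255).astype(np.uint8) for _ in range(6)]
train_labels = [0, 1, 0, 1, 0, 1]

codebook = KMeans(n_clusters=64, n_init=10).fit(
    np.vstack([lbp_histograms(s) for s in train_seqs]))
clf = LinearSVC().fit(bovw_features(train_seqs, codebook), train_labels)
```

    A sparse-representation variant of the same idea would replace the k-means codebook and histogram encoding with a learned dictionary and sparse coding of the same descriptors.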

  • Translated title

    Automatic Analysis of Facial Macro- and Micro-Expressions: Detection and Recognition via Deep Learning.


  • Abstract

    The face is an ideal prime facet for expressing and communicating mental states such as emotions, thoughts, and desires in natural situations. Natural emotions can lead to spontaneous macro facial expression movements, such as a smile when the emotion is acknowledged by the expresser. Conversely, emotions are sometimes suppressed and hidden by the expresser during unpleasant situations, leading to the leakage of micro facial expression movements. Detecting and recognizing non-verbal cues such as macro-expressions is effortless for humans (above 95% accuracy) but challenging for computers, owing to critical issues such as unconstrained environments and the variability of spontaneous expressions. Recognizing micro-expressions is very challenging for both humans (around 45% accuracy) and computers because of their fast occurrence, brief duration and subtle spatial movements. Macro and micro facial expressions play an important role in interpersonal relations, and their automatic detection and recognition are two important steps towards building emotionally intelligent systems. Such emotionally aware systems could serve as intelligent companions for humans, e.g. helping individuals who are visually impaired or on the autism spectrum to read and interpret others' emotions, or predicting the presence or lack of cognitive attention during tasks such as driving.

    In this thesis, we attempt to build a fully automated system that detects and recognizes a person's affective state via both macro and micro facial expressions in space and time. Toward this goal, first, we examine the information channels that can yield a good representation of emotions, mainly static images versus dynamics. Second, we evaluate the most common state-of-the-art techniques for extracting facial expression features, namely the geometry-based approach via bag-of-visual-words and the appearance-based approach via sparse representation, and we contribute two new enhanced methods. Third, regarding reliable spatial and spatiotemporal feature representations, we evaluate the performance of handcrafted descriptors such as LBP, LBP-TOP, SIFT, 3D-SIFT, HOG and 3D-HOG, and compare them with our major contribution, in which we propose to tackle the problem of modelling the dynamics of spontaneous facial expressions by taking into account two categories of spatiotemporal features: local and global spatiotemporal information learned within a deep learning framework. Fourth, we propose a new approach for spotting the micro facial expression segment in time and then, within the detected segment, localizing the occurrence of the micro-expression in the image space; for this, a recurrent convolutional autoencoder combined with a Gaussian mixture model is devised. Finally, we deploy a visual attention mechanism in an attempt at a complete analysis and understanding of the features learned via deep learning.

    We validate our algorithms on various and distinct databases: for the macro-expression study, the JAFFE, DynEmo, CK+, MMI and DISFA databases are used; for the micro-expression study, the SMIC-HS, CASME-I and CASME-II databases are used. Our results are compared with those of other approaches and show better performance.
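
    As an illustration of the spotting step, the sketch below is a minimal, simplified take on pairing a recurrent convolutional autoencoder with a Gaussian mixture model: each frame is encoded by a small convolutional network, a GRU models the temporal evolution of the codes, the frames are reconstructed, and a two-component GMM fitted on per-frame reconstruction errors flags candidate micro-expression frames. The layer sizes, the 64x64 input resolution and the use of reconstruction error as the GMM input are assumptions made for the sketch, not the architecture devised in the thesis.

```python
# Minimal sketch: recurrent convolutional autoencoder + GMM for spotting.
# Illustrative assumptions throughout; not the thesis architecture.
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

class RecurrentConvAE(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.enc = nn.Sequential(                       # 1x64x64 -> 32x16x16
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.gru = nn.GRU(32 * 16 * 16, hidden, batch_first=True)
        self.to_map = nn.Linear(hidden, 32 * 16 * 16)
        self.dec = nn.Sequential(                       # 32x16x16 -> 1x64x64
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, clip):                            # clip: (B, T, 1, 64, 64)
        B, T = clip.shape[:2]
        z = self.enc(clip.flatten(0, 1)).flatten(1)     # per-frame codes (B*T, 8192)
        h, _ = self.gru(z.reshape(B, T, -1))            # temporal context over frames
        recon = self.dec(self.to_map(h).reshape(B * T, 32, 16, 16))
        return recon.reshape(B, T, 1, 64, 64)

# After training the autoencoder to reconstruct the sequences, per-frame
# reconstruction errors can be modelled with a 2-component GMM; frames assigned
# to the unusual component are candidate micro-expression onsets.
model = RecurrentConvAE()
with torch.no_grad():
    clip = torch.rand(1, 40, 1, 64, 64)                 # dummy 40-frame clip
    err = ((model(clip) - clip) ** 2).mean(dim=(2, 3, 4)).squeeze(0)
gmm = GaussianMixture(n_components=2).fit(err.numpy().reshape(-1, 1))
candidates = gmm.predict(err.numpy().reshape(-1, 1))    # per-frame component labels
```

    In practice the autoencoder would first be trained on the video data, so that brief, subtle deviations such as micro-expressions stand out as frames the model reconstructs poorly.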