Thèse de doctorat en Mathématiques et Informatique
Sous la direction de James L Crowley.
Soutenue le 07-05-2013
à Grenoble , dans le cadre de École doctorale mathématiques, sciences et technologies de l'information, informatique (Grenoble) , en partenariat avec LIG - Laboratoire d'Informatique de Grenoble (laboratoire) et de PRIMA - Perception, reconnaissance et intégration pour la modélisation d'activité (équipe de recherche) .
La thèse s'inscrit dans le domaine de la vision par ordinateur. Il s'agit, dans un environnement intérieur inconnu, partiellement connu ou connu de trouver la position et l'orientation d'une camera mobile en temps réel à partir d'une séquence vidéo prise par cette même camera. Le sujet implique également la reconstruction 3D de l'environnement. Les algorithmes de vision seront implémentés et testés sur des plateformes massivement parallèles. Processing the video sequence of a indoor camera in motion we have to find the position and angle of the camera in real time. We will use a single prime lens camera. It may involve an unknown, partially known or well known environment. A big part of the computation is the 3D reconstruction of the scene. The algorithms used to locate the camera will be implemented and tested on GPU.
Visual SLAM in indoor environment
In this thesis, we explore the problem of modeling an unknown environment using monocular vision for localization applications. We focus in modeling dynamic indoor environments. Many objects in indoor environments are likely to be moved. These movements significantly affect the structure and appearance of the environment and disrupt the existing methods of visual localization. We present in this work a new approach for modeling the environment and its evolution with time. We define explicitly the scene as a static structure and a set of dynamic objects. The object is defined as a rigid entity that a user can take, move and that is visually detectable. First, we show how to automatically discover new objects in a dynamic environment. Existing methods of visual localization simply ignore the inconsistencies due to changes in the scene. We aim to analyze these changes to extract additional information. Without any prior knowledge, an object is a set of points with coherent motion relative to the static structure of the scene. We combine two methods of visual localization to compare various explorations in the same environment taken at different time. The comparison enables to detect objects that have moved between the two shots. For each object, a geometric model and an appearance model are learned. Moreover, we extend the scene model while updating the metrical map and the topological map of the static structure of the environment. Object discovery using motion is based on a new algorithm of multiple structures detection in an image pair. Given a set of correspondences between two views, the method based on RANSAC extracts the different structures corresponding to different model parameterizations seen in the data. The method is applied to homography estimation to detect planar structures and to fundamental matrix estimation to detect structures that have been shifted one from another. Our approach for dynamic scene modeling is applied in a new formulation of place recognition to take into account the presence of dynamic objects in the environment. The model of the place consists in an appearance model of the static structure observed in that place. An object database is learned from previous observations in the environment with the method of object discovery using motion. The place recognition we propose detects the dynamic objects seen in the place and rejects the false detection due to these objects. The different methods described in this dissertation are tested on synthetic and real data. Qualitative and quantitative results are presented throughout the dissertation.