Couplage suivi et détection pour l'analyse de foules

by Jennifer Vandoni

Thesis project in Signal and Image Processing

Supervised by Sylvie Le hegarat and Emanuel Aldea.

Thesis in preparation at Paris Saclay, as part of Sciences et Technologies de l'Information et de la Communication, in partnership with SATIE - Systèmes et Applications des Technologies de l'Information et de l'Energie (laboratory), MOSS - Méthodes et outils pour les Signaux et Systèmes (research team), and Université Paris-Sud (institution where the thesis is being prepared), since 01-12-2015.

  • Translated title

    Coupling tracking and detection for crowd analysis


  • Abstract

    Crowd analysis is an emerging topic in computer vision, closely related to image processing, machine learning and multiple object tracking. Advances in this field benefit a variety of problems, some more methodological, such as graph modelling for collective motion analysis [1], and some more practical, such as enhancing the security of a specific location. In order to improve the current state of the art in large-scale tracking, in both moderately and densely crowded contexts, several fundamental issues have yet to be addressed: strong occlusions, detector performance, and the scaling up of data association algorithms. Even if multi-camera networks seem very attractive for coping with strong occlusions, detection cues must still be estimated in single-camera views in order to initialize more complex data fusion algorithms. This PhD topic is set in this single-camera context. The PhD will focus mainly on improving pedestrian detectors in moderately and highly crowded contexts, where existing body-part-based detectors do not apply. From an image processing perspective, the line of work intends to identify relevant descriptors for the detection task that capture different but complementary information (such as in [2,3]) and to use them in a discriminative learning context [4], with extra constraints imposed by the dynamics of the objects. We will also investigate deep learning in order to circumvent the problem of hand-crafting and using multiple descriptors, but it remains to be seen whether the excellent performance of this technique for full-body detection [5] can be transferred successfully to strongly occluded scenes. Regularization methods have been successfully used to improve the quality of the detection [5].
    In our work, we intend to rely on Marked Point Processes (MPP) in order to enforce a prior on the spatial interactions among the particles. MPPs have previously been used in pedestrian detection, but in entirely different contexts [6], and we expect them to be particularly well suited to our task because of the constrained movement and the appearance of specific spatial interactions. Lastly, a key element is that in these difficult contexts with poor detector performance, detection and tracking should be coupled. For instance, we would like to explore this coupling either by using salient detections, which are easier to track, and propagating constraints in the detection map, or by employing tracking-by-detection algorithms outright (which currently have the drawback of a significant computational cost for large-scale problems).