découverte de connaissances pour la maintenance avionique, une approche d'apprentissage de concepts non supervisée

by Luis Palacios Medinacelli

Thesis project in Computer Science


Supervised by Chantal Reynaud, Yue Ma and Gaelle Lortal.

Thesis in preparation at Paris Saclay, within the doctoral school École doctorale Sciences et technologies de l'information et de la communication (Orsay, Essonne ; 2015-....), in partnership with LRI - Laboratoire de Recherche en Informatique (laboratory) and Université Paris-Sud (institution where the thesis is being prepared), since 01-06-2016.


  • Abstract

    The work will consist of proposing an information fusion approach based on induction and the construction of a situation model, organized along the following axes. Ontology-based information fusion: comparison of ontology alignment and information fusion of conceptual graphs; combination of ontology alignment and information fusion techniques for integrating heterogeneous data sources. (Semi-)automatic induction of the situation model: extraction of situation models in description logic; extraction of situation models from the OBDA query language; automated reasoning for revising the situation model. Evaluation (carried out in parallel with the two previous tasks): exploitation of data from the use cases (ASRS: Aviation Safety Reporting System, or GTD: Global Terrorism Database); methodology and experiments.

  • Translated title

    knowledge discovery for avionics maintenance, an unsupervised concept learning approach


  • Abstract

    Context: Given the massive amount of information available in heterogeneous resources, filtering has become a necessary step for retrieving pertinent information. Situation models are used in a fusion process to filter information coming from different sources and keep only the relevant pieces. Access to situation models is therefore a necessity. In graph-based information fusion approaches, it is often domain experts who provide the situation models as input to the system. This becomes labor-intensive when the domain is knowledge-intensive. To avoid or minimize the effort required of knowledge engineers, approaches for constructing appropriate, good-quality situation models are needed.

    Tasks: In the use cases considered in this PhD work, we are given a set of instances that should be kept after filtering by a situation model. The work comprises two aspects: extracting situation models from these examples of interest, and revising situation models by automated reasoning. For this, a target language for situation models must first be defined. To benefit from advanced Ontology-Based Data Access (OBDA) techniques, we plan to use the OBDA query language as the target language: conjunctive queries (conjunctions of atoms), a well-known subclass of SQL queries. This query language extends certain description logics with an explicit use of variables in conjunctive queries. To extract situation models (semi-)automatically, we can then build on approaches designed for ontology construction; a significant challenge is tracking the identity of instances and variables across the different atoms of a conjunctive query. This will be partially examined in Mr. Polla's master's thesis. An important benefit of constructing such situation models is that the implementation of the proposed approach will further allow reasoning over background knowledge and database data (encoded in an ontology).
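
To make the notion of a conjunctive query used as a situation model concrete, here is a minimal sketch in Python. It is an illustration only, not the thesis implementation: the predicate names (`Incident`, `involves`, `hasFault`), the facts, and the helper functions are all hypothetical, and no ontology reasoning is performed. The shared variable `?y` shows why the identity of a variable must be tracked across atoms.

```python
from typing import Dict, List, Optional, Set, Tuple

Atom = Tuple[str, ...]        # (predicate, term1, term2, ...)
Binding = Dict[str, str]      # variable name -> constant


def is_var(term: str) -> bool:
    """Variables are written with a leading '?' (a convention assumed here)."""
    return term.startswith("?")


def match(atom: Atom, fact: Atom, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `atom` equals `fact`, or return None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    extended = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            # A variable already bound elsewhere must keep the same value.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def answers(query: List[Atom], answer_var: str, facts: Set[Atom]) -> Set[str]:
    """Evaluate a conjunction of atoms and project onto one answer variable."""
    bindings: List[Binding] = [{}]
    for atom in query:
        next_bindings: List[Binding] = []
        for binding in bindings:
            for fact in facts:
                extended = match(atom, fact, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return {b[answer_var] for b in bindings}


# Hypothetical data: three incident reports and the faults of the parts involved.
FACTS: Set[Atom] = {
    ("Incident", "r1"), ("Incident", "r2"), ("Incident", "r3"),
    ("involves", "r1", "a1"), ("involves", "r2", "a2"), ("involves", "r3", "a3"),
    ("hasFault", "a1", "hydraulic"), ("hasFault", "a2", "hydraulic"),
    ("hasFault", "a3", "electrical"),
}

# Situation model: incidents involving something with a hydraulic fault.
QUERY: List[Atom] = [
    ("Incident", "?x"),
    ("involves", "?x", "?y"),
    ("hasFault", "?y", "hydraulic"),
]

KEPT = answers(QUERY, "?x", FACTS)
```

With these hypothetical facts, `KEPT` contains `r1` and `r2`: the report `r3` is filtered out because the item it involves has an electrical fault.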
    The answers returned for a generated query make explicit all the potential instances captured by the corresponding situation model. If a new instance kept by the situation model is unintended, the situation model needs further revision. This is the second main task planned for this PhD work. To this end, we first consider identifying explanations for the unintended answers, with the aim of locating the problematic parts of the generated situation model. This differs from most existing work on justification finding, which searches for errors only in ontological data rather than in queries. Once errors can be located in a query, different strategies for revising the situation model will be designed. A theoretical question worth exploring is how to measure the differences between the answer sets of different versions of a situation model. In this way, end users can be assisted in validating the generated situation model. The work will be evaluated on real-world graph fusion problems, such as GTD data fusion. An evaluation methodology and metrics will be defined to assess the quality of the approaches to situation model extraction and revision.
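
The revision step can be sketched in the same spirit. The abstract leaves open how differences between answer sets should be measured; the snippet below assumes, purely for illustration, that unintended answers are the returned instances a user rejects and that the Jaccard distance is used to compare two answer sets. Both choices, and all names and data, are hypothetical.

```python
from typing import Set


def unintended_answers(returned: Set[str], rejected: Set[str]) -> Set[str]:
    """Instances kept by the situation model that the user marks as unintended."""
    return returned & rejected


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """One possible (assumed) measure of how far apart two answer sets are."""
    union = a | b
    if not union:
        return 0.0  # two empty answer sets are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical answer sets before and after revising a situation model.
ORIGINAL_ANSWERS = {"r1", "r2", "r4"}
REVISED_ANSWERS = {"r1", "r2"}
REJECTED_BY_USER = {"r4"}

TO_EXPLAIN = unintended_answers(ORIGINAL_ANSWERS, REJECTED_BY_USER)
CHANGE = jaccard_distance(ORIGINAL_ANSWERS, REVISED_ANSWERS)
```

Here `TO_EXPLAIN` holds the single instance `r4` that would need an explanation, and `CHANGE` is one third, since the two answer sets share two of three distinct instances. Other measures (for example, ones weighting intended examples more heavily) would fit the same interface.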