Defended thesis

A structural processing approach for information retrieval: a mining approach for semi-structured legal documents

Author: JingTao Yao
Supervisor: Khaldoun Zreik
Type: Doctoral thesis
Discipline(s): Computer science
Date: Defended in 2010
Institution(s): Paris 8

Abstract


Legal document databases are being opened up and used more and more frequently, which has made "semi-automatic" feeding of these databases increasingly important. Observation 1: we aim to build a "semi-automatic" process that deposits documents directly into databases, including indexing and classification, with limited human intervention. In such a process, document templates (the logical and physical structures modeled by the markup language) play an important role in indexing and management. Observation 2: given such a mass of data (very often textual), it becomes essential to adopt an approach that manages electronic legal documents as carriers of knowledge and expertise. This shifts the problem toward the domains of information retrieval and knowledge discovery. These two observations lead us to formulate a hypothesis for automatic classification that takes into account the knowledge and expertise embedded in the structures of electronic legal documents. This in turn leads us to a clustering approach for discovering decision-making clusters. We propose a representation method for semi-structured documents that allows very precise analysis of the knowledge and expertise embedded in both the content and the structure of a document. Experiments on a real legal corpus show that combining content and structure produces a remarkable improvement in the quality of the decision-making clusters.