Section: New Results
Tool for Mapping Textual Documents
In 2008, the DRAST (Direction de la Recherche et des Affaires Scientifiques et Techniques), a component of the French Ministry of Environment (Ministère de l'Ecologie, de l'Energie, du Développement Durable et de l'Aménagement du territoire), issued a grant for the design and implementation of a tool that automatically compares textual documents submitted as proposals in response to generic requests.
The work undertaken as part of the Calfat-Ville project resulted, by the end of 2008, in a crude prototype and a methodology that met DRAST's objectives.
The chosen methodology is related to the keyword technique introduced in (Scott, M., 1997. PC analysis of key words - and key key words. System 25 (2), Elsevier, pp. 233-245).
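To make the keyword technique concrete, here is a minimal sketch of a keyness computation: a word counts as "key" when it is unusually frequent in a studied document compared with a reference corpus. The report does not say which statistic or threshold the prototype uses, so the log-likelihood score, the 3.84 cut-off (the usual 5% critical value for one degree of freedom) and the function names `log_likelihood` and `keywords` are all assumptions made for illustration.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score (assumed statistic, not the tool's).

    a: frequency of the word in the study text (c tokens in total)
    b: frequency of the word in the reference corpus (d tokens in total)
    """
    # Expected frequencies if the word were equally common in both texts.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2.0 * g2


def keywords(study_tokens, reference_tokens, threshold=3.84):
    """Return (word, score) pairs for words over-represented in the study text."""
    study = Counter(study_tokens)
    ref = Counter(reference_tokens)
    n_study = sum(study.values())
    n_ref = sum(ref.values())
    if n_study == 0 or n_ref == 0:
        return []
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        # Skip words that are not relatively more frequent in the study text.
        if a / n_study <= b / n_ref:
            continue
        score = log_likelihood(a, b, n_study, n_ref)
        if score >= threshold:
            scored.append((word, score))
    # Highest keyness first; ties broken alphabetically.
    return sorted(scored, key=lambda item: (-item[1], item[0]))
```

Two proposals could then be compared through the overlap of their keyword lists, each computed against the same reference corpus. This is only one plausible way to use the scores.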
During 2009, the tool was improved by fixing minor bugs and, most notably, by expanding the number of syntactic patterns used to extract syntagms. It was also tested intensively on various documents, including literary texts. These tests show that the methodology needs to move towards semantics, by including taxonomical resources (ontology, thesaurus, lexical network, ...) in the process.
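As an illustration of pattern-based syntagm extraction, the sketch below slides a small set of part-of-speech patterns over a sequence of tokens that has already been tagged. The tag names, the three patterns and the function `extract_syntagms` are hypothetical; the report does not list the patterns the tool actually uses, nor the tagger it relies on.

```python
# Hypothetical part-of-speech patterns; the tool's real pattern set is not given.
PATTERNS = [
    ("NOUN", "PREP", "NOUN"),
    ("ADJ", "NOUN"),
    ("NOUN", "NOUN"),
]


def extract_syntagms(tagged, patterns=PATTERNS):
    """Return every word sequence whose tags match one of the patterns.

    tagged: list of (word, tag) pairs produced by any part-of-speech tagger.
    """
    found = []
    for start in range(len(tagged)):
        for pattern in patterns:
            window = tagged[start:start + len(pattern)]
            if len(window) != len(pattern):
                continue  # pattern would run past the end of the text
            if all(tag == expected for (_, tag), expected in zip(window, pattern)):
                found.append(" ".join(word for word, _ in window))
    return found


tagged_example = [
    ("urban", "ADJ"),
    ("planning", "NOUN"),
    ("of", "PREP"),
    ("cities", "NOUN"),
]
print(extract_syntagms(tagged_example))
```

In such a scheme, expanding the pattern set amounts to adding tuples to `PATTERNS`, which is one reason to keep the patterns as data rather than hard-coding them.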
Work currently in progress includes reusing the extraction process to produce such a taxonomy automatically from the documents, with the aim of obtaining fully satisfactory results.