Team gemo


Section: New Results

Ontology-Based Information Retrieval

Participants: Nathalie Pernelle, Cédric Pruski, Chantal Reynaud, Mouhamadou Thiam, Yassine Mrabet.

Adaptive Ontologies for Web Information Retrieval

We address the problem of taking knowledge evolution into account to improve Web search, in the sense of improving the relevance of the returned results. The advocated solution relies on ontologies, a cornerstone of the Semantic Web, to represent both the domain targeted by the query and the profile of the user who submits it. Ontologies are regarded as knowledge that evolves over time. Consequently, the ontology evolution problem has to be tackled with respect to both the evolution of the target domain and the evolution of the user's profile. This work is the core of a PhD thesis [3], defended in April 2009. We introduced a new paradigm, adaptive ontologies, together with a process that lets adaptive ontologies smoothly follow the evolution of the domain. The resulting model adapts ideas from psychology and biology to knowledge engineering. We proposed an approach that exploits adaptive ontologies to improve Web information retrieval. To this end, we first introduced two data structures, WPGraphs and W3Graphs, for representing Web data. We then introduced the ASK query language, tailored to the extraction of relevant information from these structures. We also proposed a set of query enrichment rules that exploit ontological elements as well as the adaptive-ontology characteristics of two ontologies: the one representing the domain targeted by the query and the one representing the user's view of the domain. Lastly, we devised a tool for managing adaptive ontologies and for searching for relevant information on the Web, and we carried out an experimental validation of the introduced concepts. The validation is based on a realistic case study devoted to the retrieval of scientific articles published in the International World Wide Web Conference series.
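To make the idea of ontology-based query enrichment more concrete, here is a minimal Python sketch. It is not the ASK language and does not reproduce the enrichment rules of the thesis: the toy `ONTOLOGY` dictionary, its `labels` and `children` fields, and the `enrich_query` function are all hypothetical names introduced for illustration. The sketch only shows one plausible kind of rule, namely expanding a keyword with the alternative labels and direct sub-concepts of the matching concept.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology (an assumption for illustration only):
# each concept maps to alternative labels and direct sub-concepts.
ONTOLOGY: Dict[str, Dict[str, List[str]]] = {
    "information retrieval": {
        "labels": ["IR", "document search"],
        "children": ["web search"],
    },
    "web search": {
        "labels": ["web information retrieval"],
        "children": [],
    },
}


def enrich_query(keywords: List[str],
                 ontology: Dict[str, Dict[str, List[str]]] = ONTOLOGY) -> List[str]:
    """Expand each keyword with the labels and direct sub-concepts
    of the ontology concept it matches (case-insensitive match)."""
    enriched: List[str] = []
    seen: Set[str] = set()

    def add(term: str) -> None:
        # Keep the first spelling of each term and skip duplicates.
        key = term.lower()
        if key not in seen:
            seen.add(key)
            enriched.append(term)

    for keyword in keywords:
        add(keyword)
        concept = ontology.get(keyword.lower())
        if concept is None:
            continue  # keyword unknown to the ontology: keep it unchanged
        for label in concept["labels"]:
            add(label)
        for child in concept["children"]:
            add(child)
    return enriched


if __name__ == "__main__":
    print(enrich_query(["information retrieval", "ontology"]))
```

In an adaptive setting, one would expect the ontology passed to such a function to change over time, and the user-profile ontology to contribute its own expansions; both aspects are left out of this sketch.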

Semantic Annotation

SHIRI is an ontology-based system for integrating semi-structured documents related to a specific domain. The system's purpose is to give users access to the relevant parts of documents as answers to their queries. SHIRI uses RDF/OWL to represent resources and SPARQL to query them. It relies on an automatic, unsupervised and ontology-driven approach for the extraction, alignment and semantic annotation of tagged elements of documents. We have developed and tested the SHIRI-Extract component, which exploits a set of named-entity and term patterns to extract term candidates to be aligned with the ontology. It proceeds incrementally in order to populate the ontology with terms describing instances of the domain and to reduce access to external resources such as the Web. The results obtained on an HTML corpus of calls for papers in computer science show that the number of terms (or named entities) aligned directly with the ontology increases as the method is applied [32]. For SHIRI-querying, we have defined an order relation on the queries and validated it on two corpora [27].
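The incremental behaviour described above can be illustrated with a small Python sketch. It is not the SHIRI-Extract implementation: the `align_terms` function, the `lexicon` dictionary standing in for the terms already attached to ontology concepts, and the `external_lookup` callback standing in for an external resource such as the Web are all assumptions made for this example. The sketch shows why direct alignments become more frequent as documents are processed: every term resolved externally is stored, so later occurrences align directly.

```python
from typing import Callable, Dict, List, Optional


def align_terms(documents: List[List[str]],
                lexicon: Dict[str, str],
                external_lookup: Callable[[str], Optional[str]]) -> List[int]:
    """Align term candidates with ontology concepts, document by document.

    Returns, for each document, the number of terms aligned directly
    with the lexicon (that is, without calling the external resource).
    The lexicon is updated in place as new terms are resolved.
    """
    direct_counts: List[int] = []
    for terms in documents:
        direct = 0
        for term in terms:
            key = term.lower()
            if key in lexicon:
                direct += 1  # already known: aligned directly
                continue
            concept = external_lookup(key)  # hypothetical external resource
            if concept is not None:
                # Populate the lexicon so later occurrences align directly.
                lexicon[key] = concept
        direct_counts.append(direct)
    return direct_counts


if __name__ == "__main__":
    lexicon = {"workshop": "Event"}
    external = {"symposium": "Event", "paris": "Location"}.get
    docs = [["Workshop", "Symposium"],
            ["symposium", "Paris"],
            ["Paris", "workshop", "symposium"]]
    print(align_terms(docs, lexicon, external))
```

The per-document counts returned by the function give a crude analogue of the measurement reported in the text, under the simplifying assumption that alignment is an exact, case-insensitive string match.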

