Inria / Raweb 2003
Project: ACACIA


Section: New Results

Keywords: Knowledge Acquisition, Knowledge Engineering, Knowledge Management, Corporate Memory, Programming Environment Knowledge Server, World Wide Web, Semantic Web, XML, RDF, Conceptual Graph, Ontology, Information Retrieval.

Information Retrieval in a Corporate Semantic Web

We study the problems involved in the dissemination of knowledge through a knowledge server on an intranet or the Internet: we consider the Web, and in particular the semantic Web, as a privileged means of assisting the management of knowledge distributed within a firm or between firms. A knowledge server enables information search in a heterogeneous corporate memory, this search being guided intelligently by knowledge models or ontologies. It also enables the proactive dissemination of information by intelligent agents. We look further into the case of a memory materialized as a corporate semantic Web, i.e. as a set of resources (such as documents) semantically annotated by RDF statements relative to an ontology.

Corese Semantic Search Engine

Keywords: Knowledge Acquisition, Knowledge Engineering, Knowledge Management, Corporate Memory, Programming Environment Knowledge Server, Semantic Web, XML, RDF, CommonKADS, Conceptual Graph, Ontology, Information Retrieval.

Participants: Olivier Savoie, Olivier Corby [correspondant], Francis Avnaim.

The Corese ODL (Software Development Operation) aims at increasing the impact and diffusion of Corese. To this end, the ACACIA team wants to improve the quality of the Corese architecture (modularity, documentation, tests, evolution, ...), of the application programming interface (API) and of the overall usability of the software.

The ODL began in June 2002 and is planned to last two years.

We have been working on adapting Corese to the evolution of the RDF semantics, in particular in order to take into account literals with language tags and XML Schema datatypes.
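The distinction matters for query answering: two plain literals with different language tags are different, while two typed literals may denote the same value under their datatype. A minimal sketch of this semantics (the class and function names here are illustrative, not Corese's API):

```python
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"

# Minimal model of an RDF literal: a plain literal may carry a language
# tag, a typed literal carries an XML Schema datatype IRI.
@dataclass(frozen=True)
class Literal:
    lexical: str
    lang: Optional[str] = None       # e.g. "en", "fr"
    datatype: Optional[str] = None   # e.g. XSD + "integer"

def same_value(a: Literal, b: Literal) -> bool:
    """Typed integer literals compare by value; plain literals compare
    by lexical form and language tag."""
    if a.datatype == b.datatype == XSD + "integer":
        return int(a.lexical) == int(b.lexical)
    return (a.lexical, a.lang) == (b.lexical, b.lang)

g = [
    Literal("chat", lang="fr"),                  # distinct from "chat"@en
    Literal("03", datatype=XSD + "integer"),     # same value as "3"^^xsd:integer
]
```

For example, "03" and "3" typed as xsd:integer denote the same value, while "chat"@fr and "chat"@en remain distinct literals.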

We have developed:

The query processor has been extended:

An extension of the projection is proposed, with paths of length greater than one, bounded by an integer:

x R (3) y ::= x R y OR x R t R y OR x R t1 R t2 R y.
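The pattern above matches pairs linked by R through a path of at most three steps. A sketch of this bounded-path semantics over a set of triples (illustrative only, not Corese's actual implementation):

```python
# Enumerate the pairs (x, y) linked by relation `rel` through a path of
# length at most `max_len`, matching the x R (3) y pattern above.
def bounded_paths(triples, rel, max_len=3):
    """triples: iterable of (subject, relation, object)."""
    edges = {}
    for s, r, o in triples:
        if r == rel:
            edges.setdefault(s, set()).add(o)
    result = set()
    # Start from paths of length 1 (the direct edges).
    frontier = {(s, o) for s, objs in edges.items() for o in objs}
    for _ in range(max_len):
        result |= frontier
        # Extend every current path by one more edge.
        frontier = {(s, o2) for s, o in frontier for o2 in edges.get(o, ())}
    return result

g = [("a", "R", "b"), ("b", "R", "c"), ("c", "R", "d"), ("d", "R", "e")]
# ("a", "d") is reachable in 3 steps; ("a", "e") needs 4 and is excluded.
```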

After a query, it is possible to group results, as with the SQL ``group by'', and to count them:
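Such grouping is a post-processing of the variable bindings returned by the query. A minimal sketch, assuming each result row is a dictionary binding variables to values (the variable names below are invented for illustration):

```python
from collections import Counter

# Group query results by one variable and count the rows in each group,
# mirroring SQL's GROUP BY ... COUNT.
def group_count(results, var):
    return Counter(row[var] for row in results)

rows = [
    {"?person": "alice", "?dept": "kmp"},
    {"?person": "bob",   "?dept": "kmp"},
    {"?person": "carol", "?dept": "corese"},
]
# group_count(rows, "?dept") -> {"kmp": 2, "corese": 1}
```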

We have introduced some statements from OWL:

A first distribution version of Corese is available for download on the Corese web page.

A first prototype of a semantic Web server built on top of Corese is available.

Figure 1. Corese

Adaptation of Corese for KMP

For the KMP project, we have designed a generic model for computing equivalence classes among resources described in RDF. The KMP project needs to compute sets of equivalent competences, which are composite objects. We define an equivalence relation, called similar, whose extension is computed by inference rules encoding the conditions under which competences are equivalent: competences are equivalent if their components (action, environment, deliverable) belong to the same subtree of the ontology.

Then we compute the equivalent competences by projection. Finally, we compute the equivalence classes with a new operator, the connected join, which computes the connected components of the equivalence relation.
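The idea of the connected join can be sketched as follows: given the pairs found similar by the rules, the equivalence classes are the connected components of that relation. A union-find structure is one standard way to compute them (this is an illustration of the idea, not the operator's actual implementation in Corese):

```python
# Compute the connected components of a pairwise "similar" relation,
# i.e. the equivalence classes it generates, with a union-find structure.
def equivalence_classes(pairs):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)          # union the two components

    classes = {}
    for x in parent:
        classes.setdefault(find(x), set()).add(x)
    return list(classes.values())

# Hypothetical competences found pairwise similar by the rules.
similar = [("c1", "c2"), ("c2", "c3"), ("c4", "c5")]
# -> two classes: {c1, c2, c3} and {c4, c5}
```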

Ontology-Guided Information Retrieval

Keywords: Conceptual Graph, XML, RDF, Semantic Web, Information Retrieval.

Participants: Carolina Medina-Ramírez, Rose Dieng-Kuntz.

This work was carried out in the framework of Carolina Medina-Ramírez's PhD [28] [29] [20].

The goal of this thesis was to provide not only a translation process between languages with different semantic levels, but also an environment for managing, capitalizing on and distributing knowledge within an information retrieval framework.

The main contributions of this thesis concern three aspects: document semantic retrieval, documentary memory and conceptual graphs. In particular, for semantic document retrieval, we have proposed:

  1. A method to translate an ontology, annotations and queries represented in a pivot language into conceptual graphs, via an intermediate translation into RDF(S). This method is formalized by a regular translation grammar: Escrire -> RDF(S) -> CG.

  2. A base of inference rules for exploiting the tacit knowledge underlying the Medline scientific abstracts that compose the test corpus.

For representing, handling, diffusing and querying a documentary memory, we have proposed:

  1. A knowledge server called EsCorServer that retrieves documents from a documentary memory of gene interactions through a sequence of operations: query normalization, information filtering, application of inference rules and creation of virtual documents. We used CORESE for information retrieval.

  2. A method to create virtual documents in order to complete the results obtained for a query. This method exploits the query given by the user of EsCorServer and the annotations (possibly in various formats) available in the documentary memory.

For conceptual graphs, we have proposed:

  1. Algorithms to handle disjunction and negation in conceptual graph queries.
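To illustrate what disjunction and negation bring to query answering (this is an illustrative evaluator over ground facts, not the thesis algorithms, which operate on conceptual graphs by projection), consider queries built from atoms, OR, AND and NOT, the latter read as negation by absence:

```python
# Evaluate a query combining conjunction, disjunction and negation
# over a set of ground facts; facts and atoms are (relation, arg1, arg2).
def holds(facts, query):
    """query is ("and", q1, ...), ("or", q1, ...), ("not", q), or an atom."""
    op = query[0]
    if op == "and":
        return all(holds(facts, q) for q in query[1:])
    if op == "or":
        return any(holds(facts, q) for q in query[1:])
    if op == "not":
        return not holds(facts, query[1])   # negation as absence
    return query in facts                   # atomic query: membership test

# Hypothetical gene-interaction facts in the spirit of the test corpus.
facts = {("interacts", "geneA", "geneB"), ("inhibits", "geneB", "geneC")}
q = ("and", ("interacts", "geneA", "geneB"),
            ("or", ("activates", "geneB", "geneC"),
                   ("inhibits", "geneB", "geneC")),
            ("not", ("inhibits", "geneA", "geneC")))
# holds(facts, q) -> True: the disjunction is satisfied by its second branch.
```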

Software Agents for Web Mining: Application to Technological and Scientific Watch

Participants: Tuan-Dung Cao, Rose Dieng-Kuntz.

Keywords: Multi-Agent System, Corporate Memory, Semantic Web, Web Mining, Ontology, Semantic Annotations, Technological Watch.

This work was performed in the context of the thesis of Tuan-Dung Cao.

The huge amount of information now available online and accessible through the Web can be used for the technological and scientific watch of an enterprise. For knowledge management purposes in an organization or a community (especially for technological or strategic watch), Web mining techniques can be particularly useful for discovering relevant information on the Web.

The objective of the thesis is to exploit agent technology to develop a multi-agent system, whose agents are guided by ontologies, to collect, capture, filter, classify and structure Web contents coming from several information sources, in a scenario of assistance to technological watch at the CSTB (Center of Science and Technology for Building).

Initially, to delimit and define the problem, we studied the state of the art on multi-agent systems, Web mining and the semantic Web (XML, RDF(S), ontologies). This analysis enabled us to analyze the monitoring task currently carried out by the documentalists of the CSTB and to understand the current monitoring system and process: phases, actors, types of information sources... We tried to identify where ontologies and agents could intervene and to propose a description of this task, relying on Lesca's monitoring model. It will help us choose a relevant monitoring scenario and build the ontologies that will guide information search and extraction.

Then we will propose a multi-agent architecture to distribute the work of Web mining among several cooperating software agents, including "wrappers" on the information sources, in order to produce semantic annotation bases semi-automatically: we will extend our previous work [25] [26]. These annotations could then be exploited by agents for semantic search, as in [23].

Semantic Web Technologies for a Health Care Network

Participants: David Minier, Frédéric Corby, Rose Dieng-Kuntz, Olivier Corby, Phuc-Hiep Luong, Laurent Alamarguy.

This work was performed in the framework of the Ligne de Vie project (detailed in section 7.2) and of the internships of Frédéric Corby, Phuc-Hiep Luong and David Minier. The objective of the ACI Ligne de Vie project is to develop a knowledge management system for a health care network, so as to ensure continuity of care and to support the collaborative work of the actors of the network.

Our contribution consisted of:

Fuzzy Conceptual Graphs and Fuzzy RDF(S)

Participants: Phuc-Hiep Luong, Rose Dieng-Kuntz, Olivier Corby.

This work was carried out in the context of Phuc-Hiep Luong's DEPA final internship [33].

The current World Wide Web is showing its limitations with the explosion of information over the Internet. Many knowledge representation formalisms have been applied to exploit the contents of Web resources and to reason better on them. Conceptual Graphs (CGs) and the RDF(S) language have shown limitations in expressing imprecise and uncertain information. We therefore studied several extensions of these formalisms with the purpose of providing a more flexible expressivity: representing information with a degree of certainty, using fuzzy sets and fuzzy logic for reasoning. With the resulting Fuzzy Conceptual Graphs and Fuzzy RDF(S), obtained by combining fuzzy concepts and fuzzy sets, Web documents can be interpreted in a way closer to human expressions and arguments. Relying on this idea, we have studied Fuzzy Conceptual Graphs and proposed an extension of RDF(S) with certainty degrees. This study was realized in the framework of the ACI Ligne de Vie project (see section 7.2), which aims at the development of an online system for managing patients' healthcare documents.
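The core of such an extension can be sketched as follows: each RDF statement carries a certainty degree in [0, 1], and an inferred statement inherits the minimum degree of its premises, as in fuzzy conjunction. This is a minimal illustration of the idea with hypothetical medical annotations, not the proposed formalism itself:

```python
# RDF(S) triples extended with a certainty degree in [0, 1]; fuzzy
# conjunction combines premise degrees with min.
def conjunction(*degrees):
    return min(degrees)  # fuzzy AND

def infer(annotations, rule):
    """rule: (premises, conclusion). The inferred triple inherits the
    minimum certainty of its matched premises; None if a premise fails."""
    premises, conclusion = rule
    if all(p in annotations for p in premises):
        return conclusion, conjunction(*(annotations[p] for p in premises))
    return None

# Hypothetical annotations from a patient record, each with a degree.
annotations = {
    ("patient1", "hasSymptom", "fever"): 0.9,
    ("fever", "suggests", "infection"): 0.7,
}
rule = ([("patient1", "hasSymptom", "fever"),
         ("fever", "suggests", "infection")],
        ("patient1", "mayHave", "infection"))
# infer(annotations, rule) -> (("patient1", "mayHave", "infection"), 0.7)
```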