Inria / Raweb 2004
Project: ACACIA

Section: New Results

Keywords : Knowledge Acquisition, Knowledge Engineering, Knowledge Management, Corporate Memory, Knowledge Server, Semantic Web, XML, RDF, OWL, Conceptual Graph, Ontology, Information Retrieval.

Information Retrieval in a Corporate Semantic Web

We study the problems involved in disseminating knowledge through a knowledge server over an intranet or the Internet. We consider the Web, and in particular the semantic Web, as a privileged means of supporting the management of knowledge distributed within a firm or between firms. A knowledge server supports the search for information in a heterogeneous corporate memory, with the search guided by knowledge models or ontologies. It also allows intelligent agents to disseminate information proactively. We focus on the case of a memory materialized as a corporate semantic Web, i.e. as resources (such as documents) semantically annotated by RDF statements relating to an ontology.

Corese Semantic Search Engine

Keywords : Knowledge Acquisition, Knowledge Engineering, Knowledge Management, Corporate Memory, Knowledge Server, Semantic Web, XML, RDF, Conceptual Graph, Ontology, Information Retrieval.

Participants : Olivier Corby [responsible], Olivier Savoie.

Semantic distances and clustering

Keywords : ontologies, semantic distance, approximate search.

Participant : Fabien Gandon.

Most of the conceptual structures used in knowledge-based systems rely essentially on a logical formalization of knowledge. However, focusing on logical implications leads knowledge-based systems to ignore some characteristics of people's conceptual structures. One thing that graph-based formalisms underline is an isomorphism between graph distances or geometric distances in the representation and the natural conceptual distances between the notions they represent and articulate. In other words, two notions that are geometrically close in the graphical representation are supposed to be intuitively close in the mind of the modelers. This closeness can be exploited, for instance, to improve information retrieval by relaxing query constraints to the closest notions. To do so, we are studying algorithms that simulate conceptual distances using the ontological tree, and we are applying them in particular to approximate search and result clustering.
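
The idea of a distance computed on the ontological tree can be illustrated with a minimal sketch. The report does not give the formula under study, so everything below is an assumption made for illustration: the toy ontology, the function names, and the choice of an edge length that halves at each level, so that two deep sibling concepts count as closer than two shallow ones.

```python
from typing import Dict, List, Optional

# Hypothetical toy ontology: each concept maps to its parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "Thing": None,
    "Document": "Thing",
    "Person": "Thing",
    "Report": "Document",
    "Article": "Document",
    "TechnicalReport": "Report",
}


def path_to_root(concept: str) -> List[str]:
    """Return the concept followed by all its ancestors, up to the root."""
    path = [concept]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def edge_length(child: str) -> float:
    """Length of the edge from a concept to its parent.

    Assumption: an edge leaving a parent at depth d has length 1 / 2**d.
    """
    parent_depth = len(path_to_root(child)) - 2
    return 0.5 ** parent_depth


def distance(a: str, b: str) -> float:
    """Sum of edge lengths on the path from a to b through their closest common ancestor."""
    ancestors_of_b = set(path_to_root(b))
    common = next(c for c in path_to_root(a) if c in ancestors_of_b)

    def climb(concept: str) -> float:
        total = 0.0
        while concept != common:
            total += edge_length(concept)
            concept = PARENT[concept]
        return total

    return climb(a) + climb(b)


def closest(concept: str, k: int) -> List[str]:
    """The k concepts nearest to the given one: candidates for constraint relaxation."""
    others = [c for c in PARENT if c != concept]
    return sorted(others, key=lambda c: distance(concept, c))[:k]


print(distance("Report", "Article"))  # siblings under Document
print(distance("Report", "Person"))   # related only through the root
print(closest("Report", 2))
```

In this sketch, a query on Report that returns too few results could be relaxed to the concepts returned by closest, and the same distance could serve as the metric for grouping results into clusters.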

Visualization surrogates for conceptual structures

Participant : Fabien Gandon.

Here we address a problem faced in many projects: the generation of semiotic representations for conceptual structures such as annotations and query results on the semantic Web. Drawing on the parallel between the patterns of such surrogates and the notion of identity conditions, we proposed and explained a mechanism that exploits the semantic Web frameworks to automate the generation of templates for these surrogates. We showed how these templates improve representation, for instance when viewing the results of a query. The approach focused on generating templates that specify the properties to include in a surrogate, regardless of the way it is rendered (text, graphics, speech, etc.).
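
The separation between a template (which properties to include) and its rendering can be sketched as follows. This is only an illustration under stated assumptions: the class names, property names and templates are hypothetical, and the templates are written by hand here, whereas the work described above derives them automatically.

```python
from typing import Dict, List

# Hypothetical templates: for each class, the properties to include in a surrogate.
# Assumption: written by hand for this sketch, not derived from identity conditions.
TEMPLATES: Dict[str, List[str]] = {
    "Person": ["firstName", "familyName"],
    "Document": ["title", "author"],
}


def build_surrogate(resource_type: str, properties: Dict[str, str]) -> Dict[str, str]:
    """Keep only the properties listed in the template of the resource's class."""
    template = TEMPLATES.get(resource_type, [])
    return {name: properties[name] for name in template if name in properties}


def render_as_text(surrogate: Dict[str, str]) -> str:
    """One possible rendering; a graphical or spoken rendering would reuse the same surrogate."""
    return ", ".join(f"{name}: {value}" for name, value in surrogate.items())


# Hypothetical query result describing a document.
result = {
    "title": "Annual activity report",
    "author": "J. Doe",
    "fileSize": "120 KB",
    "checksum": "a1b2c3",
}
surrogate = build_surrogate("Document", result)
print(render_as_text(surrogate))
```

Because the surrogate is a plain selection of properties, the rendering function can be replaced without touching the template.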

Our goal was to detect as many potentially interesting properties as possible; fine tuning can then take place. Our approach and implementation relied on rules because the Corese platform of the ACACIA team is based on conceptual graphs and graph rules. In other platforms offering other formalization means or insights into the ontology engineering process, sources other than rules could be exploited to derive surrogate properties from identity conditions. Our point here is that the semantic Web will have to be dynamic and will use, in addition to the conceptual structures to be communicated to the users: the users' profiles, the context and history of interactions, semiotic modeling primitives added to our meta-model, signs linked to the primitives of our ontologies, and logics of semiotics and surrogate generation.

Software Agents for Web Mining: Application to Technological and Scientific Watch

Keywords : Multi agent system, Corporate memory, semantic web, web mining, ontology, semantic annotations, technological watch, technological monitoring.

Participants : Tuan-Dung Cao, Rose Dieng-Kuntz.

This work was performed in the context of the thesis of Tuan-Dung Cao.

Technological Watch or Technology Monitoring is now recognized as a crucial activity for achieving and maintaining competitive positions in a rapidly evolving business environment. It serves to identify and assess technological advances critical to the company's competitive position, and to detect changes and discontinuities in existing technologies. The rise of the Internet has made a great deal of information available online that is potentially useful for the technological and scientific watch of an enterprise. Within the knowledge management of an organization or a community, Web mining can be particularly useful when it is carried out by a multi-agent system to discover relevant information on the Web for the purposes of technological or strategic watch.

The objective of the thesis is to use agent technology to develop a multi-agent system whose agents are guided by ontologies and that collects, captures, filters, classifies and structures Web content coming from several information sources, in a scenario supporting technology watch at the CSTB (French Scientific and Technical Center for Building).

First, we analysed the monitoring task as carried out at CSTB for the field considered (construction and building), in order to choose a relevant monitoring scenario and to build an ontology that will guide the search for and extraction of information. On the one hand, this ontology inherits vocabulary from the O'CoMMA ontology, which was developed for the CoMMA European IST project (2000-2001). On the other hand, we added concepts and relations concerning not only the field of construction but also the actors, tasks and information sources in the technological monitoring process.

After identifying the important roles of the ontology in each phase of the technological monitoring process, we proposed an ontology-based approach for building an agent-based information system supporting technology monitoring. One of the most important tasks in this system is to find useful resources on the Web and then annotate them using the ontology, so that users can retrieve them easily through the semantic search engine Corese.

To do so, we proposed an algorithm that uses the ontology to search the Web with Google and then automatically generates RDF annotations from the Google results [28]. The algorithm has been implemented and is currently being tested. As further work, we will continue to test the algorithm and extend it to improve the results. In parallel, we will design and implement a sub-society of annotator agents encapsulating this algorithm, working in cooperation with other agents dedicated to other tasks in the technological monitoring system.
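
The general shape of such a pipeline (ontology terms turned into a query, search results turned into RDF annotations) can be sketched as follows. This is not the algorithm of [28], whose details are not given here; it is a simplified illustration under stated assumptions. The namespace, the concept labels, the WebResource class and the concerns property are all hypothetical, the search results are hard-coded rather than fetched from a search engine, and the matching is naive substring matching. The annotations are serialized as N-Triples lines.

```python
from typing import Dict, List

# Hypothetical ontology namespace and vocabulary (assumptions for this sketch).
ONTO = "http://example.org/watch#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical concepts with the labels used to query for them and to recognize them.
CONCEPT_LABELS: Dict[str, List[str]] = {
    "Insulation": ["insulation", "thermal insulation"],
    "Concrete": ["concrete"],
}


def build_query(concept: str) -> str:
    """Turn the labels of a concept into a keyword query string."""
    return " OR ".join(f'"{label}"' for label in CONCEPT_LABELS[concept])


def annotate(results: List[Dict[str, str]]) -> List[str]:
    """Emit N-Triples lines linking each result page to the concepts it mentions."""
    triples: List[str] = []
    for result in results:
        text = f'{result["title"]} {result["snippet"]}'.lower()
        matched = [
            concept
            for concept, labels in CONCEPT_LABELS.items()
            if any(label in text for label in labels)
        ]
        if not matched:
            continue  # the page mentions no known concept: no annotation is produced
        subject = f'<{result["url"]}>'
        triples.append(f"{subject} <{RDF_TYPE}> <{ONTO}WebResource> .")
        for concept in matched:
            triples.append(f"{subject} <{ONTO}concerns> <{ONTO}{concept}> .")
    return triples


# Hard-coded stand-in for search engine results (hypothetical data).
SAMPLE_RESULTS = [
    {
        "url": "http://example.org/a",
        "title": "New insulation panels",
        "snippet": "A review of thermal insulation for buildings.",
    },
    {
        "url": "http://example.org/b",
        "title": "Weekend recipes",
        "snippet": "Nothing about construction here.",
    },
]

print(build_query("Insulation"))
for line in annotate(SAMPLE_RESULTS):
    print(line)
```

Annotations of this kind could then be loaded into a semantic search engine and queried by concept rather than by keyword.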

