Inria / Raweb 2003
Project: TEXMEX

Section: Scientific Foundations


The work of the team calls for two kinds of competencies. To exploit the content of documents, one must first be able to access this content, i.e. to characterize or describe it. One must then be able to use this description to carry out a task related to these documents. In addition, both description and exploitation techniques must satisfy the needs of the user, and showing that they do is not trivial.

Finding a solution requires document description techniques based on text, image or video processing (sound and speech processing are studied by the METISS team, with which we closely collaborate). It is also necessary to exploit the correlation and complementarity between the different media, since they do not carry the same information and do not share the same limitations.

After this description stage, the descriptions must be exploited to satisfy the user's query. This second stage requires sorting, indexing and retrieval algorithms that deliver results that are both good and fast, two constraints that usually conflict.
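
The tension between result quality and speed can be illustrated with a small sketch. It is not the team's method: it assumes descriptors are short numeric vectors and compares an exhaustive scan with a hypothetical grid index that only inspects the cell containing the query. All function names and the `cell_size` parameter are invented for this illustration.

```python
import math
from collections import defaultdict


def distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_search(descriptors, query):
    """Exhaustive scan: always returns the index of the true nearest
    descriptor, but compares the query with every entry."""
    return min(range(len(descriptors)),
               key=lambda i: distance(descriptors[i], query))


def cell_of(descriptor, cell_size):
    """Grid cell (hypothetical index key) containing a descriptor."""
    return tuple(math.floor(x / cell_size) for x in descriptor)


def build_grid_index(descriptors, cell_size):
    """Group descriptor indices by the grid cell they fall into."""
    index = defaultdict(list)
    for i, d in enumerate(descriptors):
        index[cell_of(d, cell_size)].append(i)
    return index


def approximate_search(descriptors, index, query, cell_size):
    """Scan only the query's own cell: fewer comparisons, but a closer
    descriptor in a neighbouring cell is missed. Returns None if the
    cell is empty."""
    candidates = index.get(cell_of(query, cell_size), [])
    if not candidates:
        return None
    return min(candidates, key=lambda i: distance(descriptors[i], query))


# Toy data, assumed for illustration only.
descriptors = [(0.1, 0.1), (0.9, 0.9), (1.05, 0.5), (2.5, 2.5)]
index = build_grid_index(descriptors, cell_size=1.0)
query = (0.95, 0.5)

print(exact_search(descriptors, query))                    # 2
print(approximate_search(descriptors, index, query, 1.0))  # 1
```

Here the fast search compares the query with two descriptors instead of four, but it returns descriptor 1 rather than the true nearest neighbour, descriptor 2, which lies just across a cell boundary. Examining neighbouring cells as well would recover the right answer at the cost of more comparisons.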

These two aspects are not independent, and a solution that addresses only one of them cannot solve any real problem. Combining the two in the context of large databases raises many difficult but interesting questions, and their solution can only come from bringing together people and ideas from both sides.