Section: Contracts and Grants with Industry
Grants with Industry
EIFFEL: RNTL Project (2006-2009)
Keywords : tourism, Semantic Web, ontology, web usage mining, personalisation.
Participants : Abdouroihamane Anli, Marie-Aude Aufaure, Zeina Jrad, Yves Lechevallier [ resp. ] , Bernard Senach, Brigitte Trousse.
The EIFFEL project related to Semantic Web and e-Tourism was labelled in 2006 by the RNTL program and started this year. Industrial partners are Mondeca and Antidot (leader) and academic partners are LIRMM, University of Paris X (Nanterre) and INRIA.
The main goal of the Eiffel project is to provide users with an intelligent and multilingual semantic search engine dedicated to the tourism domain. This solution should allow tourism operators and local territories to highlight their resources; end users will then use a specialised search tool allowing them to organise their trip on the basis of contextualised, specialised, organised and filtered information. Queries and results will be guided by user profiles extracted from usage analysis. These profiles will facilitate access to distributed and highly heterogeneous data. In this project, AxIS is in charge of the sub-package SP8: it will define new paradigms dedicated to knowledge searching and visualisation, and will extract and exploit user models and profiles from web logs.
A second deliverable (June 2008) concerns the preparation of web log data for the pattern discovery task. Our approach consists in using Web Usage Mining (WUM) techniques from AxisLogMiner to extract information from the log files of tourism websites.
As we mentioned previously, in the Eiffel project we acquire our data from many sources. However, data derived from server web logs are the most relevant. A log file is a plain text file arranged in a particular format usually containing the host name (user IP), the date, the requested resource (page), the status (success, failure, error, etc.), the referrer and the user agent (browser identification).
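As an illustration of the fields listed above, here is a minimal sketch of how one log entry could be turned into a structured record. It assumes the widely used Apache/NCSA "combined" log format; the actual format of the tourism websites' logs is not specified in this report, and the function name `parse_log_line` and the sample line are hypothetical.

```python
import re
from datetime import datetime

# Assumption: entries follow the Apache/NCSA "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ '                     # host (user IP), identity, user id
    r'\[(?P<date>[^\]]+)\] '                      # date of the request
    r'"(?P<method>\S+) (?P<resource>\S+)[^"]*" '  # requested resource (page)
    r'(?P<status>\d{3}) \S+ '                     # status code, response size
    r'"(?P<referrer>[^"]*)" '                     # referrer
    r'"(?P<agent>[^"]*)"'                         # user agent
)

def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["date"] = datetime.strptime(record["date"], "%d/%b/%Y:%H:%M:%S %z")
    return record

# Made-up sample entry, for illustration only.
sample = ('192.0.2.10 - - [12/Jun/2008:10:15:32 +0200] '
          '"GET /hotels/nice.html HTTP/1.1" 200 5120 '
          '"http://example.org/search?q=nice" "Mozilla/5.0"')
record = parse_log_line(sample)
```

Lines that do not match the assumed format return `None`, so they can be counted or inspected separately rather than silently dropped.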
The two main objectives of web log processing are (i) structuring the data and (ii) improving the quality of the data. Web log files are not structured in a way that data mining tools can use directly, and they contain many noisy entries. For example, robots introduce a lot of noise in the analysis of user behaviour, so it is important to identify robot requests.
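The cleaning objective can be sketched as follows. This is not the AxisLogMiner procedure, which is not described here, but a minimal example built on two common heuristics that we assume for illustration: a visitor is treated as a robot if its user agent contains a typical crawler keyword or if it requested `/robots.txt`. The keyword list, the record layout (dicts with `host`, `resource` and `agent` keys) and the function names are all hypothetical.

```python
# Assumption: keywords that often appear in crawler user agents.
ROBOT_AGENT_HINTS = ("bot", "crawler", "spider", "slurp")

def is_robot_agent(agent):
    """True if the user agent string contains a typical crawler keyword."""
    agent = agent.lower()
    return any(hint in agent for hint in ROBOT_AGENT_HINTS)

def remove_robot_requests(records):
    """Drop every request issued by a (host, agent) pair identified as a robot."""
    robots = {
        (r["host"], r["agent"])
        for r in records
        if r["resource"] == "/robots.txt" or is_robot_agent(r["agent"])
    }
    return [r for r in records if (r["host"], r["agent"]) not in robots]

# Made-up structured records, for illustration only.
records = [
    {"host": "192.0.2.10", "resource": "/hotels/nice.html", "agent": "Mozilla/5.0"},
    {"host": "192.0.2.20", "resource": "/robots.txt", "agent": "ExampleFetcher/1.0"},
    {"host": "192.0.2.20", "resource": "/index.html", "agent": "ExampleFetcher/1.0"},
    {"host": "192.0.2.30", "resource": "/index.html", "agent": "ExampleBot/2.1"},
]
cleaned = remove_robot_requests(records)
```

Identifying robots per (host, agent) pair rather than per host alone avoids discarding human visitors who share an IP address with a crawler, for instance behind a proxy.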
INTERMED: ANR Techlog Project (2008-2010)
Participants : Nicolas Faure, Celine Fiot, Cristina Isai, Julie Marlier, Bernard Senach, Elena Smirnova, Brigitte Trousse [ resp. ] .
The Intermed project, submitted in response to the ANR TechLog call for proposals, was accepted in 2007. The Intermed kick-off meeting, originally planned for December 2007, took place in April 2008. Academic partners are Cemagref (G-EAU and TETIS), LIRMM and CEPEL; industrial partners are SCRIPTAL, SIRENA, Normind and PIKKO. The aim of the Intermed project is to design and implement a set of tools fitting the requirements of users in charge of territory planning.
The goal is to use appropriate technologies to establish a functional link between citizens and local authorities. The technologies we are looking for will be progressively adapted to deal with human factors and the constraints of the "field". The proposed experimental approach will rely on several iterations and on the active participation of the people involved in the discussions.
The project follows a participatory design methodology in which professionals and end users are continuously involved. Several meetings have been held with them: user requirements meetings and so-called Evocation meetings, in which solution options are introduced and discussed according to previously identified user needs.
A first experiment has been conducted with scholars and two experiments are currently being set up. In the first experiment, scholars had to take part in a role play:
- in a first step, they talked with different professionals to understand the issues of a specific territory planning action;
- then they debated the actions in a role play, half of the groups using the Intermed tools.
The experiment aims at understanding how a new tool can suit people and what its effects on the debate can be (argument quality and quantity, improvement of decisions, speed of decision).
AxIS participated in several field meetings (Narbonne, Thau, Camargue) and in other Intermed meetings.
The technical contribution of the AxIS team to the Intermed project in 2008 concerns usage analysis in the context of collaborative and debating tools.
For the experiments of this first year, the Intermed Web platform was designed as a composition of several modules using different technologies and databases: one server stores user profiles and document information, and another stores user actions and interactions.
Usage analysis on the global platform therefore requires integrating all the data sources into a single analysis database, with two requirements: a) to process the data generated by the use of the application and b) to understand what end users, policy makers and other partners would like to know about the participants in the collaborative work.
The first step of this work was to study the technologies designed for the debating interface and the collaborative work. Since we need to be able to use every kind of data generated by the platform (click stream on the standard Web interface, actions calling the embedded Java application), we suggested a list of tools to be integrated into the different modules in order to help track user actions and interactions on the Web platform.
The second step was to design a preprocessing database that unifies and stores all usage data coming from the different data sources, together with preliminary data on user profiles (location, age, education, ...) and debate information (which documents are available, who created them, what should be modified, etc.). The model of this database should also take into account the analyses to be performed: categorization and profiling of users and usage, but also social interaction, social network analysis and interaction in the debate.
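To make the idea of a unified analysis database more concrete, here is a minimal sketch using SQLite. The actual Intermed data model is not given in this report: the table and column names, the `source` values and the sample rows below are all hypothetical and only show how click-stream events and embedded application actions could be stored side by side and joined with profile data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY, location TEXT, age INTEGER, education TEXT
);
CREATE TABLE documents (
    doc_id INTEGER PRIMARY KEY, title TEXT,
    creator_id INTEGER REFERENCES users(user_id)
);
-- One row per user action, whatever module produced it.
CREATE TABLE events (
    event_id INTEGER PRIMARY KEY,
    user_id INTEGER REFERENCES users(user_id),
    source TEXT,          -- hypothetical values: 'web' or 'java_app'
    action TEXT,
    doc_id INTEGER REFERENCES documents(doc_id),
    occurred_at TEXT
);
""")

# Made-up sample data, for illustration only.
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)",
                 [(1, "town_a", 34, "master"), (2, "town_b", 52, "secondary")])
conn.execute("INSERT INTO documents VALUES (1, 'planning map', 1)")
conn.executemany(
    "INSERT INTO events (user_id, source, action, doc_id, occurred_at) "
    "VALUES (?, ?, ?, ?, ?)",
    [(1, "web", "view_document", 1, "2008-05-01T10:00:00"),
     (1, "java_app", "add_argument", 1, "2008-05-01T10:05:00"),
     (2, "web", "view_document", 1, "2008-05-01T10:07:00"),
     (2, "web", "view_document", 1, "2008-05-01T10:09:00")])

# Usage per location and per data source, joining profile and usage data.
rows = conn.execute("""
    SELECT u.location, e.source, COUNT(*)
    FROM events AS e JOIN users AS u ON u.user_id = e.user_id
    GROUP BY u.location, e.source
    ORDER BY u.location, e.source
""").fetchall()
```

Keeping a single `events` table with a `source` column is one possible design choice: it lets profiling queries treat all actions uniformly while still distinguishing where each one was recorded.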
The implementation of this model is currently in progress and should also integrate some dynamic parts that will be built depending on the debate or collaborative work to be analyzed. Each contribution to a debate can be structured by metadata specific to the section of the work. These metadata can be used to analyze interactions more precisely in a specific context, so we want to integrate them into the analysis database. Imported data are RDF annotations whose schemata will be extended. The RDF schema will then be used to dynamically create part of the database.
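The dynamic creation of tables from a schema could look like the following sketch. It does not parse RDF: the list `SCHEMA_PROPERTIES` is a hypothetical stand-in for the (class, property, type) information that would be read from the RDF schema, and the class names, property names and type mapping are assumptions made for illustration.

```python
import sqlite3

# Hypothetical stand-in for information extracted from an RDF schema:
# (class name, property name, SQL column type).
SCHEMA_PROPERTIES = [
    ("Argument", "author", "TEXT"),
    ("Argument", "stance", "TEXT"),
    ("Proposal", "title", "TEXT"),
    ("Proposal", "votes", "INTEGER"),
]
ALLOWED_TYPES = {"TEXT", "INTEGER", "REAL"}

def build_ddl(properties):
    """Group properties by class and return one CREATE TABLE statement per class."""
    tables = {}
    for cls, prop, sql_type in properties:
        # Names end up in SQL text, so accept only plain identifiers and known types.
        if not (cls.isidentifier() and prop.isidentifier()):
            raise ValueError(f"unsafe name: {cls}.{prop}")
        if sql_type not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type: {sql_type}")
        tables.setdefault(cls, []).append((prop, sql_type))
    statements = []
    for cls, columns in tables.items():
        cols = ", ".join(f'"{name}" {sql_type}' for name, sql_type in columns)
        statements.append(
            f'CREATE TABLE "{cls.lower()}" (id INTEGER PRIMARY KEY, {cols})'
        )
    return statements

ddl = build_ddl(SCHEMA_PROPERTIES)
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
argument_columns = [row[1] for row in conn.execute('PRAGMA table_info("argument")')]
```

Extending the schema with a new property then only requires adding an entry to the property list and regenerating the corresponding table.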
We contributed to two deliverables [65] and [62].
MIDAS: ANR MDCA Project (2008-2010)
Participants : Yves Lechevallier, Alice Marascu, Florent Masseglia [ resp. ] , Brigitte Trousse, Chongsheng Zhang.
The MIDAS project “Mining Data Streams”, funded by ANR, started in January 2008 and will end in December 2010. Partners are Ceregmia, EDF, France Telecom R&D, LIRMM, Telecom ParisTech and INRIA.
The MIDAS project aims at studying, developing and demonstrating new methods for summarizing data streams. It tackles the following scientific challenges related to the construction of summaries:
- Summaries are built from infinite streams but must have a fixed or slowly growing size;
- The construction of summaries must be incremental (done on the fly);
- The amount of CPU used to process each element of the streams must be compatible with the arrival rate of the elements;
- The summaries must cover the whole stream and make it possible to build summaries of any past part of the history of a stream.
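As a simple illustration of the first three constraints, the sketch below uses reservoir sampling (Algorithm R), a classical technique chosen here only as an example and not presented as a method developed in MIDAS. It keeps a fixed-size uniform sample of a stream, is updated incrementally, and performs constant work per element. It does not address the last constraint, since it cannot rebuild a summary of an arbitrary past period. The class name `ReservoirSample` and its parameters are hypothetical.

```python
import random

class ReservoirSample:
    """Fixed-size uniform random sample of a stream, updated on the fly."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity      # maximum size of the summary
        self.items = []               # current summary
        self.seen = 0                 # number of stream elements processed
        self._rng = random.Random(seed)

    def add(self, element):
        """Process one stream element in constant time."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(element)
        else:
            # Keep the new element with probability capacity / seen.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = element

summary = ReservoirSample(capacity=50, seed=0)
for value in range(10_000):           # stands in for an unbounded stream
    summary.add(value)
```

After any number of elements, `summary.items` holds at most `capacity` elements, each stream element having the same probability of being present.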
In 2008, we contributed to a deliverable on related work written by all partners [63]. AxIS was responsible for Chapters 2 and 3, and was involved in Chapter 8.