Team mostrare


Section: Contracts and Grants with Industry

RNTL ATASH (2006-2009)

Participants: Rémi Gilleron [correspondent], Aurélien Lemay, Joachim Niehren, Marc Tommasi.

Atash is a French industrial project supported by the “Agence Nationale de la Recherche (ANR)”. It is a collaboration with the Xerox Research Center Europe (XRCE) in Grenoble and the LIP6 laboratory. The objective is to design learning algorithms for tree transformations and to implement them for the integration of documents (PDF, HTML, DOC) into XML databases according to a target DTD. The project began in 2006. The TreeCRF library and the R2S2 software were developed in the project. TreeCRF, developed by Mostrare, implements conditional random fields for XML data. It has been used in the R2S2 web application, a machine learning application that builds tree transformations from HTML to RSS (an XML dialect), with the aim of providing personalized feeds from web sites. TreeCRF has also been used to query the hidden Web in a collaboration with GEMO.

RNTL Webcontent (2006-2009)

Participants: Rémi Gilleron, Marc Tommasi, Fabien Torre [correspondent].

Webcontent is a French industrial project supported by the “Agence Nationale de la Recherche (ANR)”. It involves academic partners and companies. The objective is to develop a platform for Web document processing and the semantic Web. The main goal of Mostrare was to create an extensible Web Service framework for Web information extraction. This software, called Miele, was integrated in the Webcontent platform in January. Miele is mainly used to create wrappers for table extraction from Web documents. The deliverable includes a set of user interface tools (Web browser plugins) and implementations of wrapper inference algorithms developed in the Mostrare team: Squirrel, which contains methods based on query induction using grammatical inference, and paf, which contains methods based on supervised classification algorithms.

Cifre Xerox (2009-2011)

Participants: Jean-Baptiste Faddoul, Rémi Gilleron, Fabien Torre [correspondent].

R. Gilleron and F. Torre started supervising the PhD thesis (Cifre) of Jean-Baptiste Faddoul, together with B. Chidlovski from Xerox's European Research Center (XRCE).

