Team LeD


Section: Other Grants and Activities

National level

Acquiring and validating subcategorisation lexicons for French and Polish

Description: The project aims to develop and evaluate subcategorisation lexicons for French and Polish by combining the complementary strengths of the two partners. The French side has extensive expertise in the design, formatting and normalisation of linguistic resources, while the Polish side is already exploring statistical and corpus-based techniques for acquiring subcategorisation information. The lexicons will be produced, and then evaluated, using a blend of symbolic and statistical methods: the French side will provide specifications for the content and format of the lexicons, and the Polish side will provide expertise on the statistical acquisition of subcategorisation information from corpora.

Administrative context: Projet PAI Polonium

Period: start 2006-01-01 / end 2007-12-31

Web site:

Partner(s): LED/LORIA and Institute of Computer Science PAS, Warszawa, Poland

Contact: Claire Gardent

Lexsynt

Description: The project aims to develop an open-source syntactic and semantic lexicon for French that is suited to Natural Language Processing.

Administrative context: Projet ILF (Institut de la Langue Française)

Period: start 2005-01-01 / end 2008-12-31

Web site: http://lexsynt.inria.fr

Partner(s): LED/LORIA, Calligramme/LORIA, Atoll/INRIA Rocquencourt, Laboratoire de Linguistique Formelle/Paris 7, Laboratoire Parole et Langage/U. de Provence, Signes/INRIA Futurs, ATILF/UMR 7118, ERSS/UMR 5610, IGM/UMR 8049, LPL/UMR 6057, Lattice/UMR 8094, Modyco/UMR 7114, ATV/K. U. Leuven, OLST/Université de Montréal

Contact: Claire Gardent

MEDIA

Description: Within the framework of the French Technolangue project, several campaigns were carried out to evaluate different approaches to natural language processing on various topics (for example, parsing and natural language understanding). One of these campaigns, the EVALDA-MEDIA project, aimed to evaluate the ability of a dialogue system to understand spoken utterances produced in a real dialogue context. For a given task (here, hotel-booking transactions), a consortium of eight academic or industrial research laboratories transcribed 1200 dialogues collected using the Wizard of Oz protocol. The corpus was manually annotated and verified, and two evaluation protocols were defined. The context-independent evaluation consists in producing the semantic annotation of isolated utterances extracted from their dialogue history (but still interpreted within the transactional context). For the context-dependent evaluation, each utterance has to be interpreted within the dialogue context, and referential expressions have to be resolved. Neither evaluation explicitly addresses pragmatic understanding, such as speech act recognition or dialogue structuring.

For these evaluations, the LED team developed a Natural Language Interpretation system using a fully symbolic approach. Utterances are first parsed with a flexible TAG grammar, then interpreted against the task ontology. Finally, the internal semantic representation (a graph in the Multimodal Interface Language) is converted to the MEDIA semantic formalism. The output is then compared with the manual annotation.
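The final comparison step can be illustrated with a minimal sketch. It assumes, purely for illustration, that an annotation is a list of (concept, value) pairs and that agreement is measured with precision and recall; the concept names, the `score_annotation` helper and the scoring scheme are hypothetical and are not taken from the MEDIA annotation scheme or its official metric.

```python
from collections import Counter


def score_annotation(hypothesis, reference):
    """Compare two lists of (concept, value) pairs, ignoring their order.

    Hypothetical helper: returns (precision, recall) of the system output
    against the manual annotation.
    """
    hyp, ref = Counter(hypothesis), Counter(reference)
    # Multiset intersection counts the pairs present in both annotations.
    matched = sum((hyp & ref).values())
    precision = matched / sum(hyp.values()) if hyp else 0.0
    recall = matched / sum(ref.values()) if ref else 0.0
    return precision, recall


# Invented example for a hotel-booking utterance (labels are assumptions).
reference = [("command", "reservation"), ("room-count", "2"), ("city", "Paris")]
hypothesis = [("command", "reservation"), ("room-count", "1"), ("city", "Paris")]

precision, recall = score_annotation(hypothesis, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this invented example, two of the three pairs agree, so both scores are two thirds.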

Our participation in the MEDIA campaign enabled us to validate the relevance of our fully symbolic approach for the design of robust dialogue systems.

Administrative context: Programme Technolangue, Campagne EVALDA

Period: start 2002-12-04 / end 2006-12-04

Contact: Matthieu Quignard

Mosaique

Description: The project has three aims: to develop a high-level formalism for describing the syntax and semantics of natural language; to provide compilers that translate this formalism into the several operational formalisms required by the various existing theories of syntax; and to validate the resulting grammars on hand-built and real-text test suites.

Administrative context: Action de Recherche Concertée INRIA

Period: start 2006-01-01 / end 2007-12-31

Web site: http://mosaique.labri.fr

Partner(s): LED/LORIA, Calligramme/LORIA, Atoll/INRIA Rocquencourt, Laboratoire de Linguistique Formelle/Paris 7, Laboratoire Parole et Langage/U. de Provence, Lattice/Paris 7, Modyco/Paris X, Signes/INRIA Futurs, TALN/Nantes

Contact: Claire Gardent

