SynLex: Extracting a Syntactical Lexicon from the LADL Tables

Maurice Gross' grammar lexicon [64] contains extremely rich and exhaustive information about the morphosyntactic and semantic properties of French syntactic functors (verbs, adjectives, nouns). Yet its use within natural language processing systems is still restricted.

The aim of our work is to translate this information into a format that is better suited to NLP systems and compatible with state-of-the-art practice in lexical data representation.

The lexicon should assign to each verb a set of subcategorisation frames. A frame is defined by a list of atoms (e.g., A0 V A1) representing the verb and its arguments, and by a list of atom/feature-structure pairs specifying the feature values associated with each of these atoms.
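The frame structure described above can be pictured with a minimal Python sketch. It is an illustration only, not the actual SynLex format: the class names `Frame` and `LexiconEntry`, the feature name `cat`, its value `NP`, and the example verb are all assumptions introduced here and are not taken from the LADL tables.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Frame:
    """One subcategorisation frame (hypothetical representation)."""
    # Ordered atoms for the verb and its arguments, e.g. ["A0", "V", "A1"].
    atoms: List[str]
    # Atom/feature-structure pairs: each atom maps to its feature values.
    features: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every feature structure must be attached to an atom of the frame.
        unknown = [atom for atom in self.features if atom not in self.atoms]
        if unknown:
            raise ValueError(f"features given for unknown atoms: {unknown}")

    def skeleton(self) -> str:
        """Return the atom list as a space-separated string."""
        return " ".join(self.atoms)


@dataclass
class LexiconEntry:
    """A verb together with its set of frames (hypothetical representation)."""
    verb: str
    frames: List[Frame] = field(default_factory=list)


# Illustrative entry; the verb and the feature values are made up for the
# example and do not come from the LADL tables.
entry = LexiconEntry(
    verb="donner",
    frames=[
        Frame(
            atoms=["A0", "V", "A1"],
            features={"A0": {"cat": "NP"}, "A1": {"cat": "NP"}},
        )
    ],
)

print(entry.verb, "->", entry.frames[0].skeleton())
```

Keeping the atom list separate from the feature map mirrors the two-part definition given in the text: the list fixes the order of the verb and its arguments, while the map carries the feature values for each atom.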

Two sets of subcategorisation lexicons, called LADL-SynLex and NLP-SynLex, were extracted from the LADL tables, a digitized version of Gross' grammar lexicon. The current SynLex contains the LADL-SynLex and NLP-SynLex lexicons for all the available LADL tables, which amounts to roughly 22,462 entries and 5,548 verbs. Work is underway to process the remaining available tables, which should yield a description of roughly 6,500 verbs.

SynLex is the result of joint work between LED, ATILF and Calligramme [59], [58]. It is currently being evaluated by the members of the LexSynt project, funded by the ILF (Institut de la Langue Française), and is partially available at .

