Section: Software
Alpage's linguistic workbench, including SxPipe
Participants: Benoît Sagot [contact], Pierre Boullier, Éric Villemonte de La Clergerie.
See also the web page http://lingwb.gforge.inria.fr/.
Alpage's linguistic workbench is a set of packages for corpus processing and parsing. Among these packages, SxPipe is of particular importance.
SxPipe, now in version 2 [107], is a modular and customizable processing chain that applies a cascade of surface processing steps to raw corpora. It is used:
- as a preliminary step before Alpage's parsers (FRMG, SxLfg);
- for surface processing (named entity recognition, text normalization, etc.).
Developed for French and for other languages, SxPipe 2 includes, among other components, several modules for named entity recognition in raw text, a sentence segmenter and tokenizer, a spelling corrector and compound word recognizer, and an original context-free pattern recognizer used by several specialized grammars (numbers, impersonal constructions, etc.).
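
To make the idea of a modular, customizable cascade concrete, the following minimal Python sketch chains three toy surface processing stages. It is an illustration only and does not reflect the actual implementation or interface of the package: the function names (`segment_sentences`, `tag_numbers`, `tokenize`, `run_pipeline`), the `<NUM:...>` marker, and the regular expressions are all assumptions made for this example.

```python
import re
from typing import Callable, List

# Hypothetical stage type: each stage maps a list of text units
# to a new list of text units, so stages can be freely chained.
Stage = Callable[[List[str]], List[str]]


def segment_sentences(units: List[str]) -> List[str]:
    """Toy sentence segmenter: split after '.', '!' or '?' followed by whitespace."""
    out: List[str] = []
    for unit in units:
        out.extend(s for s in re.split(r"(?<=[.!?])\s+", unit.strip()) if s)
    return out


def tag_numbers(units: List[str]) -> List[str]:
    """Toy pattern recognizer: wrap standalone digit sequences in a <NUM:...> marker."""
    return [re.sub(r"\b\d+\b", r"<NUM:\g<0>>", u) for u in units]


def tokenize(units: List[str]) -> List[str]:
    """Toy tokenizer: separate punctuation from words, keeping <NUM:...> markers whole."""
    token = re.compile(r"<NUM:\d+>|\w+|[^\w\s]")
    return [" ".join(token.findall(u)) for u in units]


def run_pipeline(text: str, stages: List[Stage]) -> List[str]:
    """Apply the stages in order, feeding each stage the output of the previous one."""
    units = [text]
    for stage in stages:
        units = stage(units)
    return units


if __name__ == "__main__":
    raw = "The parser was run on 3 corpora. It took 42 minutes!"
    for sentence in run_pipeline(raw, [segment_sentences, tag_numbers, tokenize]):
        print(sentence)
    # The parser was run on <NUM:3> corpora .
    # It took <NUM:42> minutes !
```

Because every stage shares the same signature, a stage can be removed, reordered, or replaced without touching the others, which is the sense in which such a chain is customizable: passing only `[segment_sentences, tokenize]` yields segmented and tokenized text with no number marking.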