Team Alpage

Section: New Results

Finite-State Multi-Tape Transducers

Participant: François Barthélemy.

François Barthélemy has been working on the definition of finite-state multi-tape transducers using a typed Cartesian product. Each tape is identified by a unique name, and the Cartesian product is an operator that combines several components, each of which is either a language on a given tape or an embedded Cartesian product over several tapes. The components of a Cartesian product must be independent, that is, they must not share any tape. The types are implemented on the tapes using auxiliary symbols, which are used to obtain closure of the transducers under intersection (and also under difference and complementation).
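The independence constraint on Cartesian product components can be illustrated with a small sketch. It is not Karamel code: it models a multi-tape relation as a finite set of tuples over named tapes rather than as a finite-state machine, and it leaves out the auxiliary type symbols entirely. The names `Relation`, `language` and `cartesian_product` are hypothetical and chosen for this illustration only.

```python
from itertools import product


class Relation:
    """A finite multi-tape relation: a set of tuples over named tapes (illustrative only)."""

    def __init__(self, tapes, tuples):
        self.tapes = tuple(tapes)
        # Each tape is identified by a unique name.
        if len(set(self.tapes)) != len(self.tapes):
            raise ValueError("tape names must be unique")
        self.tuples = frozenset(tuple(t) for t in tuples)
        for t in self.tuples:
            if len(t) != len(self.tapes):
                raise ValueError("each tuple needs one string per tape")


def language(tape, words):
    """A language on a single named tape."""
    return Relation([tape], [(w,) for w in words])


def cartesian_product(*components):
    """Combine components that are pairwise independent (share no tape)."""
    seen = set()
    for component in components:
        shared = seen & set(component.tapes)
        if shared:
            raise ValueError(f"components share tapes: {sorted(shared)}")
        seen |= set(component.tapes)
    tapes = [name for component in components for name in component.tapes]
    # Concatenate one tuple from each component, for every combination.
    tuples = [
        sum(combo, ())
        for combo in product(*(component.tuples for component in components))
    ]
    return Relation(tapes, tuples)
```

A component may itself be a Cartesian product, which mirrors the embedding described above: `cartesian_product(cartesian_product(a, b), c)` is accepted as long as the tapes of `a`, `b` and `c` are all distinct, while combining two components that both use the same tape name raises an error.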

François Barthélemy developed a system called Karamel, devoted to the development and execution of finite-state multi-tape transducers. The system comprises a language and an Integrated Development Environment (IDE). The language provides three ways of defining finite-state machines:

The IDE is written in HTML/CSS/JavaScript. It provides basic editing functions, some test facilities and an interface for executing the descriptions. Karamel uses FSM, a C++ library from AT&T that efficiently implements finite-state algorithms. Karamel also implements an original unit test framework inspired by the JUnit framework for Java [17]. Tests of finite-state transducers are performed using assertions, namely evaluable boolean predicates. Tests may involve auxiliary finite-state machines called fixtures (e.g., a given input to a transducer and the corresponding expected output are fixtures). At the moment, Karamel is still a prototype. We plan to complete its development and begin to distribute it in the near future.
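The fixture-and-assertion style of testing can be sketched with Python's standard `unittest` module, which follows the same xUnit conventions as JUnit. This is an analogy under stated assumptions, not Karamel's test language: the transducer is modelled as a plain set of (input, output) pairs, the fixtures are ordinary Python values rather than finite-state machines, and the names `apply`, `PluralTest` and `run_tests` are hypothetical.

```python
import unittest


def apply(transducer, word):
    """Return the set of outputs that the toy transducer relates to `word`."""
    return {out for (inp, out) in transducer if inp == word}


class PluralTest(unittest.TestCase):
    def setUp(self):
        # Fixtures: the machine under test, a given input,
        # and the corresponding expected output.
        self.transducer = {("cat", "cats"), ("fox", "foxes")}
        self.given_input = "fox"
        self.expected_output = {"foxes"}

    def test_expected_output(self):
        # Assertion: an evaluable boolean predicate over the fixtures.
        self.assertEqual(
            apply(self.transducer, self.given_input), self.expected_output
        )

    def test_unknown_input_has_no_output(self):
        self.assertEqual(apply(self.transducer, "dog"), set())


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PluralTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` executes both tests and returns a result whose `wasSuccessful()` method reports whether every assertion held.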

The relevance of multi-tape transducers for Natural Language Processing has been demonstrated in a case study in Semitic morphology: a comprehensive verbal grammar of the Akkadian language has been written using Karamel [18].

