Team MODBIO

Section: New Results

Keywords: statistical learning theory, grammatical inference, probabilistic automata, rational languages.

Probabilistic automata inference

Participant: François Denis.

In probabilistic grammatical inference, the learning data are assumed to consist of a sequence of words over a finite alphabet $\Sigma$, drawn according to a fixed but unknown probability distribution $P$ called a stochastic language. The goal is then to find a model consistent with the data, for instance a probabilistic automaton (PA) or a Hidden Markov Model (HMM). Hidden Markov Models and probabilistic automata have the same expressivity, and their relationship has been studied precisely in [17].

With Yann Esposito, from the "Laboratoire d'informatique fondamentale de Marseille" (LIF), we have proved in [25] that the stochastic language $p$ generated by a probabilistic automaton $A$ depends continuously on the parameters of $A$ for the $\|\cdot\|_\infty$ norm. As a corollary, we prove that probabilistic automata can be identified in the limit, and that the identification is exact when the parameters of the target are rational numbers. However, this result is theoretical and does not lead to a practical learning algorithm.

The main difficulty is to infer an appropriate structure from the data: this is possible when natural components of the model correspond to intrinsic components of the target language. We defined the notions of residual languages of a stochastic language and of Probabilistic Residual Automata (PRA). A PRA is a PA whose states correspond directly to residual languages of the language it generates. When the target stochastic language can be generated by a PRA, an efficient learning algorithm can be defined (see [25]).

Stochastic languages defined by probabilistic automata are rational languages, and we feel it necessary to study rational stochastic languages from a language-theoretic point of view. The main results have been described in [26]. A main publication is in preparation.
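
To make the notions of stochastic language and residual language concrete, the following minimal Python sketch evaluates a small probabilistic automaton. Everything in it is an illustrative assumption rather than material from [25] or [26]: the two-state automaton, the names `INIT`, `STOP`, `TRANS` and the helper functions are hypothetical, and the residual language is taken in its usual form, $(u^{-1}p)(w) = p(uw) / p(u\Sigma^*)$. The prefix probability is computed as the total weight reached after reading the prefix, which is only valid under the assumption that every state terminates with probability 1, as is the case here.

```python
from fractions import Fraction as F

# Hypothetical two-state PA over the alphabet {"a", "b"}.
# For each state q: STOP[q] + sum of outgoing transition weights == 1.
INIT = {0: F(1), 1: F(0)}            # initial distribution over states
STOP = {0: F(1, 4), 1: F(1, 2)}      # termination probabilities
TRANS = {                            # (state, letter) -> {next_state: probability}
    (0, "a"): {0: F(1, 4), 1: F(1, 4)},
    (0, "b"): {1: F(1, 4)},
    (1, "b"): {1: F(1, 2)},
}


def forward(word):
    """Return the weight carried by each state after reading `word`."""
    weights = dict(INIT)
    for letter in word:
        nxt = {q: F(0) for q in STOP}
        for q, w in weights.items():
            for q2, p in TRANS.get((q, letter), {}).items():
                nxt[q2] += w * p
        weights = nxt
    return weights


def word_probability(word):
    """p(word): read the word, then stop."""
    return sum(w * STOP[q] for q, w in forward(word).items())


def prefix_probability(prefix):
    """p(prefix Sigma*), assuming every state terminates with probability 1."""
    return sum(forward(prefix).values())


def residual_probability(prefix, word):
    """(prefix^{-1} p)(word) = p(prefix + word) / p(prefix Sigma*)."""
    return word_probability(prefix + word) / prefix_probability(prefix)
```

In this toy automaton, state 0 generates the language itself and state 1 generates the residual language of the prefix `"b"`: for instance, `residual_probability("b", "b")` coincides with the probability of generating `"b"` when starting in state 1. Each state therefore corresponds to a residual language, which is the property that characterizes a PRA in the text above.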

