
Section: New Results

Keywords: statistical learning theory, bioinformatics, boosting.

Boosting blast

Participant: François Denis.

The function of a protein is mainly carried out by a domain, a subsequence of amino acids within the protein's full sequence. During evolution, the sequence of such a domain can change significantly while its function is conserved. Our work deals with functional families whose domains are poorly conserved during evolution. Let F be a functional family and let P = {p1, ..., pn} be a set of annotated proteins, each known either to belong to F or not. Our problem is to decide whether a new protein p belongs to F.

In many cases, comparing the sequence of a new protein p with some sequences of the family F is enough to predict whether p ∈ F. Such a similarity search can be carried out either with an alignment program such as Blast or with a model of the family's sequences, for example a probabilistic model such as a Hidden Markov Model. Unfortunately, none of these methods is satisfactory when the domain sequences of the family are not conserved. Our proposal is to combine a boosting algorithm with Blast to deal with this problem. First results have been published in [24], [23]. Cécile Capponi, at the LIF, leads this research theme.
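
The idea of boosting a similarity search can be illustrated with a small sketch. The report does not say which boosting algorithm is used or how Blast scores enter it, so everything below is an assumption made for illustration: the sketch uses AdaBoost, each weak classifier votes "in F" when a sequence is similar enough to one reference sequence, and a toy shared k-mer count (`kmer_similarity`) stands in for a real Blast score. All function names, the threshold, and the toy sequences are hypothetical.

```python
import math

def kmer_similarity(a, b, k=2):
    """Toy stand-in for an alignment score (NOT Blast): count of shared k-mers."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    return len(kmers_a & kmers_b)

def weak_vote(seq, ref, threshold):
    """Weak classifier: +1 if seq is similar enough to the reference, else -1."""
    return 1 if kmer_similarity(seq, ref) >= threshold else -1

def train_adaboost(seqs, labels, rounds=5, threshold=2):
    """AdaBoost over similarity-based weak classifiers; labels are +1 / -1."""
    n = len(seqs)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, reference sequence, polarity, threshold)
    for _ in range(rounds):
        best = None
        # Pick the reference (and polarity) with the lowest weighted error.
        for ref in seqs:
            votes = [weak_vote(s, ref, threshold) for s in seqs]
            for polarity in (1, -1):
                err = sum(w for w, v, y in zip(weights, votes, labels)
                          if polarity * v != y)
                if best is None or err < best[0]:
                    best = (err, ref, polarity, votes)
        err, ref, polarity, votes = best
        err = max(err, 1e-10)  # avoid division by zero for a perfect classifier
        if err >= 0.5:
            break  # no weak classifier is better than chance
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, ref, polarity, threshold))
        if err <= 1e-10:
            break  # training set already perfectly separated
        # Increase the weight of misclassified sequences, then renormalize.
        weights = [w * math.exp(-alpha * y * polarity * v)
                   for w, v, y in zip(weights, votes, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, seq):
    """Weighted vote of the weak classifiers: +1 means 'predicted in F'."""
    score = sum(alpha * polarity * weak_vote(seq, ref, thr)
                for alpha, ref, polarity, thr in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: three "family" sequences and three others.
seqs = ["ACDEFG", "ACDEKL", "MNACDE", "QRSTVW", "WYQRST", "HIKLMN"]
labels = [1, 1, 1, -1, -1, -1]
model = train_adaboost(seqs, labels)
print(predict(model, "ACDEPP"), predict(model, "QRSTPP"))
```

In a real setting the weak classifiers would presumably be built from actual Blast comparisons against annotated proteins, and the reweighting step is what would let later rounds focus on family members that no single reference sequence resembles.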

