Section: New Results
Protein structures
Protein-protein interaction
A protein-protein docking procedure traditionally consists of two successive tasks: a search algorithm generates a large number of candidate solutions, and a scoring function then ranks them in order to extract a native-like conformation. We have previously shown that an accurate scoring function can be optimized from Voronoi constructions and a defined set of parameters. However, the precision of such a function is still not sufficient for large-scale exploration of the interactome. This year we tried another construction: the Laguerre tessellation. Like the Voronoi construction, to which it is closely related, it allows fast computation without losing the intrinsic properties of the biological objects, and it was expected to better represent the physico-chemical properties of the partners. In [12], we present the comparison between both constructions. In recent years, we have also worked on introducing a hierarchical structure, obtained by clustering, over the original three-dimensional structures of the complexes used for learning. Using this clustering model we can optimize the scoring functions and obtain more accurate solutions. This scoring function has been tested on the CAPRI scoring ensembles, and a conformation of at least acceptable quality is found among the top 10 ranked solutions in all cases. This work has been submitted for publication. It is part of the thesis of Thomas Bourquard [1].
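The difference between the two constructions can be illustrated with a minimal sketch. It is not the scoring function of [12]: it only shows how a point is assigned to a cell under the ordinary Voronoi rule (smallest Euclidean distance) and under the Laguerre, or power, rule (smallest squared distance minus squared radius). The function name `nearest_site`, the two atoms and their radii are hypothetical values chosen for the illustration.

```python
from math import dist

def nearest_site(point, sites, weighted=True):
    """Return the index of the site whose cell contains `point`.

    sites: list of (center, radius) pairs, center being an (x, y, z) tuple.
    weighted=False: ordinary Voronoi rule, smallest Euclidean distance.
    weighted=True: Laguerre (power) rule, smallest d^2 - r^2.
    """
    def key(i):
        center, radius = sites[i]
        d2 = dist(point, center) ** 2
        return d2 - radius ** 2 if weighted else d2
    return min(range(len(sites)), key=key)

# Two hypothetical atoms: a small one at the origin, a large one 4 units away.
sites = [((0.0, 0.0, 0.0), 1.0), ((4.0, 0.0, 0.0), 2.5)]
probe = (1.8, 0.0, 0.0)

voronoi_owner = nearest_site(probe, sites, weighted=False)
laguerre_owner = nearest_site(probe, sites, weighted=True)
print(voronoi_owner, laguerre_owner)
```

In this toy case the probe is geometrically closer to the small atom, so the Voronoi rule assigns it to site 0, while the Laguerre rule assigns it to the larger atom (site 1). This is the sense in which weighting the sites by a radius can reflect the size of the partners' atoms or residues.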
Computational protein design
A. Sedano studied the inverse folding problem of proteins during her internship, supervised by T. Simonson and J.-M. Steyaert. The classic fold recognition problem consists in predicting the three-dimensional structure of a protein from its amino acid sequence, using homology modelling. A complementary approach consists in inverting this problem and posing the inverse folding problem: identify the most favorable sequences for a given 3D structure, or fold [7], [9]. The main question is how to map the millions of protein sequences extracted from genomes onto the tens of thousands of known 3D structures. She applied probabilistic analysis methods, such as those of Ranganathan, Thirumalai or Nussinov, to large sets of sequences from the PDZ domain family (first computed sequences, then natural ones). These methods make it possible to determine the correlations between distant mutations in a structure. Later, these correlations should make it possible to describe the signature of a given structure in terms of sequence. She also tested these methods by working not on mutations between amino acids but on mutations between classes of amino acids, in order to facilitate comparisons between sites along the sequence.
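As an illustration of what a correlation between two sites means, here is a minimal sketch that computes the mutual information between two columns of an alignment, either on amino acids or on classes of amino acids. Mutual information is used here as a simple stand-in; it is not the specific method of Ranganathan, Thirumalai or Nussinov. The class grouping `CLASSES` and the four-sequence alignment are assumptions made for the example.

```python
from collections import Counter
from math import log2

# Hypothetical grouping of amino acids into physico-chemical classes.
CLASSES = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "positive": "KRH",
    "negative": "DE",
}
AA_TO_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}

def column(alignment, i, by_class=False):
    """Extract column i, optionally mapping each residue to its class."""
    col = [seq[i] for seq in alignment]
    return [AA_TO_CLASS.get(a, "other") for a in col] if by_class else col

def mutual_information(alignment, i, j, by_class=False):
    """Mutual information (in bits) between alignment columns i and j."""
    xs = column(alignment, i, by_class)
    ys = column(alignment, j, by_class)
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # sum over observed pairs of p(x,y) * log2(p(x,y) / (p(x) p(y)))
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

# Toy alignment: two columns, four sequences (not real PDZ data).
toy = ["AK", "VR", "DL", "EI"]
mi_residues = mutual_information(toy, 0, 1)
mi_classes = mutual_information(toy, 0, 1, by_class=True)
print(mi_residues, mi_classes)
```

On this toy alignment the residue-level value is 2 bits and the class-level value is 1 bit: the coupling "hydrophobic with positive, negative with hydrophobic" is still detected, but with a four-letter alphabet instead of twenty, which is what makes sites easier to compare.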
Transmembrane β-barrels
Our algorithm [16] first predicts a super-secondary structure by dynamic programming. This step runs in […] for the common up-down topology, and at most […] for the Greek key motifs, where n is the number of amino acids. Finally, a predicted three-dimensional structure is built from geometric criteria.
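To give an idea of how dynamic programming can label residues, here is a deliberately simplified sketch. It is not the algorithm of [16]: it ignores strand pairing, topology and the energetic model, and only assigns each residue to a strand (`S`) or a loop (`L`) from a per-residue score, with a penalty for each switch between the two states. The scores and the `switch_penalty` parameter are hypothetical, and this toy version runs in time linear in n.

```python
def label_residues(scores, switch_penalty=1.0):
    """Label each residue 'S' (strand) or 'L' (loop) by dynamic programming.

    scores[i] > 0 favours a strand at position i, scores[i] < 0 a loop.
    Each change of state along the sequence costs `switch_penalty`.
    """
    if not scores:
        return ""
    states = "SL"

    def gain(state, x):
        return x if state == "S" else -x

    # best[i][s]: best total score of a labelling of 0..i ending in state s
    best = [{s: gain(s, scores[0]) for s in states}]
    back = [{}]
    for i in range(1, len(scores)):
        row, ptr = {}, {}
        for s in states:
            cands = {p: best[i - 1][p] - (switch_penalty if p != s else 0.0)
                     for p in states}
            prev = max(cands, key=cands.get)
            row[s] = cands[prev] + gain(s, scores[i])
            ptr[s] = prev
        best.append(row)
        back.append(ptr)

    # Trace back from the best final state.
    state = max(states, key=lambda t: best[-1][t])
    path = [state]
    for i in range(len(scores) - 1, 0, -1):
        state = back[i][state]
        path.append(state)
    return "".join(reversed(path))

two_strands = label_residues([1, 1, 1, -1, -1, 1, 1, 1], switch_penalty=0.5)
smoothed = label_residues([1, 1, -0.2, 1, 1], switch_penalty=0.5)
print(two_strands, smoothed)
```

The first call separates two strands by a two-residue loop, while in the second the weakly negative score in the middle is absorbed into a single strand because breaking it would cost more than it gains.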
The method has been tested on transmembrane β-barrel proteins and performs comparably to previous approaches. It can be further improved by refining the energetic model, especially for turns and loops. The structural model may also be refined, since additional structural constraints may simplify the problem.
The prediction accuracy for the class of known transmembrane β-barrel proteins, measured as the percentage of correctly labelled residues, reaches 70-85%. The number of strands is correctly predicted, whereas the shear number, the second main geometric characteristic of a β-barrel, is recovered only approximately.
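The two measures mentioned above, the percentage of correctly labelled residues and the number of strands, can be sketched as follows. The one-letter labelling (`S` for strand, `L` for loop) and the two example strings are assumptions made for the illustration; they are not data from the evaluation.

```python
def residue_accuracy(predicted, reference):
    """Percentage of positions where the predicted label matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("labelings must have the same length")
    if not reference:
        raise ValueError("labelings must not be empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * matches / len(reference)

def count_strands(labels, strand="S"):
    """Number of maximal runs of the strand label."""
    return sum(1 for i, c in enumerate(labels)
               if c == strand and (i == 0 or labels[i - 1] != strand))

# Hypothetical reference and predicted labelings of a 14-residue segment.
reference = "LLSSSSLLSSSSLL"
predicted = "LSSSSSLLLSSSLL"

accuracy = residue_accuracy(predicted, reference)
print(round(accuracy, 1), count_strands(predicted), count_strands(reference))
```

Here two residues at strand boundaries are mislabelled, giving 12 correct labels out of 14, while the number of strands is the same in both labelings. This shows how the strand count can be exact even when per-residue accuracy is below 100%.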
The method is being used to carry out screening experiments on proteomic databases, e.g. the Paramecium bank, in collaboration with Ph. Dessen (Institut Gustave Roussy).