## Section: New Results

### Protein structures

#### Protein-protein interaction

A protein-protein docking procedure traditionally consists of two successive tasks: a search algorithm generates a large number of candidate solutions, and a scoring function then ranks them in order to extract a native-like conformation. We have already demonstrated that an accurate scoring function can be optimized using Voronoi constructions and a defined set of parameters. However, the precision of such a function is still not sufficient for large-scale exploration of the interactome. This year we tried another construction, the Laguerre tessellation. It also allows fast computation without losing the intrinsic properties of the biological objects. Being related to the Voronoi construction, it was expected to better represent the physico-chemical properties of the partners. In [12], we present the comparison between both constructions. In recent years, we have also worked on introducing a hierarchical structure, obtained by clustering, of the original complex three-dimensional structures used for learning. Using this clustering model, we can optimize the scoring functions and obtain more accurate solutions. This scoring function has been tested on CAPRI scoring ensembles, and a conformation of at least acceptable quality is found among the top 10 ranked solutions in all cases. This work has been submitted for publication. It is part of the thesis of Thomas Bourquard [1].
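
To make the difference between the two constructions concrete, here is a minimal sketch in Python. It assumes that each site is a sphere (a center and a radius) and that the Laguerre weight is the squared radius; this choice of weights, the function name `nearest_site` and the toy coordinates are illustrative assumptions, not the parameters used in [12]. A point belongs to the Voronoi cell of the closest center, and to the Laguerre cell of the site with the smallest power distance. The brute-force search below only illustrates the definition and is not how a tessellation is computed in practice.

```python
from math import dist

def nearest_site(point, sites, weighted=False):
    """Return the index of the cell that contains `point`.

    sites: list of (center, radius) pairs (hypothetical input format).
    weighted=False -> Voronoi cell: smallest squared Euclidean distance.
    weighted=True  -> Laguerre cell: smallest power distance |x - c|^2 - r^2,
                      assuming the weight of a site is its squared radius.
    """
    def key(i):
        center, radius = sites[i]
        d2 = dist(point, center) ** 2
        return d2 - radius ** 2 if weighted else d2
    return min(range(len(sites)), key=key)

# Toy example: a small sphere and a large sphere on the x axis.
sites = [((0.0, 0.0, 0.0), 1.0), ((4.0, 0.0, 0.0), 3.0)]
query = (1.5, 0.0, 0.0)
voronoi_cell = nearest_site(query, sites)
laguerre_cell = nearest_site(query, sites, weighted=True)
```

In this toy case the query point is closer to the center of the small sphere, so it falls in its Voronoi cell, but it lies inside the large sphere, so the Laguerre construction assigns it to the large one. This is the sense in which a weighted construction can account for the size of the objects.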

#### Computational protein design

A. Sedano studied the inverse folding problem for proteins during her internship, supervised by T. Simonson and J.-M. Steyaert. The classic fold recognition problem consists in predicting the three-dimensional structure of a protein from its amino acid sequence, using homology modelling. A complementary approach inverts this problem and poses the inverse folding problem: identify the most favorable sequences corresponding to a given 3D structure, or *fold* [7], [9]. The main question is how to map the millions of protein sequences extracted from genomes onto the tens of thousands of known 3D structures. She applied probabilistic analysis methods, such as those of Ranganathan, Thirumalai or Nussinov, to large sets of sequences from the *PDZ* domain family (first computed sequences, then natural ones). These methods determine the correlations between distant mutations in a structure. Eventually, these correlations should make it possible to describe, in terms of sequence, the *signature* of a given structure. She also tried testing these methods on mutations between classes of amino acids rather than between individual amino acids, in order to facilitate comparisons between sites along the sequence.
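
As an illustration of what a correlation between two positions of an alignment means, the following Python sketch computes the mutual information between two alignment columns. Mutual information is used here only as a simple stand-in: it is not the specific method of Ranganathan, Thirumalai or Nussinov. The grouping of amino acids into classes (`AA_CLASSES`), the function names and the toy alignment are all hypothetical and chosen for the example.

```python
from collections import Counter
from math import log2

# Hypothetical grouping of amino acids into physico-chemical classes.
AA_CLASSES = {
    **dict.fromkeys("AVLIMC", "h"),  # hydrophobic
    **dict.fromkeys("FWY", "a"),     # aromatic
    **dict.fromkeys("STNQ", "p"),    # polar
    **dict.fromkeys("KRH", "+"),     # positively charged
    **dict.fromkeys("DE", "-"),      # negatively charged
    **dict.fromkeys("GP", "s"),      # special backbone conformations
}

def column(alignment, i, reduce=False):
    """Extract column i; optionally map residues to their class."""
    col = [seq[i] for seq in alignment]
    return [AA_CLASSES.get(a, a) for a in col] if reduce else col

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

# Toy alignment: positions 0 and 1 mutate together, position 2 is conserved.
alignment = ["AKD", "AKD", "LED", "LED"]
mi_01 = mutual_information(column(alignment, 0), column(alignment, 1))
mi_02 = mutual_information(column(alignment, 0), column(alignment, 2))
mi_01_classes = mutual_information(column(alignment, 0, reduce=True),
                                   column(alignment, 1, reduce=True))
```

On this toy alignment, positions 0 and 1 carry one bit of mutual information, while the conserved position 2 carries none. With the reduced alphabet, the A/L substitution stays within the same class, so the correlation disappears: working on classes changes which mutations count as signal.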

#### Transmembrane β-barrels

Our algorithm [16] first predicts a super-secondary structure by dynamic programming. This step runs in for the common up-down topology, and at most for the Greek key motifs, where n is the number of amino acids. Finally, a predicted three-dimensional structure is built from the geometric criteria. The method has been tested on transmembrane β-barrel proteins and achieves an efficiency comparable to that of previous approaches. It can be further improved by refining the energetic model, especially on turns and loops. The structural model may also be refined, since additional structural constraints may simplify the problem. The prediction accuracy for the class of known transmembrane β-barrel proteins, evaluated as the percentage of well-labelled residues, reaches 70-85%. The number of strands is correctly predicted, whereas the prediction of the shear number, the second main geometric characteristic of a β-barrel, is only relatively satisfactory. The method is being used to carry out screening experiments on proteomic databases, e.g. the Paramecium bank, in a collaboration with Ph. Dessen (Institut Gustave Roussy).
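
The accuracy measure mentioned above, the percentage of well-labelled residues, can be sketched as follows in Python. The one-letter labelling (`M` for a membrane strand, `i` and `o` for the inner and outer sides), the function names and the two label strings are hypothetical and serve only to show how such a score and a strand count could be computed; they are not taken from [16].

```python
from itertools import groupby

def per_residue_accuracy(predicted, reference):
    """Percentage of residues whose predicted label matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("label strings must have the same length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * matches / len(reference)

def count_strands(labels, strand_label="M"):
    """Number of maximal runs of the strand label."""
    return sum(1 for key, _ in groupby(labels) if key == strand_label)

# Hypothetical labels for a 14-residue fragment.
predicted = "iiMMMMooMMMMii"
reference = "iiiMMMooMMMMMi"
accuracy = per_residue_accuracy(predicted, reference)
n_strands_predicted = count_strands(predicted)
n_strands_reference = count_strands(reference)
```

In this example both label strings contain two strands, although the strand boundaries differ by one residue at each end. This is the typical situation in which the strand count is right while the per-residue score stays below 100%.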