Section: Overall Objectives
Our research is characterized by an interest in large-scale studies (genomes, proteomes or regulation networks) and in the discrete methods needed to handle the associated complexity. We have a general concern for high-performance computing and pursue two types of modeling tasks: modeling sequences and structures, and modeling regulation networks.
Optimized algorithms on parallel specialized architectures
First and foremost, large-scale studies require fine tuning and careful management of computational resources. We investigate the practical use of parallelism to speed up computations in genomics. Topics of interest range from intensive sequence comparison to pattern or model matching, including structure prediction. We work on the co-design of algorithms and hardware architectures tailored to such applications. This work is based on the study of reconfigurable machines employing Field-Programmable Gate Arrays (FPGA) and of fast components such as Flash memories or Graphics Processing Units (GPU).
Modeling sequences and structures
This track concerns the search for relevant (e.g. functional) spatial or logical structures in macromolecules, with the intent of modeling either specific spatial structures (secondary and tertiary structures, disulfide bonds, ...) or general biological mechanisms (transposition, ...). Within the framework of language theory and combinatorial optimization, we address several types of problems: the design of grammatical models of biological sequences and the machine learning of such models from sequences; efficient filtering and model matching in data banks; and protein structure prediction. The corresponding disciplinary fields are language theory, algorithmics on words, machine learning, data analysis and combinatorial optimization.
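As a purely illustrative sketch of what model matching on a sequence can look like, the following Python function scans an RNA string for hairpins, i.e. a stem of complementary bases enclosing a short loop, which is the kind of nested structure a context-free rule can describe. The function name `find_hairpins`, its parameters and the restriction to Watson-Crick pairs are assumptions made for this example; they do not describe the team's actual tools.

```python
# Watson-Crick base pairs only (an assumption; wobble pairs such as G-U are ignored).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def find_hairpins(seq, stem=3, loop_min=3, loop_max=5):
    """Return (start, end, loop_length) for every hairpin found in seq.

    A hairpin is `stem` bases, a loop of loop_min..loop_max bases, then
    `stem` bases that pair with the first ones in reverse order.
    `end` is exclusive, so seq[start:end] is the matched region.
    """
    hits = []
    n = len(seq)
    for start in range(n):
        for loop in range(loop_min, loop_max + 1):
            end = start + 2 * stem + loop
            if end > n:
                break  # longer loops cannot fit either
            # Compare the k-th base of the left arm with the k-th base
            # counted from the right end of the candidate region.
            if all((seq[start + k], seq[end - 1 - k]) in PAIRS
                   for k in range(stem)):
                hits.append((start, end, loop))
    return hits


if __name__ == "__main__":
    print(find_hairpins("GGGAAAACCC"))
```

The scan is quadratic in the worst case and only meant to make the notion concrete; filtering large data banks calls for far more efficient indexing and matching strategies.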
Modeling regulation networks
We address the question of constructing accurate models of biological systems with respect to available data and knowledge. The availability of high-throughput methods in molecular biology has led to a tremendous increase in measurable data, along with the resulting knowledge repositories gathered on the web (e.g. KEGG, MetaCyc, RegulonDB). However, both measurements and biological networks are prone to incompleteness, heterogeneity and mutual inconsistency, which makes it highly non-trivial to draw biologically meaningful conclusions in an automated way. Based on this observation, we develop methods for the analysis of large-scale biological networks that formalize various reasoning modes in order to highlight incomplete regions in a regulatory model and to point at network products that need to be activated or inactivated to globally explain the experimental data. We also consider small-scale biological systems to gain a fine understanding of the conclusions that can be drawn about active pathways from available data, working on deducible properties rather than simulation. The corresponding disciplinary fields are model checking, constraint-based analysis and dynamical systems.
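To make the idea of reasoning on a regulatory model concrete, here is a minimal Python sketch under stated assumptions. The network is a signed influence graph, each product is labeled +1 (increased) or -1 (decreased), and a labeling is called consistent when every product that has regulators receives at least one influence explaining its variation. This rule, the brute-force enumeration and the names `is_consistent`, `consistent_labelings` and `forced_signs` are illustrative choices, not a description of the methods actually developed.

```python
from itertools import product


def is_consistent(edges, labels):
    """Check that each regulated node has at least one explaining influence.

    edges: list of (source, target, sign) with sign in {+1, -1}.
    labels: dict mapping every node to +1 or -1.
    """
    incoming = {}
    for src, dst, sign in edges:
        incoming.setdefault(dst, []).append((src, sign))
    return all(
        any(labels[src] * sign == labels[node] for src, sign in preds)
        for node, preds in incoming.items()
    )


def consistent_labelings(edges, observed):
    """Enumerate all consistent completions of the observed labels."""
    nodes = sorted({n for src, dst, _ in edges for n in (src, dst)})
    free = [n for n in nodes if n not in observed]
    solutions = []
    for values in product((+1, -1), repeat=len(free)):
        labels = dict(observed)
        labels.update(zip(free, values))
        if is_consistent(edges, labels):
            solutions.append(labels)
    return solutions


def forced_signs(edges, observed):
    """Return signs shared by all consistent completions, or None if none exists."""
    solutions = consistent_labelings(edges, observed)
    if not solutions:
        return None  # model and data are mutually inconsistent
    free = [n for n in solutions[0] if n not in observed]
    return {n: solutions[0][n] for n in free
            if all(s[n] == solutions[0][n] for s in solutions)}


if __name__ == "__main__":
    # Toy network (hypothetical): A activates B, B represses C, A activates C.
    toy = [("A", "B", +1), ("B", "C", -1), ("A", "C", +1)]
    print(forced_signs(toy, {"A": +1, "C": -1}))
    print(forced_signs(toy, {"A": +1, "B": -1}))
```

In the first call the unobserved product B must be increased for the data to be explained; in the second, no completion exists, which flags a region where the model or the data is incomplete. Exhaustive enumeration is exponential in the number of unobserved products, so this only works on toy networks; large-scale analysis needs constraint-based or symbolic techniques.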