Team Symbiose


Section: Software

Activity of transfer from Symbiose to GenOuest

Participants : Olivier Sallou, Michel Le Borgne, Israël-César Lerman, Hugues Leroy, Jacques Nicolas, Anne Siegel, Basavanneppa Tallur, Anthony Bretaudeau, Annabel Bourdé, Carito Guziolowski.

Modeling activity concerns sequences and networks in Symbiose.


The first software suite offers a platform to search for complex models within both DNA and protein sequences. It is based on previous work within the team that aimed to propose an expressive language (Stan and Wapam) going beyond pattern matching in biological sequences and to study the modeling needs of biologists at the level of whole genomes. The Logol Software Suite is composed of an interpreter for the Logol language (biological patterns) with a pattern search tool, a graphical web-based editor, and a result analyser. Result files contain the matches on the sequence(s) with all required details. Pattern descriptions support, among others, word complements, overlaps, substitution and distance errors, and the use of variables. The interface provides a drag-and-drop facility to build Logol grammars interactively from graphical templates. The Logol Designer is written in JavaScript and licensed under the CeCILL v2 license. The analyser is written in Prolog. It may be run in command-line mode on a personal computer or via a scheduler web page that submits jobs to the GenOuest cluster. Coupled with BioMAJ, the tool can parse updated versions of public banks or personal sequences.
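
To make two of the pattern features above concrete (word complement and substitution errors), here is a minimal Python sketch. It is not Logol syntax and not the Logol implementation; the function names and the example sequence are hypothetical and serve only as an illustration.

```python
# Illustrative sketch only: searches a DNA word or its reverse complement
# while tolerating a bounded number of substitutions.
# All names here are hypothetical and unrelated to the Logol code base.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_with_errors(sequence, word, max_subst=0):
    """Yield (position, strand, errors) for each window of `sequence` that
    matches `word` ("+") or its reverse complement ("-") with at most
    `max_subst` substitutions."""
    targets = (("+", word), ("-", reverse_complement(word)))
    n = len(word)
    for pos in range(len(sequence) - n + 1):
        window = sequence[pos:pos + n]
        for strand, target in targets:
            errors = hamming(window, target)
            if errors <= max_subst:
                yield pos, strand, errors


if __name__ == "__main__":
    for hit in find_with_errors("TTGATTCAAGAATCTT", "GATTC"):
        print(hit)
```

A real pattern language also needs variables, overlaps and distance constraints, which this sketch does not attempt to cover.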


ModuleOrganizer is a software package that provides a synthetic view of a set of DNA sequences through both a segmentation into domains and a classification based on these domains. It has been developed in the framework of ANR Modulome, led by Symbiose. It indexes the maximal repeats in the sequences and assembles them to create modules. After a classification step based on the presence or absence of modules, the method produces a graphical view of a hierarchical clustering of the segmented sequences.
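
The classification step can be pictured with a small Python sketch. The source does not say which distance or linkage ModuleOrganizer uses, so the Jaccard distance and single linkage below are assumptions chosen for simplicity, and the module and sequence names are made up.

```python
# Illustrative sketch only: clusters sequences by presence/absence of modules.
# Jaccard distance and single linkage are assumptions, not the documented
# choices of ModuleOrganizer.
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |intersection| / |union| for two sets of module names."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def single_linkage(modules_by_seq):
    """Agglomerative clustering of sequences.

    Returns the successive merges as (cluster_a, cluster_b, distance)."""
    clusters = [frozenset([name]) for name in sorted(modules_by_seq)]
    merges = []

    def linkage(pair):
        # Single linkage: distance between the two closest members.
        a, b = pair
        return min(
            jaccard_distance(modules_by_seq[x], modules_by_seq[y])
            for x in a
            for y in b
        )

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((sorted(a), sorted(b), linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


if __name__ == "__main__":
    example = {
        "s1": {"M1", "M2", "M3"},
        "s2": {"M1", "M2"},
        "s3": {"M4"},
    }
    for merge in single_linkage(example):
        print(merge)
```

The list of merges is the information a dendrogram would display; sequences sharing most of their modules are joined first.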


The CRISPR genomic structures (Clustered Regularly Interspaced Short Palindromic Repeats) form a family of repeats that is widespread in archaea and frequent in bacteria. CRISPI is a user-friendly web interface with many graphical tools and facilities that allow extracting CRISPRs, finding CRISPRs in personal sequences, or calculating sequence similarity with spacers. It offers a reference in this domain with more than 1100 species and is updated automatically on a regular basis. It has been developed during ANR Modulome [26], led by Symbiose, in collaboration with LME/Ifremer Brest.


In the paper [24] we proposed a filter for speeding up multiple repeat search. This filter, called Tuiuiu, defines a necessary condition stronger than those of previous filters and proposes an efficient way to apply it to large sets of sequences.

In collaboration with the GenOuest platform, the Tuiuiu filter has been hosted on the Mobyle portal and is thus publicly available. Using a submission form, the Tuiuiu tool can be launched on the GenOuest machines.

FROSTO and A_purva

Two programs working on protein structures, FROSTO and A_purva, are installed on the GenOuest cluster and available to registered users. FROSTO (PhD thesis G. Collet) is a program that finds remote homologies between proteins, based on the INRA program Frost. It aligns a protein sequence with a database of protein structures by an efficient protein threading method with non-local parameters and uses a dedicated solver based on a Lagrangian relaxation approach. A_purva (PhD thesis N. Malod-Dognin) is a tool for computing the similarity of two protein structures by finding the maximum overlap of their contact maps.
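
The notion of contact map overlap used by the second program can be illustrated with a toy Python sketch. It scores a given alignment and then searches all order-preserving alignments exhaustively, which is only feasible for a handful of residues and is not the solver used by the actual tool; the function names and the example contact maps are hypothetical.

```python
# Illustrative sketch only: contact map overlap on toy inputs.
# A contact map is a set of pairs (i, j) with i < j, meaning that
# residues i and j are in contact.
from itertools import combinations


def overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A whose two residues are aligned to residues
    that are also in contact in B.

    `alignment` maps residue indices of A to residue indices of B."""
    count = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            k, l = alignment[i], alignment[j]
            if (min(k, l), max(k, l)) in contacts_b:
                count += 1
    return count


def max_overlap_bruteforce(n_a, contacts_a, n_b, contacts_b):
    """Best overlap over all order-preserving alignments (exponential cost)."""
    best = 0
    for size in range(1, min(n_a, n_b) + 1):
        for rows in combinations(range(n_a), size):
            for cols in combinations(range(n_b), size):
                # rows and cols are both increasing, so order is preserved.
                score = overlap(contacts_a, contacts_b, dict(zip(rows, cols)))
                best = max(best, score)
    return best


if __name__ == "__main__":
    map_a = {(0, 2), (1, 3), (0, 3)}  # protein A, 4 residues
    map_b = {(0, 3), (1, 4), (0, 4)}  # protein B, 5 residues
    print(max_overlap_bruteforce(4, map_a, 5, map_b))
```

In this example the best alignment skips the third residue of B, which lets all three contacts of A coincide with contacts of B.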


PLAST and GASSST are freely downloadable codes available on the software web site of Symbiose. PLAST (PhD thesis V. H. Nguyen) is a parallel version of Blast, the most widely used sequence comparison software. GASSST (PhD thesis G. Rizk) finds global gapped alignments of short DNA sequences against large DNA banks.

Bioquali Cytoscape plugin

Bioquali is dedicated to computations on qualitative models represented by interaction graphs. Nodes of these graphs represent chemical species, and arrows are labeled with positive or negative influences.

The software offers several functionalities for confronting networks with observation data: (i) Checking the internal consistency of the network means checking that the whole set of constraints has at least one solution. (ii) Checking consistency between a network and datasets means checking that a partial set of variations on nodes can be extended to a complete solution of the set of constraints. (iii) Diagnosing an inconsistent network means that, if a system does not satisfy the basic rule, we identify a subset of interactions and data that carries the inconsistencies. (iv) Predicting new variations means identifying the variables that have the same sign in all solutions of the set of constraints.
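
Functionalities (i), (ii) and (iv) can be sketched on a toy network in Python. The text does not spell out the basic rule, so the sketch assumes one plausible reading: the variation of every node that has predecessors must be explained by at least one incoming influence (edge sign times predecessor variation). The enumeration is brute force and the names are hypothetical; it is not the reasoning algorithm used by Bioquali.

```python
# Illustrative sketch only: sign consistency on a small interaction graph.
# ASSUMED rule: a node with predecessors must have the same sign as at least
# one incoming influence (edge sign * predecessor variation).
from itertools import product


def solutions(nodes, edges, observed=None):
    """Enumerate all +1/-1 assignments that satisfy the assumed rule and
    agree with the observed variations.

    `edges` is a list of (source, target, sign) with sign in {+1, -1}."""
    observed = observed or {}
    free = [n for n in nodes if n not in observed]
    preds = {n: [] for n in nodes}
    for src, dst, sign in edges:
        preds[dst].append((src, sign))
    result = []
    for values in product((+1, -1), repeat=len(free)):
        assignment = dict(observed)
        assignment.update(zip(free, values))
        explained = all(
            not preds[n]
            or any(sign * assignment[src] == assignment[n] for src, sign in preds[n])
            for n in nodes
        )
        if explained:
            result.append(assignment)
    return result


def predictions(nodes, edges, observed):
    """Return unobserved nodes that take the same sign in every solution,
    or None when the network and the data are inconsistent."""
    sols = solutions(nodes, edges, observed)
    if not sols:
        return None
    first = sols[0]
    return {
        n: first[n]
        for n in nodes
        if n not in observed and all(s[n] == first[n] for s in sols)
    }


if __name__ == "__main__":
    nodes = ["A", "B", "C"]
    edges = [("A", "B", +1), ("B", "C", -1)]  # A activates B, B inhibits C
    print(len(solutions(nodes, edges)) > 0)        # (i) internal consistency
    print(predictions(nodes, edges, {"A": +1}))    # (ii) and (iv)
    print(predictions(nodes, edges, {"A": +1, "B": -1}))  # inconsistent data
```

Diagnosis (iii) would additionally require searching for a small subset of interactions and observations that admits no solution, which this sketch leaves out.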

In 2009, we focused on the design of a Cytoscape plugin to allow user-friendly access to Bioquali functionalities. The BioQuali plugin [16] facilitates in silico exploration of large-scale regulatory networks by combining the user-friendly tools of the Cytoscape environment with high-performance automatic reasoning algorithms. As a main feature, the plugin guides further investigation of a system by highlighting regions in the network that are not accurately described and merit specific study.

The BioQuali plugin is implemented in Java, is based on the Cytoscape API, and uses the REST architectural style. By default, the client component uses an unauthenticated HTTP connection to communicate with the GenOuest Web server. This enables fast remote execution of the algorithm underlying the BioQuali plugin on the GenOuest high performance computing facility. Alternatively, the server-side component can be downloaded and installed on any standard PC using the Cytoscape plugin management system. The plugin is available for download from the Cytoscape plugin website, under the Plugin/Analysis section, or via Java Web Start. It is compiled with the latest Cytoscape API (version 2.6) and packaged as a jar file.

FlowCore: a Bioconductor package for high throughput flow cytometry

Flow cytometry is a technology used for high throughput screening, generating large, complex data sets, often in clinical trials or drug discovery settings. We developed a set of flexible open source computational tools in the R package flowCore to facilitate the analysis of these complex data. A key component is the availability of suitable data structures that support the application of similar operations to a collection of samples or a clinical cohort. In addition, our software constitutes a shared and extensible research platform that enables collaboration between bioinformaticians, computer scientists, statisticians, biologists and clinicians. This platform will foster the development of novel analytic methods for flow cytometry. The software has been applied in the analysis of various data sets, and its data structures have proven to be highly efficient in capturing and organizing the analytic work flow. Finally, a number of additional Bioconductor packages successfully build on the infrastructure provided by flowCore, opening new avenues for flow data analysis [17].

