Section: Scientific Foundations
Visual Analytics for Graphs
Graphs offer a powerful and flexible mathematical tool to model real-life phenomena. Biologists naturally use graphs to infer relationships between subcellular components (proteins, peptides, genes, RNAs, molecules, ...). Geographers have long used graphs to represent exchange networks (roads, air traffic, immigration, ...). Sociologists rely heavily on graphs to study social networks. In all cases, the visual inspection of a network supports the analysis of its community structure and helps to answer questions concerning prominent actors (proteins; cities; managers; logical entities) or subgroups (biological functions; territories; teams; logical units). The identification of communities in a network is an essential step towards understanding the whole network architecture. Once a subgroup has been identified, and when it appears as such within the visualization, the analyst can zoom in on it to inspect its own dynamics in more detail. Graphs also appear as a natural modelling tool in computer science itself (data structures, web graphs, workflows, etc.).
Graphs are also a useful metaphor when studying data equipped with a similarity measure, either inherited from the data or computed from semantic attributes. A graph can readily be constructed by applying a threshold to the similarities: two items are linked whenever their similarity reaches the threshold. Using a correlation measure to infer similarities is a common way of bringing similarities into the picture when analyzing data.
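The thresholding step can be illustrated with a minimal sketch. The function name `threshold_graph`, the nested-dictionary input format and the sample values are assumptions made for illustration only; they are not taken from our software.

```python
from itertools import combinations


def threshold_graph(similarity, threshold):
    """Return weighted edges {(u, v): s} for every pair with s >= threshold.

    `similarity` is assumed to be a symmetric nested dict of values in [0, 1].
    """
    items = sorted(similarity)
    edges = {}
    for u, v in combinations(items, 2):
        s = similarity[u][v]
        if s >= threshold:
            edges[(u, v)] = s  # keep the similarity as the edge weight
    return edges


# Hypothetical similarity values for four items.
sim = {
    "a": {"a": 1.0, "b": 0.9, "c": 0.2, "d": 0.1},
    "b": {"a": 0.9, "b": 1.0, "c": 0.7, "d": 0.3},
    "c": {"a": 0.2, "b": 0.7, "c": 1.0, "d": 0.8},
    "d": {"a": 0.1, "b": 0.3, "c": 0.8, "d": 1.0},
}
edges = threshold_graph(sim, 0.6)
```

The choice of threshold directly controls the density of the resulting graph, so in practice it would likely be tuned interactively or from the distribution of similarity values.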
The case of image classification/indexing is typical. Structural indicators such as the MPEG-7 colour structure descriptors [70] [82] can be computed for each image; as a consequence, the similarity between any two images can be computed as a value in [0, 1]. Highly similar images can then be considered as neighbours in a (weighted) graph, enabling the analyst to exploit analytical tools borrowed from graph drawing, graph algorithmics, graph theory and combinatorial mathematics.
Bioinformatics also provides interesting examples. For instance, an important use of DNA microarray data is to annotate genes by clustering them on the basis of their gene expression profiles across several microarrays. Because the transcriptional response of cells to changing conditions involves the coordinated coexpression of genes encoding interacting proteins, studying coexpression patterns can provide insights into the underlying cellular processes. In this context, the (Pearson) correlation coefficient is a standard similarity measure used to infer network structure. On the assumption that genes and their protein products carry out cellular processes in the context of functional modules, it is natural to ask whether such modular organization can be revealed through the study of gene or protein interaction networks.
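As a hedged illustration of how coexpression links might be inferred, the sketch below computes the Pearson correlation coefficient r between expression profiles and keeps the pairs with r at or above a cutoff (1 - r can serve as the corresponding dissimilarity). The gene names, profile values, cutoff and function names are hypothetical, and the code assumes that no profile is constant.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpression_edges(profiles, min_r):
    """List (gene, gene, r) for every pair whose correlation is >= min_r."""
    genes = sorted(profiles)
    result = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(profiles[g], profiles[h])
            if r >= min_r:
                result.append((g, h, r))
    return result


# Hypothetical expression levels of three genes across four microarrays.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.1, 5.9, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
links = coexpression_edges(profiles, 0.9)
```

Here only g1 and g2 are linked; g3 is anti-correlated with both. Whether strong negative correlations should also produce edges is a modelling decision that this sketch leaves out.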
Graph Visualization is an active subfield of Information Visualization dealing with graph algorithms to find patterns, test properties, embed graphs in particular geometries (most often 2D or 3D Euclidean) or interactively manipulate their representations on the screen. Each year, a number of papers accepted at the IEEE InfoVis Symposium (see http://www.infovis.org), the IEEE/Eurographics EuroVis Conference (see http://www.eurovis.org) or the IEEE London Information Visualization Conference (see http://www.graphicslink.co.uk) concern graph visualization. The Graph Drawing community, with its own annual international symposium, also contributes to the development of the field (see http://www.graphdrawing.org).
When focusing on relational data (graphs), combinatorial mathematics offers tools to exploit the topology of graphs and other structural regularities, either numerically or from an algorithmic standpoint. A typical graph drawing algorithm will assume or test specific topological conditions, such as the graph being a tree or being biconnected. Visualization techniques can benefit from combinatorial knowledge about particular graphs. One good example taken from our own results is the use of Strahler numbers (extended to general directed or undirected graphs) to optimize the rendering of large graphs on a screen [24]. Other examples from our group exploit the fact that combinatorial parameters in a tree can be approximated using a Gaussian distribution [61] [27], folding or unfolding subtrees as the user navigates. Community identification methods that combine a node or edge dissimilarity measure with a clustering method have also proved fruitful.
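To make the notion of a Strahler number concrete, the sketch below computes the classical Horton-Strahler number on a rooted tree only. It is not the generalization to arbitrary directed or undirected graphs used in [24]; the function name and the child-list representation are assumptions chosen for illustration.

```python
def strahler(tree, node):
    """Classical Strahler number of `node` in a rooted tree.

    `tree` maps each internal node to the list of its children;
    nodes absent from the mapping are treated as leaves.
    """
    children = tree.get(node, [])
    if not children:
        return 1  # a leaf has Strahler number 1
    values = sorted((strahler(tree, c) for c in children), reverse=True)
    # The value increases only when the maximum is reached by at least two children.
    if len(values) > 1 and values[0] == values[1]:
        return values[0] + 1
    return values[0]


# Hypothetical trees: a balanced binary tree, a chain and an unbalanced tree.
balanced = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
chain = {"r": ["x"], "x": ["y"]}
unbalanced = {"r": ["a", "leaf"], "a": ["a1", "a2"]}
```

The number grows with the amount of balanced branching below a node, which is what makes it usable as a measure of structural importance when deciding what to render first. The recursive form shown here is for readability; very deep trees would call for an iterative traversal.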
The development and full exploitation of combinatorics to feed all subprocesses of the visualization pipeline (Fig. 1) with emphasis on the data analysis part is at the heart of our project. The core strength of our team resides in the development of combinatorial mathematics and graph algorithmics to serve the aims of graph visualization. We deploy our mathematical and algorithmic skills in Information Visualization to develop:

Graph statistics: that capture key properties of the data, including scalable implementations;

Clustering methods: that handle large datasets both visually and computationally;

Graph hierarchies: that transform large graphs into a hierarchy of smaller, more readable and easier-to-manipulate substructures;

Graph drawing algorithms: that lay out large datasets rapidly, enhancing scalability and addressing domain-specific conventions and requirements;

Interactions: that exploit graph hierarchies as a central mechanism for navigating large graphs, while taking domain-specific tasks into account;

Evaluation methods: that generate artificial datasets (randomly) based on key properties of the target data.
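
As an illustration of the graph hierarchy item above, one level of such a hierarchy could be obtained by collapsing each cluster into a single metanode, yielding a quotient graph. The sketch below is a minimal example under that assumption; the function name, the cluster labels and the sample graph are hypothetical and do not describe a specific implementation of ours.

```python
def quotient_graph(edges, cluster_of):
    """Collapse clusters into metanodes.

    Returns {(c1, c2): count}, where count is the number of original
    edges joining cluster c1 to cluster c2 (intra-cluster edges are dropped).
    """
    meta_edges = {}
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        if cu == cv:
            continue  # the edge stays inside a single metanode
        key = (cu, cv) if cu <= cv else (cv, cu)  # undirected: normalize order
        meta_edges[key] = meta_edges.get(key, 0) + 1
    return meta_edges


# Hypothetical graph made of two triangles joined by two edges.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 5), (4, 5), (5, 6), (4, 6)]
cluster_of = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
meta = quotient_graph(edges, cluster_of)
```

Applying the same operation recursively to the quotient graph would give further levels, each smaller than the previous one, that a user can navigate by expanding or collapsing metanodes.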