Section: New Results
The Autonomous Computing SIG focuses on autonomic grids, identified as a promising field of TAO applications since 2006 in the context of EGEE (Enabling Grids for E-Science in Europe, infrastructure project: 2001-2003, 2003-2007, 2008-2010). Our involvement in EGEE resulted in the Grid Observatory project. Supported by EGEE, DIGITEO and CNRS, and concerned with digital asset curation for the grid, the Grid Observatory establishes long-term repositories of grid traces for current and future reference; these repositories will ultimately support the deployment of advanced management tools.
A first line of research concerns the exploratory analysis (clustering) of the job queries submitted to the grid. For traceability reasons, the exemplar-based Affinity Propagation (AP) approach was chosen; AP has been extended to StrAP, which features quasi-linear complexity and handles non-stationary distributions. A new result concerns the self-tuning of the change detection test involved in StrAP, through the optimization of the BIC criterion; this result will be extended to handle on-line process segmentation, an issue widely relevant to grid modeling since the stationarity hypothesis is clearly unrealistic. Another result concerns the large-scale assessment of StrAP on recent and extensive datasets.
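To make the role of a change detection test in a streaming setting concrete, here is a minimal sketch of a Page-Hinkley test, a standard sequential test for an upward shift in the mean of a stream. The choice of this particular test, the function name `page_hinkley`, and the fixed parameters `delta` and `threshold` are illustrative assumptions; the self-tuning through the BIC criterion described above is not reproduced here.

```python
def page_hinkley(stream, delta=0.05, threshold=5.0):
    """Return the zero-based index at which an upward mean shift is flagged.

    Illustrative sketch only: `delta` (tolerated drift) and `threshold`
    (alarm level) are hand-picked assumptions, not self-tuned values.
    Returns None when no change is flagged.
    """
    mean = 0.0      # running mean of the observations seen so far
    cum = 0.0       # cumulative deviation from the running mean
    cum_min = 0.0   # smallest cumulative deviation seen so far
    for t, x in enumerate(stream, start=1):
        mean += (x - mean) / t
        cum += x - mean - delta
        cum_min = min(cum_min, cum)
        # Alarm when the deviation has risen far above its historical minimum.
        if cum - cum_min > threshold:
            return t - 1
    return None


# Hypothetical stream: the mean jumps from 0 to 1 at index 50.
shifted = [0.0] * 50 + [1.0] * 50
change_index = page_hinkley(shifted)
```

In this sketch a larger `threshold` lowers the false alarm rate at the price of a longer detection delay, which is the kind of trade-off a tuning criterion has to settle.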
Another line of research aims at generative and interpretable models of the grid workload. First results based on the principled use of the MDL criterion establish that consistent grid workload models can be discovered from the empirical data; the model space of piecewise linear time series provides a satisfactory trade-off between accuracy and stability. Additionally, a bootstrapping strategy building more accurate models from limited samples is presented.
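The trade-off between accuracy and model complexity can be illustrated with a rough sketch of MDL-style model selection over piecewise linear fits. Everything here is a simplifying assumption: segments have equal length, the description length is approximated by a two-part code (a Gaussian residual cost plus a per-parameter cost), and the names `line_rss`, `description_length` and `select_segments` are hypothetical. It is not the procedure used in the work summarized above.

```python
import math


def line_rss(ys):
    """Residual sum of squares of the least-squares line through (i, ys[i])."""
    n = len(ys)
    x_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return sum((y - y_mean - slope * (x - x_mean)) ** 2
               for x, y in enumerate(ys))


def description_length(series, n_segments):
    """Approximate two-part code length of an equal-length piecewise linear fit."""
    n = len(series)
    size = n // n_segments
    rss = 0.0
    for s in range(n_segments):
        start = s * size
        end = n if s == n_segments - 1 else start + size
        rss += line_rss(series[start:end])
    n_params = 2 * n_segments  # one slope and one intercept per segment
    data_cost = 0.5 * n * math.log(max(rss, 1e-12) / n)  # cost of residuals
    model_cost = 0.5 * n_params * math.log(n)            # cost of parameters
    return data_cost + model_cost


def select_segments(series, candidates=(1, 2, 4, 8)):
    """Pick the candidate segment count with the shortest description."""
    return min(candidates, key=lambda k: description_length(series, k))


# Hypothetical workload trace: a rise then a fall, with small alternating noise.
trace = [(t if t < 20 else 40 - t) + 0.1 * (-1) ** t for t in range(40)]
best = select_segments(trace)
```

On this synthetic trace a single line fits poorly, while four or eight segments barely reduce the residuals and pay for extra parameters, so the two-segment model yields the shortest description.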
Model-Free Scheduling Policy
The huge computational needs of e-science can be supported either by grid architectures (enabling hardware and software sharing) or by cloud computing (enabling dynamic resource provisioning, also known as elastic computing). Large scientific collaborations critically depend on organized resource sharing, which governs the responsiveness of the system and ultimately its everyday use.
A model-free resource provisioning strategy supporting both of the above scenarios has been devised, implemented and validated. The provisioning objective is formalized as a multi-objective reinforcement learning problem over continuous state and action spaces, in which the high-level goals of users, administrators, and shareholders are modeled through simple utility functions.
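As a purely illustrative sketch of how several goals can be expressed through simple utility functions and combined into one reinforcement learning reward, the snippet below uses a weighted sum of two utilities. The specific utilities (responsiveness for users, resource utilization for administrators), the weighted-sum scalarization, and all names and parameters are assumptions made for this example; the source does not specify them.

```python
def responsiveness_utility(wait_time, deadline):
    """Hypothetical user-side utility: 1 if served at once, 0 at or past the deadline."""
    return max(0.0, 1.0 - wait_time / deadline)


def utilization_utility(busy_nodes, total_nodes):
    """Hypothetical administrator-side utility: fraction of nodes doing useful work."""
    return busy_nodes / total_nodes


def reward(wait_time, deadline, busy_nodes, total_nodes, weight=0.5):
    """Scalar reward as a weighted sum of the two utilities (assumed scalarization)."""
    return (weight * responsiveness_utility(wait_time, deadline)
            + (1.0 - weight) * utilization_utility(busy_nodes, total_nodes))


# A job that waited half of its deadline on a half-loaded cluster.
example_reward = reward(wait_time=5, deadline=10, busy_nodes=5, total_nodes=10)
```

A learning agent would receive such a reward after each provisioning decision; changing `weight` shifts the balance between responsiveness and utilization.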
On the e-science application side, a grid-enabled study in collaboration with INSERM U525 (Génétique épidémiologique et moléculaire des pathologies cardiovasculaires) identified the SLC22A3-LPAL2-LPA gene cluster as a strong susceptibility locus for coronary artery disease; this result has been published in Nature Genetics. The collaboration has been resumed using the StrAP algorithm.