Section: New Results
Crossing the Chasm
Many forefront techniques in both Machine Learning and Stochastic Search have been very successful in solving difficult real-world problems. However, their application to newly encountered problems, or to new instances of known problems, remains a challenge for experienced researchers in the field, and even more so for scientists and engineers from other areas. Little theory- or practice-based guidance is available to help these techniques cross the chasm (Moore, G.A.: Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers. Collins Business Essentials (1991)). The difficulties faced by users concern the selection of an appropriate algorithm from a portfolio, the tuning of algorithm parameters, and the lack of a robust selection methodology. Moreover, state-of-the-art approaches to real-world problems tend to be bespoke, problem-specific methods that are expensive to develop and maintain.
The “Crossing the Chasm” research theme is structured around the joint MSR-INRIA project Automatic Parameter Tuning, in collaboration with Youssef Hamadi from Microsoft Research Cambridge (A. Fialho's and A. Arbelaez's PhDs); the EvoTest STREP project, where TAO is in charge of the automatic generation of the Evolutionary Engine; and the ANR Metamodel project.
A longer-term goal relevant to all of the above projects concerns the design of order parameters that accurately reflect the problem instances at hand. Such order parameters, or meta-descriptors, would enable systematic experiments, supporting the identification of the appropriate algorithms/parameters conditioned on the order-parameter values, as shown in the relational learning (M. Botta, A. Giordana, L. Saitta, and M. Sebag: Relational Learning as Search in a Critical Region. Journal of Machine Learning Research, 2003, pp. 431–463) and SAT (F. Hutter, Y. Hamadi, H.H. Hoos, and K. Leyton-Brown: Performance Prediction and Automated Tuning of Randomized and Parametric Algorithms. CP'2006) domains.
Adaptive Operator Selection
Alvaro Fialho's PhD focuses on Online Parameter Tuning, specifically on selecting among the different variation operators in Evolutionary Algorithms. The proposed approach is inspired by Multi-Armed Bandit (MAB) algorithms (where each operator is viewed as an arm), and it additionally involves a statistical change-detection test to deal with the non-stationary context of evolution. Another contribution is the use of extreme values, as opposed to average ones, to compute the operator reward: in quite a few design contexts, rare events are acknowledged to be more consequential than average ones. This approach was validated in 2008 on the OneMax problem, where the “ground truth” is known; new extensions in 2009 concern:
the hybridization of the above ideas with a population diversity criterion, in collaboration with the University of Angers;
a sliding-window extension of the Exploration vs. Exploitation scheme underlying the MAB algorithm.
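The bandit view of operator selection can be sketched as follows. This is a minimal illustration, not the actual implementation: the function names, the UCB-style exploration bonus, and the extreme-value credit scheme over a recent window are all illustrative assumptions.

```python
import math

def select_operator(n_ops, counts, rewards, c=1.0):
    """UCB-style selection: pick the operator (arm) maximizing its
    empirical reward plus an exploration bonus (hypothetical sketch)."""
    total = sum(counts)
    best, best_score = 0, -float("inf")
    for op in range(n_ops):
        if counts[op] == 0:
            return op  # try each operator at least once
        score = rewards[op] + c * math.sqrt(2 * math.log(total) / counts[op])
        if score > best_score:
            best, best_score = op, score
    return best

def extreme_credit(window):
    """Extreme-value credit assignment: reward an operator with the best
    fitness improvement observed in its recent window, not the average."""
    return max(window) if window else 0.0
```

When all operators have been tried equally often, the exploration bonuses coincide and the selection reduces to picking the operator with the best (extreme-value) reward; untried operators are always pulled first.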
Learning Heuristics Choice in Constraint Programming
Alejandro Arbelaez's PhD focuses on the online selection of heuristics for Constraint Programming. The idea is to use idle computer time to explore the search space, thus gathering new resolution paths. Each such path is labelled with its overall computational cost, and the whole set of paths (the training set) is exploited by supervised learning algorithms to discriminate the best heuristics depending on the current state of the search (described by static and dynamic order parameters, e.g. domain sizes and constraint “hardness”).
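The training pipeline described above can be sketched in miniature as follows. This is a hypothetical toy version: the feature vectors, heuristic names, and the 1-nearest-neighbour learner stand in for the actual order parameters and supervised learning algorithms used in the project.

```python
def build_training_set(paths):
    """paths: list of (features, heuristic, cost) gathered during idle-time
    exploration; keep, per search state, the heuristic with the lowest
    observed resolution cost (hypothetical labelling scheme)."""
    best = {}
    for feats, heur, cost in paths:
        key = tuple(feats)
        if key not in best or cost < best[key][1]:
            best[key] = (heur, cost)
    return [(list(k), h) for k, (h, _) in best.items()]

def predict_heuristic(train, feats):
    """1-nearest-neighbour prediction on the search-state descriptors
    (e.g. domain sizes, constraint hardness)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda ex: dist(ex[0], feats))[1]
```

At resolution time, the learned model maps the current search-state descriptors to the heuristic that proved cheapest on similar states during exploration.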
Accelerating AdaBoost with Multi-Armed Bandits
The goal of this project is to apply multi-armed bandits (MABs) to accelerate AdaBoost. AdaBoost constructs a strong classifier in a stepwise fashion by selecting simple base classifiers and using their weighted “vote” to determine the final classification. We model this stepwise base-classifier selection as a sequential decision problem, and optimize it with MABs. Each arm represents a subset of the base classifier set. The MAB gradually learns the “utility” of the subsets, and selects one of them in each iteration. AdaBoost then searches only this subset instead of optimizing the base classifier over the whole space. We investigate the feasibility of both adversarial and stochastic bandits. Since the boosting setup is inherently adversarial, we can prove weak-to-strong learning boosting theorems only in this framework. On the other hand, stochastic bandits such as UCB and UCT are more versatile: they can exploit existing structure in the weak classifier space, making bandit-based optimization more efficient in the case of trees or products. Our experiments on benchmark data sets suggest that AdaBoost can be accelerated by an order of magnitude using MABs without significant loss of performance. This computational improvement made it possible to be competitive in a large-scale ML task, the KDD 2009 cup.
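The interplay between the bandit and the boosting loop can be sketched as follows. This is a minimal toy version under stated assumptions: arms are subsets of feature indices, base classifiers are decision stumps, the bandit is a plain UCB, and the reward is the chosen stump's edge over random guessing — a simplified stand-in for the project's actual credit scheme.

```python
import math

def stump_predict(x, feat, thresh, sign):
    # Decision stump: predict +sign above the threshold, -sign below.
    return sign if x[feat] > thresh else -sign

def adaboost_mab(X, y, arms, T=10):
    """Hypothetical sketch: each arm is a subset of feature indices.
    At every boosting round a UCB bandit picks one arm, and the base
    classifier (a stump) is searched only over that subset."""
    n = len(X)
    w = [1.0 / n] * n                 # AdaBoost example weights
    counts = [0] * len(arms)          # bandit statistics
    values = [0.0] * len(arms)
    model = []
    for _ in range(T):
        if 0 in counts:               # pull each arm once first
            a = counts.index(0)
        else:
            tot = sum(counts)
            a = max(range(len(arms)),
                    key=lambda i: values[i]
                    + math.sqrt(2 * math.log(tot) / counts[i]))
        # search the best stump over the selected feature subset only
        best = None
        for feat in arms[a]:
            for thresh in sorted({x[feat] for x in X}):
                for sign in (1, -1):
                    err = sum(wi for wi, x, yi in zip(w, X, y)
                              if stump_predict(x, feat, thresh, sign) != yi)
                    if best is None or err < best[0]:
                        best = (err, feat, thresh, sign)
        err, feat, thresh, sign = best
        err = max(err, 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, feat, thresh, sign))
        # bandit reward: the stump's edge over random guessing (assumed)
        counts[a] += 1
        values[a] += ((0.5 - err) - values[a]) / counts[a]
        # standard AdaBoost reweighting of the examples
        w = [wi * math.exp(-alpha * yi * stump_predict(x, feat, thresh, sign))
             for wi, x, yi in zip(w, X, y)]
        s = sum(w)
        w = [wi / s for wi in w]
    return model

def predict(model, x):
    vote = sum(a * stump_predict(x, f, t, s) for a, f, t, s in model)
    return 1 if vote > 0 else -1
```

The speed-up comes from the inner stump search scanning only the features of the chosen arm rather than the whole feature space; the bandit concentrates its pulls on subsets whose stumps have historically achieved large edges.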