Section: New Results
Crossing the Chasm
Participants: Alejandro Arbelaez, Anne Auger, Robert Busa-Fekete, Luis Da Costa, Alvaro Fialho, Nikolaus Hansen, Balázs Kégl, Marc Schoenauer, Michèle Sebag.
Many forefront techniques in both Machine Learning and Stochastic Search have been very successful in solving difficult real-world problems. However, their application to either newly encountered problems, or to new instances of known problems, remains a challenge for experienced researchers of the field, and even more so for scientists and engineers from other areas. No theory- or practice-based guidance is available to help these techniques cross the chasm (Moore, G. A.: Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers. Collins Business Essentials (1991)). The difficulties faced by users concern the selection of an appropriate algorithm from a portfolio, the tuning of algorithm parameters, and the lack of a robust selection methodology. Moreover, state-of-the-art approaches to real-world problems tend to be bespoke, problem-specific methods which are expensive to develop and maintain.
The “Crossing the Chasm” research theme is structured around the joint MSR-INRIA project Automatic Parameter Tuning, in collaboration with Youssef Hamadi from Microsoft Research Cambridge (A. Fialho's and A. Arbelaez's PhDs), the EvoTest STREP project, where TAO is in charge of the automatic generation of the Evolutionary Engine, and the ANR Metamodel project.
A longer-term goal relevant to all of the above-mentioned projects concerns the design of order parameters accurately reflecting the problem instances at hand. Such order parameters, or meta-descriptors, would enable systematic experiments, supporting the identification of the appropriate algorithms/parameters conditionally on the order parameter values, as shown in the relational learning (M. Botta, A. Giordana, L. Saitta, and M. Sebag: Relational Learning as Search in a Critical Region. Journal of Machine Learning Research, 2003, pp. 431–463) and SAT (F. Hutter, Y. Hamadi, H. H. Hoos, and K. Leyton-Brown: Performance Prediction and Automated Tuning of Randomized and Parametric Algorithms. CP'2006) domains.
Adaptive Operator Selection
Alvaro Fialho's PhD focuses on Online Parameter Tuning, specifically on selecting among the different variation operators in Evolutionary Algorithms. The proposed approach is inspired by Multi-Armed Bandit algorithms (where each operator is viewed as an arm), and it additionally involves a statistical change detection test to deal with the non-stationary context of evolution. Another contribution is based on the use of extreme values, as opposed to average ones, to compute the operator reward: in quite a few design contexts, rare events are acknowledged to be more consequential than average ones. This approach was validated in 2008 on the OneMax problem, where the “ground truth” is known; new extensions in 2009 concern:

the validation of the extreme-value based reward on the k-path and Royal Road problems [51], [52];

the hybridization of the above ideas with a population diversity criterion, in collaboration with the University of Angers [61];

a sliding-window extension of the Exploration vs. Exploitation scheme underlying the MAB algorithm [11].
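The bandit-based scheme above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the class and parameter names are invented, and the extreme-value credit and sliding window are reduced to their simplest forms.

```python
import math

class ExtremeValueBandit:
    """Hypothetical sketch of UCB-style adaptive operator selection.

    Each variation operator is an arm; the reward credited to an operator
    is the *maximum* fitness improvement seen in a sliding window of its
    recent applications, rather than the average (extreme-value credit).
    """

    def __init__(self, n_ops, window=50, c=1.0):
        self.n_ops = n_ops
        self.window = window                       # sliding-window length
        self.c = c                                 # exploration weight
        self.rewards = [[] for _ in range(n_ops)]  # recent improvements per op
        self.counts = [0] * n_ops

    def select(self):
        """Pick the operator maximizing the UCB-style index."""
        total = sum(self.counts) + 1
        best, best_score = 0, -float("inf")
        for op in range(self.n_ops):
            if self.counts[op] == 0:
                return op                          # play each arm once first
            reward = max(self.rewards[op])         # extreme value, not mean
            score = reward + self.c * math.sqrt(
                2.0 * math.log(total) / self.counts[op])
            if score > best_score:
                best, best_score = op, score
        return best

    def update(self, op, improvement):
        """Credit operator `op` with the fitness improvement it produced."""
        self.counts[op] += 1
        self.rewards[op].append(improvement)
        if len(self.rewards[op]) > self.window:    # drop stale observations
            self.rewards[op].pop(0)
```

A restart mechanism driven by a statistical change detection test (as in the actual work) would reset these statistics when the reward distribution shifts; that part is omitted here.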
Learning Heuristics Choice in Constraint Programming
Alejandro Arbelaez' PhD focuses on the Online Selection of Heuristics for Constraint Programming. The idea is to use idle computer time to explore the search space, thus gathering new resolution paths. Each such path is labelled with its overall computational cost, and the whole set of paths (the training set) is exploited by supervised learning algorithms to discriminate the best heuristics depending on the current state of the search (described by static and dynamic order parameters, e.g. domain sizes and constraint “hardness”) [23].
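A toy illustration of the idea, under the assumption that each explored path is reduced to a feature vector of order parameters and labelled with its cost; the names and the 1-nearest-neighbour learner are ours, chosen for brevity, whereas the actual work relies on full-fledged supervised learning algorithms:

```python
import math

def train_model(paths):
    """Build a training set from idle-time exploration: `paths` is a list
    of (features, cost, heuristic) records; for each search-state feature
    vector we keep the heuristic that achieved the lowest cost."""
    best = {}
    for feats, cost, heur in paths:
        key = tuple(feats)
        if key not in best or cost < best[key][0]:
            best[key] = (cost, heur)
    return [(list(k), h) for k, (_, h) in best.items()]

def choose_heuristic(model, feats):
    """1-nearest-neighbour choice of a heuristic for a new search state,
    described by e.g. domain sizes and constraint hardness."""
    return min(model, key=lambda ex: math.dist(ex[0], feats))[1]
```

For instance, a state whose features resemble those where "min-domain" was cheapest would be routed to that heuristic at runtime.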
Bandit-Based Boosting
The goal of this project is to apply multi-armed bandits (MABs) to accelerate AdaBoost. AdaBoost constructs a strong classifier in a stepwise fashion by selecting simple base classifiers and using their weighted “vote” to determine the final classification. We model this stepwise base classifier selection as a sequential decision problem, and optimize it with MABs. Each arm represents a subset of the base classifier set. The MAB gradually learns the “utility” of the subsets, and selects one of them in each iteration. AdaBoost then searches only this subset instead of optimizing the base classifier over the whole space. We investigate the feasibility of both adversarial and stochastic bandits. Since the boosting setup is inherently adversarial, we can prove weak-to-strong learning boosting theorems only in this framework. On the other hand, stochastic bandits such as UCB and UCT are more versatile: they can exploit existing structure in the weak classifier space, making bandit-based optimization more efficient in the case of trees or products [57]. Our experiments on benchmark data sets suggest that AdaBoost can be accelerated by an order of magnitude using MABs without significant loss of performance. This computational improvement made it possible to be competitive in a large-scale ML task, the KDD Cup 2009 [41].
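The following sketch illustrates the general mechanism: a UCB-style stochastic bandit chooses which subset of base classifiers (here decision stumps, one family per feature) AdaBoost searches at each round. It is a hypothetical simplification with invented names, using the round's edge as the arm reward; it is not the algorithm of [57].

```python
import math

def bandit_adaboost(X, y, subsets, rounds, c=1.0):
    """Sketch: AdaBoost where a UCB bandit picks, at each boosting round,
    which subset of base classifiers (decision stumps) gets searched.
    Each entry of `subsets` (a list of feature indices) is one arm."""
    n = len(y)
    w = [1.0 / n] * n                          # AdaBoost example weights
    counts = [0] * len(subsets)                # bandit arm statistics
    rewards = [0.0] * len(subsets)
    model = []                                 # (alpha, feature, threshold)
    for t in range(1, rounds + 1):
        # UCB arm selection: play each arm once, then use the index bound
        arm = next((a for a in range(len(subsets)) if counts[a] == 0), None)
        if arm is None:
            arm = max(range(len(subsets)),
                      key=lambda a: rewards[a] / counts[a]
                      + c * math.sqrt(2.0 * math.log(t) / counts[a]))
        # search only the chosen subset for the lowest weighted-error stump
        best = None
        for f in subsets[arm]:
            for thr in sorted({x[f] for x in X}):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if (1 if xi[f] > thr else -1) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr)
        err, f, thr = best
        err = min(max(err, 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, f, thr))
        # reweight the examples; credit the arm with the round's edge
        for i, (xi, yi) in enumerate(zip(X, y)):
            pred = 1 if xi[f] > thr else -1
            w[i] *= math.exp(-alpha * yi * pred)
        s = sum(w)
        w = [wi / s for wi in w]
        counts[arm] += 1
        rewards[arm] += 0.5 - err

    def classify(x):
        """Weighted vote of the selected stumps."""
        score = sum(a * (1 if x[f2] > thr2 else -1) for a, f2, thr2 in model)
        return 1 if score >= 0 else -1

    return classify
```

The speed-up comes from the inner search looping over one subset per round rather than over the full base classifier space, while the bandit index steers rounds toward the subsets that have yielded the largest edges.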