Section: New Results
Abstract: The TAO group is also historically involved in other applications of either Machine Learning or Evolutionary Computation that are not directly linked to its main streams of research. They are surveyed below.
Text Mining (TM) is concerned with exploiting and transforming documents to achieve particular tasks. The difficulty lies in keeping a delicate balance between texts, transformations and tasks. Solving a problem presupposes cognitive entities, called concepts of specialty, that are needed to carry out the tasks at hand.
A key data preparation step in Text Mining, Term Extraction selects the terms, or collocations of words, attached to specific concepts. The task of extracting relevant collocations can be achieved through a supervised learning algorithm that exploits a few collocations manually labeled as relevant or irrelevant. In Thomas Heitz's PhD work, an evolutionary learning algorithm (Roger, see section 6.1.1), based on the optimization of the Area under the ROC curve criterion, learns a ranking of the candidate terms. The robustness of the approach was demonstrated on two real-world applications, covering different domains (biology and human resources) and different languages (English and French) (Jérôme Azé, Mathieu Roche, Yves Kodratoff, Michèle Sebag. Preference Learning in Terminology Extraction: A ROC-based approach, in: "ASMDA'05", 2005.). All details will be available in Thomas Heitz's forthcoming PhD dissertation.
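To make the Area under the ROC curve criterion concrete, the following minimal sketch computes it for a scored list of candidate terms and uses it as the fitness of a scoring function. It is an illustration only, not the Roger implementation: the linear scoring function and the names `auc`, `linear_score` and `fitness` are assumptions introduced here.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve, computed as the Wilcoxon-Mann-Whitney statistic.

    It is the fraction of (relevant, irrelevant) pairs in which the relevant
    term receives the higher score; ties count for one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one relevant and one irrelevant example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def linear_score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Hypothetical scoring function: a weighted sum of term features."""
    return sum(w * x for w, x in zip(weights, features))


def fitness(weights: Sequence[float],
            examples: Sequence[Sequence[float]],
            labels: Sequence[int]) -> float:
    """Fitness an evolutionary algorithm could maximize over the weights."""
    return auc([linear_score(weights, f) for f in examples], labels)
```

Sorting the candidate terms by decreasing score then gives the order on the terms; an AUC of 1 means every labeled relevant collocation is ranked above every irrelevant one.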
After organizing the first edition, the team organized the second French text-mining challenge (DEFT: DÉfi Fouille de Textes), which consisted of topic segmentation on French corpora built for the occasion. It took place as a workshop of the SDN'06 conference, with twenty-five participants belonging to seven French-speaking teams. The results by the participating teams are presented in  .
Large-Eddy Simulations efficiently provide the large eddies of the flow, but not the Filtered Density Function (FDF), which is required for evaluating some quantities of interest. As the FDF is well approximated by a two-parameter family of distributions, namely the β-distributions, only these two parameters have to be estimated from the large eddies. We compared in  (i) various sets of variables extracted from the large eddies and (ii) various approaches for estimating these parameters from these sets of variables. We conclude that universally consistent non-parametric learning tools, such as neural networks, are efficient both in theory and in practice.
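As an illustration of what "two parameters" means here, the sketch below assumes that the family is the beta family with shape parameters a and b, and recovers them from a mean and a variance by the method of moments. This is a textbook identity given for orientation, not the estimation procedure compared in the study; the function name `beta_from_moments` is hypothetical.

```python
from typing import Tuple


def beta_from_moments(mean: float, variance: float) -> Tuple[float, float]:
    """Shape parameters (a, b) of the beta distribution on [0, 1]
    that has the given mean and variance (method of moments)."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    max_variance = mean * (1.0 - mean)
    if not 0.0 < variance < max_variance:
        raise ValueError("variance must lie strictly between 0 and mean * (1 - mean)")
    common = max_variance / variance - 1.0  # equals a + b
    return mean * common, (1.0 - mean) * common


def beta_moments(a: float, b: float) -> Tuple[float, float]:
    """Mean and variance of a beta distribution, for a round-trip check."""
    total = a + b
    return a / total, a * b / (total * total * (total + 1.0))
```

Under this assumption, a learned regressor only has to map the variables extracted from the large eddies to two numbers (either the shape parameters or the mean and variance), from which the whole distribution follows.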
Scheduling problems are a known success area for Evolutionary Algorithms (EAs). The French Railways (SNCF) were interested in finding out whether they could benefit from EAs to tackle the problem of rescheduling trains after an incident has perturbed one or more trains. At the moment, their production units use commercial software (CPlex from Ilog), and they experience serious difficulties when several incidents occur simultaneously on large networks.
After some significant results on full-size problems (Y. Semet and M. Schoenauer. An efficient memetic, permutation-based evolutionary algorithm for real-world train timetabling. In B. McKay et al., Eds, Proc. CEC'05, pp 661-667, IEEE Press, 2005), showing that EAs can indeed give better and faster results than CPlex thanks to a very specialized "scheduler", the contract was renewed for an additional year by SNCF, and ended in May 2006. Yann Semet was thus able to polish his algorithm, and to further increase the performance of EAs by inoculating good solutions into the initial population. This work also allowed him to start identifying the type of problems where EAs are a better alternative than CPlex  . Note also that this work was mentioned in two publications in Rail et Recherche, the SNCF internal magazine.
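The inoculation idea can be sketched as follows for a permutation-based representation. This is a minimal sketch under stated assumptions, not Yann Semet's algorithm: individuals are taken to be permutations of item indices, and the function name `inoculated_population` and its parameters are hypothetical.

```python
import random
from typing import List, Sequence


def inoculated_population(seeds: Sequence[Sequence[int]],
                          n_items: int,
                          pop_size: int,
                          rng: random.Random) -> List[List[int]]:
    """Build an initial population of permutations of range(n_items).

    Known good solutions (the seeds) are copied in first; the remaining
    slots are filled with uniformly random permutations to keep diversity.
    """
    population = [list(seed) for seed in seeds[:pop_size]]
    while len(population) < pop_size:
        individual = list(range(n_items))
        rng.shuffle(individual)
        population.append(individual)
    return population
```

A typical call would pass a handful of solutions produced by a fast heuristic as `seeds`. Keeping most of the population random is a common design choice to avoid premature convergence around the inoculated solutions.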
Time-dependent planning with bounded resources
A one-year contract (2004-2005) between TAO and Thales (Land & Vision division) was concerned with temporal planning with limited resources. Two approaches were tried: on the one hand, coupling Evolutionary Algorithms on the global scale with Constraint Programming to solve local (hopefully small) problems; on the other hand, using Petri nets to represent partial plans.
Only the first approach succeeded, and feasibility results for the TGV approach were obtained during the contract (which ended in July 2005). Collaboration continues with Pierre Savéant (Thales) and Vincent Vidal (Université de Lens), as this approach yielded breakthrough results: it is the first Pareto approach for multi-objective temporal planning problems  . A CIFRE PhD position on this topic, funded by Thales, has recently been posted.
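For readers unfamiliar with the term, a Pareto approach returns the set of non-dominated trade-offs rather than a single plan. The sketch below shows the dominance test and the resulting front for objective vectors to be minimized. The choice of objectives in the comments (for instance makespan and cost) and the function names are assumptions made for illustration; the planner itself is not shown.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective (minimization)
    and strictly better on at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points: Sequence[Objectives]) -> List[Objectives]:
    """Keep the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical plans described by (makespan, cost): the plan (9, 9) is
# dominated by (8, 7), while the other three are incomparable trade-offs.
plans = [(10, 5), (8, 7), (9, 9), (12, 4)]
front = pareto_front(plans)
```

This quadratic filter is adequate for small populations; evolutionary multi-objective algorithms generally rely on faster non-dominated sorting procedures.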
Multi-disciplinary multi-objective optimization
The TAO team is involved, for its expertise in surrogate models, in the RNTL project on Multi-disciplinary Optimization (OMD) coordinated by Rodolphe Leriche (Ecole des Mines de St-Etienne). This will most probably lead to work in different application domains. Independently (even though Renault is also a partner of the OMD consortium), Claire Le Baron started a CIFRE PhD in September 2005 funded by the automobile company Renault. The (ambitious) original goal of the PhD is to optimize the complete engine of a car, thus involving structural mechanics, vibration and acoustics, combustion and thermal engineering. The thesis is now focusing, however, on comparing a wide range of approaches that trade off between solving a faithful but complex optimization problem (using Evolutionary Algorithms) and solving an easy but possibly less significant optimization problem using standard numerical methods.