Section: New Results
Abstract: Autonomous robotics is a challenging application for both machine learning and evolutionary computation. From a machine learning perspective, robotics requires extending machine learning algorithms beyond the classical assumption of independent and identically distributed examples (the robot's exploration indeed yields data that are neither independent nor identically distributed). Furthermore, an autonomous robot is immersed in a noisy and dynamic environment where its actions are bound to modify its very perception of the environment (the perception-action loop). Last, robotics perfectly illustrates learnability issues: high-definition sensors result in a very high-dimensional instance space, which hinders the learning task; but low-definition sensors result in "perceptual aliasing": the robot perceives different places, e.g. corners in a maze, as being the same.
Controller Optimization and Modelling
In robotics, control architectures are either explicitly modelled (e.g. the subsumption architecture) or learned (e.g. evolutionary robotics).
From human to robot behavior
In collaboration with Lutins (U. Paris 8 - Cognitive Psychology) within a joint project, Vincent Besson's M2 internship focused on modelling a control architecture from the observed behaviors of humans involved in an exploration task with reduced perceptual capabilities (i.e. no vision). The control architecture was then implemented on a simulated robot so as to reproduce these experiments in silico. The resulting architecture is based on the traditional subsumption architecture, extended to deal with several contexts. The simulated experiments showed behaviors in silico comparable to those observed in vivo with human subjects, thus providing insightful results for psychologists.
In the field of automatic design of controllers, Nicolas Godzik's PhD (defended in September 2005) focused on improving neural-network-based controllers. Because the rewards for the robot's actions are delayed, sparse and noisy (e.g., due to perceptual aliasing), and the state space is huge, Evolutionary Robotics, pioneered by Nolfi and Floreano (2000), offers an alternative to reinforcement learning. Along this line, the robot controller is searched for by evolutionary computation: the goal at hand (e.g., exploring the arena, reaching a specific location in a maze, or combining exploration with the search for a source of energy) is formalized as the optimization of a manually designed fitness function. A central issue for Evolutionary Robotics is to decide whether the search space is discrete (as for classifier systems) or continuous (using, e.g., neural nets as the search space). A hybrid approach called "Symbolic Controllers" was proposed (N. Godzik and M. Schoenauer and M. Sebag, Evolving Symbolic Controllers, in G. Raidl et al., eds, Applications of Evolutionary Computing, pp 638-650, LNCS 2611, Springer Verlag, 2003). It involves continuous inputs and high-level outputs (discrete outputs that are transformed into continuous commands for the actuators using either hand-made or previously learned behaviors). Compared to the now standard Action Selection framework, this approach makes it possible to gradually construct and exploit libraries of hierarchical actions, improving the scalability of the approach toward more complex target behaviours.
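As an illustration of the idea of continuous inputs and discrete high-level outputs, the following minimal Python sketch scores a small library of behaviours from the sensor values, selects the best one, and translates it into wheel commands. Everything in it is an assumption made for the example and not the method of the cited paper: the behaviour names, the wheel speeds, the linear scoring, the (1+1) evolution strategy and the toy fitness are all hypothetical.

```python
import random

# Hypothetical library of high-level behaviours, each mapped to
# (left, right) wheel speeds. In the text these could be hand-made
# or previously learned; here they are fixed constants.
BEHAVIOURS = {
    "forward": (1.0, 1.0),
    "turn_left": (-0.5, 0.5),
    "turn_right": (0.5, -0.5),
}
NAMES = list(BEHAVIOURS)


def symbolic_controller(weights, sensors):
    """Score each behaviour with a linear unit; return the best behaviour's name."""
    scores = [sum(w * s for w, s in zip(row, sensors)) for row in weights]
    return NAMES[scores.index(max(scores))]


def act(weights, sensors):
    """Translate the selected discrete behaviour into continuous wheel commands."""
    return BEHAVIOURS[symbolic_controller(weights, sensors)]


def evolve(fitness, n_sensors, generations=50, sigma=0.3, seed=0):
    """(1+1) evolution strategy over the weight matrix; `fitness` is maximised."""
    rng = random.Random(seed)
    parent = [[rng.gauss(0, 1) for _ in range(n_sensors)] for _ in NAMES]
    best = fitness(parent)
    for _ in range(generations):
        child = [[w + rng.gauss(0, sigma) for w in row] for row in parent]
        score = fitness(child)
        if score >= best:  # keep the child if it is at least as good
            parent, best = child, score
    return parent, best


# Toy fitness (an assumption): number of hand-labelled situations in which
# the controller picks the expected behaviour. Sensors are (left, right) readings.
CASES = [([0.9, 0.1], "turn_right"), ([0.1, 0.9], "turn_left"), ([0.0, 0.0], "forward")]


def toy_fitness(weights):
    return sum(symbolic_controller(weights, s) == b for s, b in CASES)
```

Calling evolve(toy_fitness, n_sensors=2) returns a weight matrix together with its fitness. In a real setting the fitness would instead come from running the controller in a simulator or on the robot.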
In a similar framework, Agustin Martinez, a graduate student from the University of Buenos Aires (Argentina), spent 3 months in TAO and studied the influence of the milieu on the results of evolution: even for the simple problem of obstacle avoidance, punctuated-equilibrium-like phenomena can take place provided the milieu is complex enough.
Another issue is the generality of the robotic controller and its adaptation to environments that differ from those encountered during the (limited) training period. In collaboration with LIMSI (Robea project), an architecture combining three functionalities has been proposed (N. Godzik and M. Schoenauer and M. Sebag, Robustness in the long run: Auto-teaching vs Anticipation in Evolutionary Robotics. In X. Yao et al., Eds, Proc. PPSN VIII, pp 932-941, LNCS 3242, Springer Verlag, 2004): action selection, anticipation (predicting the next robot sensations based on the selected action) and adaptation (reacting to the difference between the predicted sensations and those actually encountered by the robot). This architecture, inspired by cognitive science and sensory-motor contingency, contains a model of the world (the anticipation module) which can be compared with the world at no cost and provides hints about the needed adaptation. Interestingly, this approach was shown to be much more robust in the long run than the competing auto-teaching approach proposed by Nolfi and Parisi.
Cedric Hartland built on this architecture during his Master's thesis: he used anticipation to automatically calibrate a controller learned in a simulated world when it is transferred to a real robot. He is now continuing this work in his PhD, started in September 2005.
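The interplay of action selection, anticipation and adaptation described above can be pictured with the following minimal Python sketch. It is only an illustration under stated assumptions: the linear anticipation model, the delta-rule correction, the placeholder action rule and the toy world are all hypothetical and are not taken from the cited papers. In particular, the sketch adapts the anticipation model itself, which is just one possible way of reacting to the prediction error.

```python
def select_action(sensors):
    """Action selection: placeholder rule based on which reading is larger."""
    return 1.0 if sensors[0] > sensors[1] else -1.0


def predict(model, sensors, action):
    """Anticipation: linear guess of the next sensor vector."""
    x = sensors + [action, 1.0]  # inputs: sensors, action, bias
    return [sum(w * xi for w, xi in zip(row, x)) for row in model]


def adapt(model, sensors, action, observed, rate=0.1):
    """Adaptation: delta-rule correction driven by the prediction error."""
    x = sensors + [action, 1.0]
    predicted = predict(model, sensors, action)
    errors = [o - p for o, p in zip(observed, predicted)]
    new_model = [
        [w + rate * e * xi for w, xi in zip(row, x)]
        for row, e in zip(model, errors)
    ]
    return new_model, errors


def toy_world(sensors, action):
    """Hypothetical environment dynamics, unknown to the robot."""
    return [0.5 * sensors[0] + 0.1 * action, 0.5 * sensors[1] - 0.1 * action]


def run(steps=200):
    """Run the loop and return the absolute prediction error at each step."""
    sensors = [1.0, 0.2]
    model = [[0.0] * 4 for _ in range(2)]  # 2 outputs x (2 sensors + action + bias)
    surprise = []
    for _ in range(steps):
        action = select_action(sensors)
        observed = toy_world(sensors, action)
        model, errors = adapt(model, sensors, action, observed)
        surprise.append(sum(abs(e) for e in errors))
        sensors = observed
    return surprise
```

The list returned by run() starts with a large prediction error and shrinks as the model is corrected. A sudden rise of this error after deployment would signal that the world has changed, which is the kind of hint the anticipation module is meant to provide.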
Fuzzy controllers and a priori rules
An alternative to neural networks as controllers for autonomous robots is the use of fuzzy controllers. Starting from function approximation using Voronoi diagrams (see section 6.4.1), Carlos Kavka, a student at the University of San Luis (Argentina) working under the supervision of Marc Schoenauer, developed an evolutionary system to design fuzzy systems in inverse problems. This approach is original in the fuzzy system domain because the algorithm evolves rules whose domain is not restricted to hyper-rectangles, as in most previous work. Moreover, this approach makes it possible to define a priori rules (e.g., for an obstacle avoidance problem for an autonomous robot, "Go forward if there is no obstacle ahead") without strictly defining their domain of application: the system can restrict or expand such a rule depending on the other rules it finds useful. The application to evolutionary robotics (C. Kavka and M. Schoenauer. Evolution of Voronoi-based Fuzzy Controllers. In Xin Yao et al., eds, PPSN'04, LNCS 3242, Springer Verlag, 2004) demonstrated the efficiency of the proposed approach for fuzzy system design. Recently, this work was extended to recurrent fuzzy systems, for tasks where some memory is required (e.g. a light on the right-hand side of the corridor indicates that the robot must turn right in the next corridor).
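To make the notion of rule domains that are not hyper-rectangles concrete, here is a minimal Python sketch in which each rule is a site in input space with an associated output, and the domain of a rule is the Voronoi cell of its site. This is an illustrative assumption, not the representation of the cited paper: the sensor layout, the steering outputs and the Gaussian membership used for the soft blend are all hypothetical.

```python
import math


def nearest_rule(rules, x):
    """Crisp lookup: the rule whose site is closest to x owns x (its Voronoi cell)."""
    return min(rules, key=lambda rule: math.dist(rule[0], x))


def fuzzy_output(rules, x, sharpness=10.0):
    """Soft blend of rule outputs; the Gaussian membership is an assumed form."""
    weights = [math.exp(-sharpness * math.dist(site, x) ** 2) for site, _ in rules]
    total = sum(weights)
    if total == 0.0:  # all memberships underflowed: fall back to the crisp rule
        return nearest_rule(rules, x)[1]
    return sum(w * out for w, (_, out) in zip(weights, rules)) / total


# Hypothetical inputs: (left, right) proximity readings in [0, 1].
# Output: steering command in [-1, 1], where 0.0 means "go straight".
rules = [
    ((0.0, 0.0), 0.0),   # a priori rule: go forward if there is no obstacle ahead
    ((1.0, 0.0), 1.0),   # obstacle on the left: steer right
    ((0.0, 1.0), -1.0),  # obstacle on the right: steer left
]
```

The a priori rule is given only as a site and an output. Its domain of application is not fixed by hand: adding, moving or removing other sites automatically shrinks or enlarges its Voronoi cell, which is one way to picture how a search over sites can restrict or expand a rule.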
Cartography for autonomous robots
Localization and mapping is a very important topic in autonomous robotics and has long been studied in the literature. A basic definition of mapping is the incremental construction of a map of the environment so as to be able to navigate and relocalize in this very environment. In this context, the goal of the PhD thesis of Sylvain Gelly is to endow such a mapping algorithm with generalization capability so as to be able to build more compact maps as well as to provide an efficient way to detect similarities (e.g. to speed up exploration). A probabilistic representation is studied that relies on extensions of traditional Hidden Markov Models and Bayesian networks. Several papers (national and international) have been published on this topic, covering the representation used as well as the theoretical properties of the underlying concepts.
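For readers unfamiliar with the baseline, the following Python sketch shows one filtering step of a traditional Hidden Markov Model used for localization. It is the textbook forward update only, not the extension studied in the thesis, and the three-place ring world with its transition and emission probabilities is a hypothetical example.

```python
def hmm_filter_step(belief, transition, emission, observation):
    """One forward step: predict with the motion model, then weight by the observation."""
    n = len(belief)
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    unnormalised = [predicted[j] * emission[j][observation] for j in range(n)]
    total = sum(unnormalised)
    return [p / total for p in unnormalised]


# Hypothetical ring of 3 places; the robot advances with probability 0.8.
transition = [
    [0.2, 0.8, 0.0],
    [0.0, 0.2, 0.8],
    [0.8, 0.0, 0.2],
]
# Observation 0 = "corner", 1 = "corridor". Places 0 and 2 both look like
# corners, so a single reading cannot tell them apart (perceptual aliasing).
emission = [
    [0.9, 0.1],
    [0.1, 0.9],
    [0.9, 0.1],
]

belief = [1 / 3, 1 / 3, 1 / 3]
for obs in (1, 0):  # first a corridor reading, then a corner reading
    belief = hmm_filter_step(belief, transition, emission, obs)
```

After the two readings the belief concentrates on place 2: although places 0 and 2 produce the same observation, the preceding corridor reading and the motion model disambiguate them.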
Automatic design of robots and objects
Aside from control-related issues, research also focuses on the automatic design of robotic morphology. The first goal is to identify the relevant building blocks, using simple Lego-like structures to achieve constructions such as Lego robots or, as a first step, bridges and tables. A second goal is then to perform a relevant optimization in such a space to discover a fit structure (current research focuses on the use of Genetic Programming). Alexandre Devert started a PhD thesis this September and already worked on this subject during his Master's thesis. However, to keep things simple, he started by evolving static structures, leaving aside the controller issue (see section 6.4.1). But one long-term goal of his PhD is to actually come back to robotics and design both the morphology and the controller of a robot.
Remark: Several other Master's theses dealing with autonomous robotics have been carried out by students in TAO, but only those being continued as a regular activity are listed here.
See also section 5.3 for the description of the robot simulator that has been developed by Jérémie Mary within this theme.
OpenDP, a framework for stochastic dynamic programming optimization, has been developed by TAO (see section 5.5). Stochastic dynamic programming is a classical tool for discrete-time control problems, but its use for robotics, made possible by combining it with learning techniques that generalize well, remains a prospective issue. Some scalable robotics benchmarks are already included and others are under development for a systematic study of the possibilities.
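As a reminder of what stochastic dynamic programming computes on a discrete-time control problem, here is a minimal Python sketch of finite-horizon backward induction on a tiny tabular problem. It does not use or reproduce the OpenDP interface: the function names and the toy corridor problem are hypothetical, and no learning-based generalization is involved.

```python
def backward_induction(states, actions, transition, cost, horizon):
    """Finite-horizon stochastic DP: returns the cost-to-go and one policy per step."""
    value = {s: 0.0 for s in states}  # terminal cost-to-go
    policies = []
    for _ in range(horizon):
        new_value, policy = {}, {}
        for s in states:
            best_a, best_q = None, float("inf")
            for a in actions:
                # Immediate cost plus expected cost-to-go of the successor state.
                q = cost(s, a) + sum(p * value[s2] for s2, p in transition(s, a))
                if q < best_q:
                    best_a, best_q = a, q
            new_value[s], policy[s] = best_q, best_a
        value = new_value
        policies.append(policy)
    policies.reverse()  # policies[t] is the decision rule at time step t
    return value, policies


# Hypothetical toy problem: a corridor of 5 cells with the goal in the last one.
GOAL = 4
STATES = range(5)
ACTIONS = (1, -1)  # move right or left


def transition(s, a):
    """A move succeeds with probability 0.8; otherwise the robot stays put."""
    if s == GOAL:
        return [(s, 1.0)]  # the goal is absorbing
    target = min(max(s + a, 0), GOAL)
    return [(target, 0.8), (s, 0.2)]


def cost(s, a):
    """One unit of cost per step spent outside the goal."""
    return 0.0 if s == GOAL else 1.0


value, policies = backward_induction(STATES, ACTIONS, transition, cost, horizon=10)
```

Here value[s] is the expected number of steps spent outside the goal over the horizon when starting in cell s, and policies[0] moves right from every cell. The tabular loop over all states is what stops scaling for robotics-sized state spaces, hence the interest in combining dynamic programming with learned value approximations.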