Section: New Results
Reliable robot localization and scene modeling
Participants : Patrick Rives, Cyril Joly, Gabriela Gallegos, Maxime Meilland, Andrew Comport.
Simultaneous Localization and Mapping (SLAM):
In the Simultaneous Localisation And Mapping paradigm, the robot starts from an unknown location in an unknown environment and incrementally builds a navigation map of the environment, while simultaneously using this map to update its estimated position. Several issues have been addressed this year concerning 6dofs Smoothing and Mapping (SAM): an alternative, more consistent formulation of the SLAM problem, and a new approach that fuses catadioptric vision and laser scanning in order to obtain a more reliable representation of the 3D environment.
6dofs visual SAM:
Let us consider a standard 6dofs Bearing-Only (BO) SLAM configuration, i.e. a robot moving in 3D space and observing 3D landmarks with a camera. We consider visual data only (no odometry information is used). It is well known that a perspective camera provides only the bearing of a landmark, not its distance; the distance can be recovered from several observations taken from different viewpoints. As a consequence, landmark initialisation must be delayed when standard Euclidean coordinates are used in the state representation (a criterion based on parallax is generally used). However, landmarks carry bearing information even when their depth is unknown. In order to exploit this information, Civera introduced an alternative, undelayed landmark representation: the inverse depth parameterization. This parameterization is usually used with the standard EKF method, with 6 components per landmark in the filter. Unfortunately, such a parameterization is not minimal, and instabilities can occur in the estimation process.
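As a minimal sketch of the idea, the 6 components of an inverse-depth landmark are the camera position at first observation, two angles giving the observation ray, and the inverse of the depth along that ray. The angle convention below is one common choice (Civera et al. use an equivalent one); the function name is illustrative, not from the original work.

```python
import math

def inverse_depth_to_point(x0, y0, z0, theta, phi, rho):
    """Convert a 6-component inverse-depth landmark to a Euclidean 3D point.

    (x0, y0, z0): camera position when the landmark was first observed.
    theta, phi:   azimuth and elevation of the observation ray.
    rho:          inverse of the landmark depth along that ray.
    """
    # Unit direction of the observation ray (one common angle convention).
    m = (math.cos(phi) * math.sin(theta),
         -math.sin(phi),
         math.cos(phi) * math.cos(theta))
    # The Euclidean point lies at distance 1/rho along the ray.
    return tuple(c + m_i / rho for c, m_i in zip((x0, y0, z0), m))
```

Because rho appears only as a divisor, a landmark at (near-)infinite depth (rho close to 0) is still well represented as a pure bearing, which is what makes undelayed initialisation possible.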
We have shown that, in the Smoothing and Mapping (SAM) formulation, only the last 3 components of the inverse depth parameterization need to be included in the estimated state. The unestimated part is used as equilibrium points to evaluate the measurement functions and Jacobians. This can be done in a consistent way thanks to the SAM approach, which stores and estimates the whole trajectory at each step (the standard EKF does not estimate the full trajectory, so the first 3 components are estimated in the filter in an attempt to keep them consistent during the next steps).
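The payoff of the reduced parameterization is a smaller estimation state. As a hypothetical illustration of the state layout (the function and dimensions are ours, for counting only): the anchor position of each landmark is read from the already-stored trajectory instead of being duplicated in the landmark block.

```python
def state_size(n_poses, n_landmarks, reduced=True):
    """Size of the SAM estimation state.

    reduced=True : each landmark contributes only (theta, phi, rho);
                   its anchor position comes from the stored trajectory.
    reduced=False: full 6-component inverse-depth landmarks, as in the
                   standard EKF parameterization.
    """
    pose_dim = 6                   # 6-dof pose (position + orientation)
    lm_dim = 3 if reduced else 6   # reduced vs full inverse-depth landmark
    return n_poses * pose_dim + n_landmarks * lm_dim
```

For 10 poses and 50 landmarks the reduced state has 210 components instead of 360, and, more importantly, the parameterization is minimal, avoiding the instabilities mentioned above.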
We tested our new approach on data provided by an omnidirectional camera mounted on a mobile robot. The trajectory includes an inclined plane in order to validate the 6dofs algorithm on real 3D data. Results are given in fig. 2 (top left, top right and bottom left). One can see that the trajectory is very accurate and smooth; for example, we recovered the angle of the ramp with a precision of 1 deg. Finally, we compared our results with the standard EKF with inverse depth parameterization. A sample of the results is given in fig. 2 (bottom right): the result is less precise than that of the SAM algorithm. Moreover, we noticed that the 99% confidence ellipses provided by the EKF are smaller than those provided by the SAM algorithm. This observation tends to show that the SAM algorithm is more consistent than the EKF (which is too optimistic).
Multisensor SLAM merging catadioptric images and laser scans:
We propose a hybrid approach combining the advantages of a laser range finder and an omnidirectional camera to efficiently solve the indoor SLAM problem and reconstruct a 3D representation of the environment. We report results validating our methodology on a mobile robot equipped with a 2D laser range finder and an omnidirectional camera.
In order to build 2D local maps of the environment, we implemented an adaptation of Diosi and Kleeman's Polar Scan Matching (PSM) algorithm. It works directly in the laser scanner's polar coordinate system and therefore does not require a time-consuming search for corresponding points. Local maps are used both in the localization process and in the mapping of the environment. Thanks to a SLAM approach, local maps are fused into a 2D global map and the localization of the robot is refined. Figure 3 shows the robot trajectory as computed from the wheel encoders (in red) and as computed by the SLAM algorithm (in green), superimposed on the estimated global map. The sequence was obtained by manually driving the robot to explore the ground floor of a building. Note that although we did not compensate for the drift with loop closure detection, the results are quite satisfactory. The localization obtained with the SLAM approach notably improves on the dead-reckoning estimate based on odometry alone.
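The key step that lets PSM avoid a point-to-point search can be sketched as follows: the current scan is re-expressed in the frame of the reference scan under the current pose guess, and correspondence is then simply "same bearing" (after interpolation at the reference bearings). This is a simplified sketch of that projection step, not Diosi and Kleeman's full algorithm; the function name and interface are ours.

```python
import math

def project_scan(ranges, angles, dx, dy, dtheta):
    """Express a polar scan in a frame displaced by (dx, dy, dtheta).

    Returns (bearing, range) pairs of the scan points as seen from the
    displaced frame; PSM-style matching then interpolates these at the
    reference scan's bearings, so corresponding points share a bearing
    and no nearest-neighbour search over points is needed.
    """
    projected = []
    for r, a in zip(ranges, angles):
        # Scan point in Cartesian coordinates of the original frame.
        x = r * math.cos(a)
        y = r * math.sin(a)
        # Rigid transform into the displaced frame.
        xn = math.cos(dtheta) * (x - dx) + math.sin(dtheta) * (y - dy)
        yn = -math.sin(dtheta) * (x - dx) + math.cos(dtheta) * (y - dy)
        projected.append((math.atan2(yn, xn), math.hypot(xn, yn)))
    return projected
```

The pose (dx, dy, dtheta) is then refined iteratively so that the projected ranges agree with the reference ranges at each bearing.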
We then developed a new approach to merge laser and vision data into a consistent representation. Radial lines corresponding to approximately vertical features in the scene (e.g. walls, facades, doors, windows) are extracted from the omnidirectional images using a Canny edge detector combined with a Hough method. As shown in Figure 4.a, by projecting the laser scan data into the omnidirectional image we can find the laser range measurements corresponding to the radial lines and thus compute the related 3D lines in the scene (Figure 4.b).
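Since vertical 3D structures project to lines through the principal point of the omnidirectional image, the Hough accumulator reduces to a single angular dimension. The following is a minimal sketch of that voting step (the function, bin count and vote threshold are illustrative choices, not values from the original system); the edge pixels would come from a Canny detector.

```python
import math
from collections import Counter

def radial_line_angles(edge_pixels, cx, cy, n_bins=360, min_votes=20):
    """Detect radial lines by voting edge pixels into angular bins.

    Vertical scene features project to radial lines through the principal
    point (cx, cy) of the catadioptric image, so a 1-D Hough accumulator
    over the polar angle around (cx, cy) suffices to detect them.
    """
    votes = Counter()
    for (u, v) in edge_pixels:  # e.g. output of a Canny edge detector
        ang = math.atan2(v - cy, u - cx) % (2 * math.pi)
        votes[int(ang / (2 * math.pi) * n_bins) % n_bins] += 1
    # Bins with enough supporting edge pixels are declared radial lines.
    return sorted(b for b, c in votes.items() if c >= min_votes)
```

Each detected angle can then be intersected with the projected laser scan to attach a range to the vertical feature, yielding the 3D line.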
Autonomous navigation in urban environment
In order to navigate safely and autonomously in complex urban environments, the vehicle must be localized precisely and in real time. Our approach is based on a novel representation, the 3D image memory, which is built during a learning step. This memory, called the robot-centred spherical representation, contains a sequence of key spherical images geo-referenced in a GIS (Geographic Information System). Several attributes are associated with each pixel of the sphere: luminance, depth and saliency information. This representation is generic, so that different kinds of sensors or prior knowledge of the environment can be used in the building step. In order to exploit this representation in real time during the online step, we define for each sphere a binary mask which selects only the information useful for navigation. This selection is made off-line by computing the Jacobian of the spherical image projection: the pixels that make the Jacobian matrix best conditioned are used to generate the binary selection mask (Figure 5).
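The mask-generation step can be sketched as a best-first selection: each pixel gets a conditioning score derived from its row of the projection Jacobian, and only the highest-scoring pixels enter the mask. This is a hypothetical simplification of the conditioning criterion (the score, function name and budget parameter are ours for illustration).

```python
def select_pixels(scores, budget):
    """Build a binary mask keeping the `budget` best-scoring pixels.

    `scores` is a hypothetical per-pixel conditioning score, e.g. the
    norm of the photometric Jacobian row (image gradient composed with
    the geometric Jacobian of the spherical projection). Pixels with
    strong, well-oriented gradients constrain the pose best.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(order[:budget])
    return [i in keep for i in range(len(scores))]
```

Since this is computed off-line, the online tracker only ever touches the masked pixels, which is what makes real-time operation feasible.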
During the online step, the current vehicle position is estimated by a direct method which minimizes the grey-level error between the closest reference sphere and the current image. A robust estimation technique is used to reject outliers that do not fit the objective function. The current estimated error is used to control the vehicle along a desired trajectory.
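The robust rejection step is typically realised with an M-estimator that down-weights large photometric residuals (moving objects, occlusions, illumination changes) instead of letting them corrupt the pose estimate. As a sketch, assuming a Huber-type M-estimator (one common choice; the source does not specify which robust function is used):

```python
def huber_weights(residuals, k=1.345):
    """Per-residual weights of the Huber M-estimator.

    Residuals within `k` keep full weight (behaves like least squares);
    larger residuals, typically caused by outlier pixels, are given a
    weight that decays as k/|r|, so they influence the pose estimate
    only linearly rather than quadratically.
    """
    return [1.0 if abs(r) <= k else k / abs(r) for r in residuals]
```

In an iteratively re-weighted least-squares loop, these weights multiply each pixel's residual and Jacobian row at every Gauss-Newton iteration.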
During the learning phase, some parts of the world may be missing because of moving objects and sensor limitations. The last step is to update this missing information in the database using a SLAM technique. This step is computed online, so that with each new estimation the model is improved, making navigation more accurate and robust.