Team e-Motion


Section: New Results

Dynamic World Perception and Evolution Prediction

Visual Dynamic Obstacle Detection using Change of Scale

Participants : Amaury Nègre, Guillem Alenyà Ribas, Christian Laugier, Jim Crowley.

To navigate safely in a dynamic environment (with moving pedestrians, vehicles or other obstacles), a robot must detect and characterize dangerous objects. To this end, we developed a visual method that detects and tracks salient elements in a camera image and evaluates their time-to-contact to estimate collision risk.

Our detector extracts ridge segments in a Laplacian scale space. Such segments correspond to elongated, contrasted areas in the image. They are particularly interesting because they describe the structure of objects well and because the detector is invariant to affine changes. The detector can be defined as an extension of the Laplacian interest point detector, as it extracts lines along which the Laplacian value is maximal.

After extracting these ridge segments in a single image, we track them using a particle filter in order to evaluate their motion and change of scale [27]. Change of scale is known to be used by humans for navigation, as a growing pattern in the image corresponds to an approaching motion in the 3D world [90]. Here, we use this change of scale to compute the time to collision (TTC), that is, the time remaining before contact with an obstacle. The TTC computed for each tracked element gives us valuable information about how dangerous the obstacle is: a small positive TTC corresponds to a dangerous approaching obstacle.
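The relation between change of scale and TTC can be illustrated with a minimal sketch. It assumes a pinhole camera and a constant closing speed, so that the apparent scale of a feature is inversely proportional to its distance; the function names and the danger threshold are hypothetical and are not taken from the implementation described here.

```python
def time_to_contact(scale_prev, scale_curr, dt):
    """Estimate the TTC (in seconds) at the current frame from the apparent
    scale of a tracked feature in two frames separated by dt seconds.

    Assumption: scale is inversely proportional to distance and the closing
    speed is constant, which gives TTC = dt / (scale_curr / scale_prev - 1).
    """
    if scale_prev <= 0 or scale_curr <= 0 or dt <= 0:
        raise ValueError("scales and dt must be positive")
    ratio = scale_curr / scale_prev
    if ratio == 1.0:
        return float("inf")  # no change of scale: no approach detected
    # ratio > 1 (growing feature) gives a positive TTC,
    # ratio < 1 (shrinking feature) gives a negative one.
    return dt / (ratio - 1.0)


def is_dangerous(ttc, threshold=2.0):
    """Hypothetical rule: a small positive TTC marks a dangerous obstacle."""
    return 0.0 < ttc < threshold
```

For example, a feature whose scale grows by 50% in half a second yields a TTC of one second, which the hypothetical rule above flags as dangerous, whereas a shrinking feature yields a negative TTC and is ignored.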

Our current work consists of using these tracked features in a reactive navigation framework. The set of tracked objects and their associated TTC is used to define a probability distribution describing the restricted area in the robot command space (linear and angular speed). A Bayesian fusion of this distribution with the desired command distribution then defines the final command that is used to control the robot (see Figure 1).
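The fusion of a desired command distribution with a TTC-based restriction can be sketched on a discretized command grid. This is only an illustration under strong assumptions: the shapes of both distributions, the widths 0.5 and 0.3, the `ttc_safe` horizon and the encoding of an obstacle direction as a turn rate are all hypothetical choices, not the distributions used in our framework.

```python
import math


def bump(x, center, width):
    """Unnormalized Gaussian bump, equal to 1 at `center`."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def fuse_commands(speeds, turn_rates, desired, obstacles, ttc_safe=3.0):
    """Pick the command maximizing P(desired) * P(allowed) on a grid.

    desired:   (speed, turn_rate) requested by a higher-level planner.
    obstacles: list of (direction, ttc) per tracked feature; `direction`
               is expressed, by assumption, as the turn rate that would
               steer the robot straight at the feature.
    """
    v_max = max(speeds)
    best, best_score = None, -1.0
    for v in speeds:
        for w in turn_rates:
            p_desired = bump(v, desired[0], 0.5) * bump(w, desired[1], 0.3)
            p_allowed = 1.0
            for direction, ttc in obstacles:
                if ttc <= 0.0:
                    continue  # non-positive TTC: the feature is not approaching
                danger = max(0.0, 1.0 - ttc / ttc_safe)
                heading_at_it = bump(w, direction, 0.3)
                speed_ratio = v / v_max if v_max > 0.0 else 0.0
                # Fast commands aimed at a short-TTC feature are restricted.
                p_allowed *= 1.0 - danger * heading_at_it * speed_ratio
            score = p_desired * p_allowed
            if score > best_score:
                best, best_score = (v, w), score
    return best
```

With no obstacle, the desired command is returned unchanged when it lies on the grid. With a short-TTC feature straight ahead, the product of the two distributions shifts the maximum towards a slower or a turning command.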

Figure 1. Object tracking for Time-To-Collision estimation and vehicle navigation with obstacle avoidance using position and TTC of tracked ridge segments.
IMG/capture0063IMG/res_4
(a) An example of ridge segment tracking and TTC estimation in the camera image (red discs indicate a short TTC). (b) Vehicle trajectory with obstacle avoidance.

To achieve video-rate performance, so that our method can be used online, all algorithms have been implemented on the graphics processor (GPU), which is well suited to parallel tasks such as image processing (scale-space filtering, segment extraction) and particle filtering (each particle can be processed in parallel).

To experiment with the visual navigation task, we also designed an artificial landmark that is very robust and easy to track. This landmark has been used for docking with an autonomous underwater vehicle [10].

This work was done in collaboration with Guillem Alenya of the IRI laboratory (UPC, Barcelona).

Data-Driven Markov Chain Monte Carlo for Moving Object Tracking

Participants : Trung-Dung Vu, Olivier Aycard.

In recent years, many research works have used laser scanners to detect and track moving objects. Existing methods usually treat detection and tracking as two independent procedures, which is what we did in 2007.

However, detection at a single time instant usually results in ambiguities, with missing detections and false alarms that make data association more difficult. In 2008, we proposed to solve detection and tracking as a single process, which allows object detection to make use of temporal information and facilitates robust tracking. In addition, due to occlusions or laser-absorbing surfaces, an object can be divided into several segments. This makes object detection and tracking much harder, since object merging and track grouping must be handled. We introduce a model-based approach and discuss how using object models to interpret the laser measurements can overcome these problems.

Figure 2. Example of a model-based interpretation of moving objects from laser data. (a) Four consecutive scans: the current scan is in blue and past scans are in green; (b) one possible solution including seven tracks of four cars and one bus represented by red boxes and one pedestrian represented by red dots, which are superimposed on the range data; (c) reference view of the situation.
IMG/example_solution

We formulate the detection and tracking problem as finding the most likely trajectories of moving objects given laser measurements over a sliding window of time (Fig. 2). An object trajectory (track) is regarded as a sequence of shapes of a predefined model, produced in spatio-temporal space by an object that satisfies the constraints of the measurement model and of motion smoothness from frame to frame. In this way, our approach can be seen as a batch method that searches for the globally optimal solution, taking into account all past measurements. Due to the high computational complexity of such a scheme, we employ a Data-Driven Markov chain Monte Carlo (DDMCMC) method to explore the solution space.

The detection results from our previous work [109] [32] are used to drive the DDMCMC search efficiently. These results are evidence of motion detected using the occupancy grid constructed around the vehicle. Starting from these identified dynamic segments, we fit suitable object models to each segment and thereby generate all hypotheses that possibly correspond to moving objects. These rough hypotheses provide initial proposals for the DDMCMC process, which performs a finer search over the spatio-temporal space to find the most likely trajectories of moving objects.
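The idea of starting a Markov chain from a data-driven hypothesis and refining it by sampling can be illustrated with a deliberately small toy problem. The sketch below is not our DDMCMC algorithm: it fits a single one-dimensional box of known length to laser points with a plain random-walk Metropolis sampler, and the scoring function, step size and all names are hypothetical.

```python
import random


def log_posterior(center, points, length=4.0, sigma=0.2):
    """Log-score of a 1D box model of known length centered at `center`.

    Points inside the box cost nothing; points outside are penalized by
    their squared distance to the box (a Gaussian measurement model).
    """
    lo, hi = center - length / 2.0, center + length / 2.0
    total = 0.0
    for p in points:
        d = max(lo - p, 0.0, p - hi)
        total -= d * d / (2.0 * sigma ** 2)
    return total


def refine_hypothesis(points, initial_center, iterations=2000, step=0.3, seed=0):
    """Random-walk Metropolis search started from a data-driven hypothesis.

    Returns the best-scoring center visited by the chain.
    """
    rng = random.Random(seed)
    current = initial_center
    current_lp = log_posterior(current, points)
    best, best_lp = current, current_lp
    for _ in range(iterations):
        candidate = current + rng.gauss(0.0, step)
        candidate_lp = log_posterior(candidate, points)
        # Metropolis rule for a symmetric proposal: always accept an
        # improvement, accept a worse candidate with probability exp(delta).
        delta = candidate_lp - current_lp
        if delta >= 0.0 or rng.random() < pow(2.718281828459045, delta):
            current, current_lp = candidate, candidate_lp
            if current_lp > best_lp:
                best, best_lp = current, current_lp
    return best


# One object split into two segments by an occlusion (hypothetical data).
segment_a = [3.0, 3.5, 4.0]
segment_b = [6.0, 6.5, 7.0]
rough_center = sum(segment_a) / len(segment_a)  # hypothesis from one segment
refined_center = refine_hypothesis(segment_a + segment_b, rough_center)
```

In this toy case, the rough hypothesis built from one segment covers only part of the object, and the sampler moves the box until it explains both segments with a single model, which mirrors how an object model can absorb segments split by occlusion.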

The proposed approach has been tested on the Navlab datasets [110]. We implemented the described algorithm as an online process within a sliding window of 10 frames. In our initial evaluations, DDMCMC detection and tracking outperforms the MHT-based detection and tracking of our previous work [109] [32], with higher detection rates and fewer false alarms. In addition, with the use of object models, objects segmented by laser discontinuities are no longer a problem and tracking results are more accurate. Furthermore, moving objects are naturally classified. The average computation time for the whole detection and tracking process is about 60 ms on a 3.0 GHz P4 PC with unoptimized code, so it can fulfill the real-time requirement (Fig. 3). These results have been submitted to ICRA09 [31].

A comparative evaluation of our proposed algorithm against other detection and tracking approaches using MHT is being carried out. We also intend to integrate a road detection procedure in order to provide prior information on moving objects, which will certainly improve the effectiveness of the detection and tracking process. Optimization of the code to reduce the computation time is ongoing.

Figure 3. Moving object detection and tracking in action.
IMG/result

Fast Clustering and Tracking of Moving Objects in Dynamic Environments

Participants : Yong Mao, Kamel Mekhnacha, David Raulo, Christian Laugier.

Reliable perception of the surrounding physical world is a major requirement of driving assistance systems and autonomous mobile robots. The dynamic environment needs to be perceived and modeled from sensor measurements, which can be noisy. The main requirement for such a system is a robust target tracking algorithm. Most existing target tracking algorithms [103] use an object-based representation of the environment. However, these techniques have to deal explicitly with data association and occlusion problems, which are the main obstacles to good performance. In view of these problems, we presented a grid-based framework, the Bayesian occupancy filter (BOF) [105] [62], in our previous work.

In the BOF framework, concepts such as objects or tracks do not exist. Instead, the environment is decomposed into a grid-based representation. Thanks to this grid decomposition, the complicated data association and occlusion problems do not arise. Another advantage of the BOF is that multi-sensor fusion can be achieved easily: the uncertainties of the individual sensors are specified in the sensor models and are fused into the BOF grid naturally, on solid mathematical ground.
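The per-cell fusion of several sensors can be sketched as follows. This is a simplified illustration that handles static occupancy only (the BOF also estimates velocities), assumes that sensor readings are conditionally independent given the cell state, and uses hypothetical function names.

```python
def fuse_cell(prior, sensor_probs):
    """Fuse independent sensor readings for one grid cell with Bayes' rule.

    prior:        P(occupied) before the readings, strictly between 0 and 1.
    sensor_probs: for each sensor, P(occupied | that sensor's reading alone),
                  computed under a uniform prior and strictly between 0 and 1.
    """
    odds = prior / (1.0 - prior)
    for p in sensor_probs:
        odds *= p / (1.0 - p)  # each reading multiplies the odds
    return odds / (1.0 + odds)


def fuse_grids(prior_grid, sensor_grids):
    """Apply fuse_cell independently to every cell of a 2D grid."""
    rows, cols = len(prior_grid), len(prior_grid[0])
    return [
        [
            fuse_cell(prior_grid[r][c], [grid[r][c] for grid in sensor_grids])
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

Two sensors that each report 0.7 reinforce each other (about 0.84 with a uniform prior), while a 0.7 and a 0.3 cancel out. No association between readings and objects is needed, since each cell is updated on its own.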

Despite the aforementioned advantages, many applications demand an explicit object-level representation. In our former work, we suggested a hierarchical structure in which an object detection and tracking algorithm based on the joint probabilistic data association filter (JPDAF) [103] was implemented above the BOF layer. However, the computational complexity of this algorithm increases exponentially with the number of objects detected and tracked, which prevented it from being applied in cluttered environments. To overcome this, in 2008 we proposed a novel object detection and tracking algorithm for the BOF framework [25], [24]. This algorithm takes the occupancy/velocity grid of the BOF as input and extracts objects from the grid with a clustering module that takes the prediction of the tracking module as feedback. The clustering module thus avoids searching the entire grid, which keeps the computational cost low. A re-clustering and merging module is employed to deal with ambiguous data associations. The extracted objects are then tracked and managed in a probabilistic way. The computational cost of this approach is linear in the number of dynamic objects detected, which makes it suitable for cluttered environments.
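The use of tracker predictions as feedback for clustering can be sketched with a region-growing procedure seeded at the predicted cells. This is a hypothetical simplification: it ignores velocities, the detection of new objects and the re-clustering and merging step, and none of the names come from our implementation.

```python
from collections import deque


def grow_cluster(occupied, seed, claimed):
    """Collect the 8-connected occupied cells reachable from `seed`.

    occupied: set of (row, col) cells considered occupied.
    claimed:  set of cells already assigned to a cluster (updated in place).
    """
    if seed not in occupied or seed in claimed:
        return set()
    cluster, queue = {seed}, deque([seed])
    claimed.add(seed)
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                cell = (r + dr, c + dc)
                if cell in occupied and cell not in claimed:
                    claimed.add(cell)
                    cluster.add(cell)
                    queue.append(cell)
    return cluster


def cluster_from_predictions(occupied, predicted_cells):
    """Extract one cluster per track, starting at its predicted cell.

    predicted_cells: {track_id: (row, col)} as predicted by the tracker.
    Only cells connected to a prediction are visited, not the whole grid.
    """
    claimed = set()
    return {
        track_id: grow_cluster(occupied, seed, claimed)
        for track_id, seed in predicted_cells.items()
    }
```

Because the search starts from each predicted position and only visits connected occupied cells, the work grows with the number and size of the tracked objects rather than with the size of the grid.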

Figure 4. The distance error of the target to the GPS data.
IMG/error

We applied our method to a real-world experimental dataset collected with the Cycab platform and also to real-world traffic datasets provided by our industrial partners (Denso in particular). Taking the GPS data as ground truth, we compared our algorithm with the JPDAF-based algorithm. Figure 4 shows that our method achieved comparable accuracy. Meanwhile, an experiment on simulated data shows that the computational cost of our method is 0.0003 seconds per frame and increases linearly with the number of targets in the scene. By comparison, the NNJPDA algorithm consumes 0.075 seconds per frame with an average of 11 targets and 18 clusters, and 5 seconds per frame with an average of 22 targets and 28 clusters, so our method is far more efficient. Another example of the proposed method is shown in Figure 5. These figures illustrate that the BOF-based fast clustering and tracking algorithm can detect and track moving objects consistently and robustly even when targets are occluded [25].

Figure 5. Top: two persons walk towards each other in a direction perpendicular to the Cycab. Middle: an occlusion takes place. Bottom: the occlusion lasts for several frames and then ends.
IMG/camera1IMG/bof1IMG/tracker1
IMG/camera2IMG/bof2IMG/tracker2
IMG/camera3IMG/bof3IMG/tracker3

Object Extraction using Self Organizing Networks and Statistical Background Detection

Participants : Thiago Bellardi, Alejandro Dizan Vasquez Govea, Agostino Martinelli, Christian Laugier.

In many computer vision related applications it is necessary to distinguish between the background of an image and the objects that are contained in it. This is a difficult problem because of the double constraint on the available time and the computational cost of robust object extraction algorithms.

For the specific problem of finding moving objects with static cameras, the traditional segmentation approach is to separate pixels into two classes: background and foreground. This is called Background Subtraction [74] and constitutes an active research domain (see [96]). Once the pixels are classified, the next processing step consists of merging foreground pixels into bigger groups corresponding to candidate objects. This process is known as object extraction.

In previous work, we introduced a novel clustering approach for object extraction based on Self Organizing Networks (SON). We have applied this algorithm to images [107] and occupancy grids [108], and shown that it is able to produce good results in real time. Later improvements to the object extraction algorithm made it possible to use a continuous foreground representation instead of a binary one. The complexity of the algorithm is linear in the number of pixels in the image, which preserves its ability to process images at frame rate.

Figure 6. Approach overview. Enlarged views showing the different steps of our algorithm using CAVIAR data.
IMG/example2IMG/example1IMG/example4IMG/example5
(a) Initial SON (b) Original image (c) Difference image and resulting grid (d) Extracted objects

From 2007 to 2008 we started to focus on the background/foreground classifier, to explore the new capability of the extraction algorithm. We implemented a statistical classifier [12] that produces, for each pixel in a frame, a continuous output between 0 and 1 corresponding to our belief that the pixel belongs to the foreground. Compared with the previous results, detection became more robust to noise, and we observed fewer false negative and false positive detections.
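A continuous foreground output of this kind can be illustrated with a per-pixel running Gaussian background model. This is a generic textbook-style sketch and is not the classifier of [12]: the model, the learning rate, the variance floor and the mapping from distance to belief are all assumptions made for the example.

```python
import math


class PixelBackgroundModel:
    """Running Gaussian model of the background intensity of one pixel."""

    def __init__(self, mean, variance=25.0, learning_rate=0.05, min_variance=1.0):
        self.mean = float(mean)
        self.variance = float(variance)
        self.learning_rate = learning_rate
        self.min_variance = min_variance  # floor that avoids division by zero

    def foreground_belief(self, value):
        """Return a value in [0, 1): 0 at the background mean, close to 1
        for intensities far from it (hypothetical mapping)."""
        d2 = (value - self.mean) ** 2 / self.variance
        return 1.0 - math.exp(-0.5 * d2)

    def update(self, value):
        """Blend the new observation into the background statistics."""
        a = self.learning_rate
        diff = value - self.mean
        self.mean += a * diff
        self.variance = max(self.min_variance,
                            (1.0 - a) * self.variance + a * diff * diff)


def foreground_image(models, frame):
    """Continuous foreground map for a frame given one model per pixel."""
    return [
        [models[r][c].foreground_belief(frame[r][c]) for c in range(len(frame[0]))]
        for r in range(len(frame))
    ]
```

The resulting grayscale map can be passed directly to an extraction step that accepts continuous input, instead of being thresholded into a binary mask first.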

Figure 7. Object detection using the statistical background/foreground classifier. The grayscale foreground is passed as is to the object detector.
IMG/stat_fg_img_crop1IMG/stat_fg_input_crop1IMG/stat_fg_detection_crop1
(a) Original image (b) Grayscale foreground (c) Detection results

Future work includes continuing our experimental work, in particular by improving the background/foreground classifier. Other possibilities include taking temporal information into account by updating the state of the SON when a new input image is available, instead of working frame by frame. Finally, we would like to explore the use of our SON to perform data fusion on a multi-camera system.

This work is carried out within the BACS project.

Moving Objects' Future Motion Prediction

Participants : Alejandro Dizan Vasquez Govea, Thiago Bellardi, Agostino Martinelli, Thierry Fraichard, Olivier Aycard, Christian Laugier.

To navigate or plan motions for a robotic system placed in an environment with moving objects, reasoning about the future behaviour of the moving objects is required. In most cases, this future behaviour is unknown and one has to resort to predictions.

Most prediction techniques found in the literature are limited to short-term prediction (a few seconds at best), which is not satisfactory, especially from a motion planning point of view.

We first explored the problem of medium-term motion prediction for moving objects. As a result, we proposed a novel cluster-based technique that learns typical motion patterns using pairwise clustering and uses those patterns to predict future motion. We have since developed a new learn-and-predict approach which addresses issues that were not solved by our first proposal: (a) prediction of unobserved patterns and (b) online/adaptive learning. The new approach represents motion using a proposed extension to the well-known Hidden Markov Model framework, which we have named Growing Hidden Markov Models (GHMM). Basically, the extension allows incremental, adaptive, real-time learning of the parameters and structure of the model. By incorporating final positions (the goals of the objects) in the GHMM state, several motion patterns can be represented with a single GHMM.
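The prediction step that such a model relies on can be sketched with a plain Hidden Markov Model whose structure is fixed in advance. The sketch leaves out everything that is specific to GHMMs (incremental learning of states and transitions, goal-augmented states), and the three-state corridor used as an example is hypothetical.

```python
def predict(belief, transition, steps):
    """Propagate a state belief `steps` time steps ahead through an HMM.

    belief[i]:        P(state = i) at the current time.
    transition[i][j]: P(next state = j | current state = i).
    """
    n = len(belief)
    for _ in range(steps):
        belief = [
            sum(belief[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return belief


# Hypothetical corridor with three cells; the last cell is the goal.
corridor = [
    [0.2, 0.8, 0.0],  # from cell 0: stay or advance
    [0.0, 0.2, 0.8],  # from cell 1: stay or advance
    [0.0, 0.0, 1.0],  # goal cell is absorbing
]
two_steps_ahead = predict([1.0, 0.0, 0.0], corridor, 2)
```

Starting with certainty in the first cell, the belief two steps ahead is spread over the corridor (0.04, 0.32 and 0.64 for the three cells), which is the kind of distribution a motion planner can consume.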

During 2008, we performed further experimental validation of GHMMs by comparing them against two state-of-the-art techniques [71], [56]. In our experiments, our approach exhibited considerably better performance than the other two in terms of prediction accuracy and model parsimony. This work is the subject of an invited journal paper which is now under review.

This work is carried out within the BACS project.

Interpretation and Prediction in Dynamic Environments

Participants : Christopher Tay Meng Keat, Christian Laugier, Chiara Fulgenzi.

Models of dynamic objects over a longer term (usually several seconds) are especially useful in environments where the movements of objects exhibit certain structures or motion patterns. In these cases, it is possible to construct motion exemplars that characterize the motion patterns, which are then used to perform longer-term motion prediction. This is done by constructing a model of each motion exemplar and learning the parameters of the models from data collected in a given scene. The ability to perform motion prediction on a longer time scale is beneficial for robotic applications such as target tracking or collision avoidance.

Most existing methods in the literature require a discretization of the state space. To overcome this, in 2007 we proposed a novel motion model based on Gaussian Processes (GP) [104]. A GP is a generalization of the Gaussian distribution to function spaces, in which a Gaussian distribution is placed over functions. Not only does using GPs avoid discretization issues, it also provides a theoretically sound Bayesian framework in which one can obtain probability distributions of motion in space, as well as a proper way of performing motion prediction in a fully probabilistic manner.

We model the scene using a mixture of GPs in which each GP represents a motion pattern. The mixture of GPs then corresponds to the different motion patterns in a given scene. An example is shown in Figure 8, where the shaded bars represent the covariances of the motion patterns and the red lines represent their means.

Another advantage of using GPs to represent exemplar paths is the representation of uncertainty when performing motion prediction. The uncertainty in motion prediction can be calculated simply by marginalizing the Gaussian distribution representing the path, given a set of observations. An illustration of the prediction is given in Figure 9, which shows the various GP components corresponding to the predicted paths starting from a certain point. Part of this work is done within the scope of our long-term collaboration with Toyota.
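How a GP yields both a predicted position and its uncertainty can be shown with a minimal one-dimensional regression sketch. It assumes a zero-mean prior, a squared-exponential kernel with unit variance and length scale, and a position that is a scalar function of time; these choices and all names are illustrative and do not describe the motion model of [104].

```python
import math


def rbf(a, b, variance=1.0, length=1.0):
    """Squared-exponential covariance between two inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def gp_predict(xs, ys, x_new, noise=1e-6):
    """Posterior mean and variance at x_new given observations (xs, ys)."""
    n = len(xs)
    gram = [
        [rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    k_star = [rbf(x, x_new) for x in xs]
    alpha = solve(gram, ys)        # gram^-1 * ys
    weights = solve(gram, k_star)  # gram^-1 * k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    variance = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, weights))
    return mean, max(variance, 0.0)
```

Close to the observations the predicted variance is small, and far from them it returns to the prior variance, which is the behaviour that the covariance bars in the figures depict.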

Figure 8. Representations of GP paths. Red lines represent the GP means. Covariances are represented by the green bars.
IMG/cluster-multi-theta
Figure 9. Predicted path means and covariances.
IMG/predict-result2

As GPs are able to represent motion patterns in a probabilistic manner, they are a natural candidate for applications such as probabilistic motion planning. Such motion planners search for a risk-averse motion which avoids potential collisions with dynamic objects in the environment. The GPs give the probability of collision for each trajectory considered by the motion planner. Experimental results were obtained in 2008 in joint work [19] (see section 6.3.4), where a Rapidly-exploring Random Tree was used to perform risk-averse motion planning in a dynamic environment, using the GP predictions to measure the risk of a given trajectory.
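The way Gaussian predictions can be turned into a risk value for a candidate trajectory can be sketched as follows. The example is one-dimensional, treats time steps as independent and uses a fixed collision radius; these simplifications and the function names are assumptions made for illustration and do not reproduce the risk measure used in [19].

```python
import math


def collision_probability_1d(robot_pos, obstacle_mean, obstacle_std, radius):
    """P(|obstacle - robot| < radius) for a Gaussian-distributed obstacle."""

    def cdf(x):
        z = (x - obstacle_mean) / (obstacle_std * math.sqrt(2.0))
        return 0.5 * (1.0 + math.erf(z))

    return cdf(robot_pos + radius) - cdf(robot_pos - radius)


def trajectory_risk(robot_path, predictions, radius=0.5):
    """Probability of at least one collision along a candidate path.

    robot_path:  robot position at each time step.
    predictions: (mean, std) of the obstacle position at the same steps.
    Time steps are treated as independent, which is a simplification.
    """
    p_safe = 1.0
    for pos, (mean, std) in zip(robot_path, predictions):
        p_safe *= 1.0 - collision_probability_1d(pos, mean, std, radius)
    return 1.0 - p_safe
```

A planner can then compare candidate trajectories by this value and prefer, for example, a path that stays several standard deviations away from the predicted obstacle position over one that crosses its mean.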

