
Section: Scientific Foundations

Local Image Description

In most contexts where images are to be compared, a direct comparison is impossible: images are compressed in different formats, most formats are lossy, and images can be resized, cropped, and so on. The solution consists in computing descriptors from the images and turning image comparison into descriptor comparison. This works provided that, on the one hand, the descriptors carry some information about the image content and, on the other hand, they do not depend on the image format, on its size, or on the transformations the image can undergo.

The most classical method associates a single global descriptor with each image, e.g. a color histogram or correlogram, or a texture descriptor. Such descriptors are easy to compute and use, but they usually fail to handle cropping and cannot be used for object recognition. A second method consists in extracting regions from the image and associating a descriptor with each of these regions. Most of the time, this is done by extracting points (called interest points) with a Harris-like detector [54], and by considering a circular or elliptical region around each of these points.
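The weakness of a global descriptor under cropping can be illustrated with a small sketch. It is not taken from any particular system: the function names, the toy 4x4 image, the coarse 2-bins-per-channel quantization and the L1 distance are all assumptions chosen for illustration.

```python
def color_histogram(image, bins_per_channel=2):
    """Normalized joint RGB histogram.

    `image` is a list of rows of (r, g, b) tuples with values in 0..255.
    """
    hist = [0.0] * (bins_per_channel ** 3)
    n_pixels = 0
    for row in image:
        for r, g, b in row:
            # Map each channel value to a bin index in 0..bins_per_channel-1.
            rb = r * bins_per_channel // 256
            gb = g * bins_per_channel // 256
            bb = b * bins_per_channel // 256
            hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
            n_pixels += 1
    # Dividing by the pixel count makes the descriptor independent of image size.
    return [h / n_pixels for h in hist]


def l1_distance(h1, h2):
    """Sum of absolute differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


RED, BLUE = (255, 0, 0), (0, 0, 255)
# Toy 4x4 image: left half red, right half blue.
image = [[RED, RED, BLUE, BLUE] for _ in range(4)]
# Resized copy: every pixel duplicated twice horizontally and vertically.
resized = [[p for p in row for _ in range(2)] for row in image for _ in range(2)]
# Cropped copy: only the left half is kept.
cropped = [row[:2] for row in image]

d_resized = l1_distance(color_histogram(image), color_histogram(resized))
d_cropped = l1_distance(color_histogram(image), color_histogram(cropped))
print(d_resized, d_cropped)
```

In this toy case the resized copy has the same normalized histogram as the original, whereas the cropped copy, which has lost all blue pixels, ends up far from it. A local descriptor computed around a point that survives the crop would not be affected in the same way.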

Differential invariants were among the first local descriptors used. They were established by Florack [51], and their use for image comparison was proposed by Schmid [66]. Each descriptor is a combination of the first derivatives of the signal at the interest point. These descriptors proved experimentally to be very robust to geometric and photometric transforms. An even more powerful descriptor was then proposed by Lowe: the SIFT descriptor [57]. It is composed of 16 local histograms of gradient directions around the interest point.
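The idea of describing a region by 16 local histograms of gradient directions can be sketched as follows. This is a simplified illustration and not the SIFT descriptor of [57]: it assumes a 4x4 grid of cells with 8 direction bins each, and it omits scale selection, orientation normalization, Gaussian weighting and bin interpolation. The function name and the test patch are hypothetical.

```python
import math


def sift_like_descriptor(patch, grid=4, n_bins=8):
    """Concatenate grid x grid histograms of gradient directions.

    `patch` is a square list of rows of gray levels whose side is a
    multiple of `grid`. Returns a unit-norm vector of grid*grid*n_bins values.
    """
    size = len(patch)
    cell = size // grid
    hists = [[0.0] * n_bins for _ in range(grid * grid)]
    for y in range(size):
        for x in range(size):
            # Central differences, clamped at the patch border.
            dx = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
            dy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(dx, dy)
            if magnitude == 0.0:
                continue
            # Gradient direction in [0, 2*pi), quantized into n_bins bins.
            angle = math.atan2(dy, dx) % (2.0 * math.pi)
            b = int(angle / (2.0 * math.pi) * n_bins) % n_bins
            # Each pixel votes in the histogram of its cell, weighted by magnitude.
            hists[(y // cell) * grid + (x // cell)][b] += magnitude
    vector = [v for h in hists for v in h]
    norm = math.sqrt(sum(v * v for v in vector))
    # Normalizing removes the effect of a global change of contrast.
    return [v / norm for v in vector] if norm > 0.0 else vector


# Hypothetical 16x16 patch: a horizontal intensity ramp.
ramp = [[10 * x for x in range(16)] for _ in range(16)]
# The same patch with doubled contrast.
brighter = [[2 * v for v in row] for row in ramp]

descriptor = sift_like_descriptor(ramp)
descriptor_brighter = sift_like_descriptor(brighter)
print(len(descriptor))
```

Because the vector is normalized, the ramp and its higher-contrast copy yield the same descriptor up to rounding, which gives a small-scale picture of robustness to one kind of photometric transform.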

Local descriptors can be used in many applications: image comparison for object recognition, image copy detection, detection of repeats in television streams, and so on. While they are very reliable, local descriptors are not without problems. First, because many descriptors can be computed for a single image, a collection of one million images can generate a database of one billion descriptors. That is why specific indexing techniques are required. Second, up to now, most descriptors have been computed from decompressed images, whereas most images are stored in compressed formats. It would thus be interesting to compute the descriptors directly in the compressed domain. Finally, the evaluation of local descriptors on very large image collections (several million images) is still an open and interesting problem.
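The orders of magnitude behind the need for indexing can be worked out in a few lines. The figure of about one thousand descriptors per image is the one implied by the numbers above; the descriptor dimension (128) and the storage cost (one byte per component) are assumptions made only for this back-of-the-envelope estimate.

```python
n_images = 1_000_000
descriptors_per_image = 1_000  # implied by one million images -> one billion descriptors
dims = 128                     # assumption: 16 histograms x 8 direction bins
bytes_per_dim = 1              # assumption: one byte per component

n_descriptors = n_images * descriptors_per_image
raw_size_gb = n_descriptors * dims * bytes_per_dim / 1e9

# Exhaustive search: every descriptor of one query image is compared
# with every descriptor stored in the database.
comparisons_per_query_image = descriptors_per_image * n_descriptors

print(n_descriptors, raw_size_gb, comparisons_per_query_image)
```

Under these assumptions, the raw descriptors alone occupy on the order of a hundred gigabytes, and an exhaustive search for a single query image requires about 10^12 descriptor comparisons, which is why sequential scanning is not a realistic option at this scale.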

