Section: Scientific Foundations
3D object and scene modeling, analysis, and retrieval
This part of our research focuses on geometric models of specific 3D objects at the local (differential) and global levels, physical and statistical models of materials and illumination patterns, and modeling and retrieval of objects and scenes in large image collections. Our past work in these areas includes research aimed at recognizing rigid 3D objects in cluttered photographs taken from arbitrary viewpoints (Rothganger et al., 2006), segmenting video sequences into parts corresponding to rigid scene components before recognizing these in new video clips (Rothganger et al., 2007), and retrieval of particular objects and buildings from images and videos (Sivic and Zisserman, 2003; Philbin et al., 2007). Our current research focuses on acquisition of detailed object models from multiple images and video streams, theoretical analysis of camera models, and object/scene retrieval.
High-fidelity image-based object and scene modeling.
We have recently developed multi-view stereopsis algorithms that have proven remarkably effective at recovering intricate details and thin features of compact objects and capturing the overall structure of large-scale, cluttered scenes. Some of the corresponding software (PMVS, http://grail.cs.washington.edu/software/pmvs/) is freely available for academic use, and licensing negotiations with several companies are under way. Our current work extends this approach in two directions: the first is theoretical, with a general framework for modeling central and non-central cameras using the formalism and terminology of classical projective geometry (Section 6.1.1), while the second is more applied, using our multi-view stereo approach to model archaeological sites (Section 6.1.2).
Video-based modeling of deformable surfaces.
Another focus of our research is markerless motion capture from multiple video streams. Our previous work was targeted at the acquisition of accurate models of the shape and motion of deformable surfaces that bend but do not stretch (for example, many types of cloth). Our current work is aimed at extending this approach to surfaces that may stretch or shrink, such as human skin (Section 6.1.3). The targeted application is performance capture in the film industry.
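To make the distinction between bending and stretching concrete, the following minimal Python sketch measures how much the edge lengths of a small triangle mesh change under a deformation. It is an illustration only, not the capture method described here: the function names (`max_edge_strain`, `fold_about_y_axis`), the toy mesh, and the use of relative edge-length change as a strain measure are all assumptions made for this example. A fold along a crease leaves every edge length unchanged, whereas a 10% stretch does not.

```python
import math


def edge_lengths(vertices, edges):
    """Euclidean length of every edge (i, j) of the mesh."""
    return [math.dist(vertices[i], vertices[j]) for i, j in edges]


def max_edge_strain(rest, deformed, edges):
    """Largest relative change in edge length between two mesh poses."""
    rest_len = edge_lengths(rest, edges)
    new_len = edge_lengths(deformed, edges)
    return max(abs(n - r) / r for r, n in zip(rest_len, new_len))


def fold_about_y_axis(vertices, angle):
    """Rotate the half of the mesh with x > 0 about the y axis (crease at x = 0)."""
    c, s = math.cos(angle), math.sin(angle)
    folded = []
    for x, y, z in vertices:
        if x > 0:
            folded.append((c * x + s * z, y, -s * x + c * z))
        else:
            folded.append((x, y, z))
    return folded


# Hypothetical toy mesh: a flat 3 x 2 grid of vertices, two cells, with diagonals.
rest = [(x, y, 0.0) for x in (-1.0, 0.0, 1.0) for y in (0.0, 1.0)]
edges = [(0, 1), (2, 3), (4, 5), (0, 2), (1, 3), (2, 4), (3, 5), (0, 3), (2, 5)]

folded = fold_about_y_axis(rest, math.radians(60))      # bends, does not stretch
stretched = [(1.1 * x, y, z) for x, y, z in rest]       # stretches by 10% along x

print("fold strain:   ", max_edge_strain(rest, folded, edges))
print("stretch strain:", max_edge_strain(rest, stretched, edges))
```

In this toy setting the fold yields a strain that is zero up to rounding error, while the stretch yields a strain of about 0.1 along the x direction. A surface such as skin would fall in the second category.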
Retrieval and modeling of objects and scenes in large image collections.
The goal of this research is to develop techniques for visual search and recognition of objects and scenes in large image collections. A further goal is to investigate novel applications of large-scale recognition in other domains, such as image processing (e.g., image enhancement and restoration), computer graphics (novel scene synthesis, visualization), 3D reconstruction, and visual localization.
We have introduced a geometric Latent Dirichlet Allocation (gLDA) model for unsupervised modeling of unstructured image collections and developed an approach for avoiding confusing features (such as trees or road markings) in the context of large-scale place recognition in structured, geo-referenced image databases. In terms of applications, we have (i) developed a method for image inpainting using strong priors in the form of multiple other images of the same scene and (ii) investigated scene category recognition techniques for synthesizing novel scenes and navigating large collections of still images.
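As a rough illustration of what visual search in an image collection can look like, the sketch below ranks database images against a query using a generic bag-of-visual-words representation with tf-idf weighting and an inverted index. This is a standard textbook formulation and not the gLDA model or the place recognition approach mentioned above. It assumes that each image has already been reduced to a list of integer "visual word" ids (quantized local descriptors); the function names `build_index` and `search` and the toy database are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_index(database):
    """Build tf-idf weights and an inverted index.

    database: dict mapping image id -> list of visual word ids (assumed given).
    """
    n_images = len(database)
    doc_freq = Counter()
    for words in database.values():
        doc_freq.update(set(words))
    # Words that occur in every image get idf = 0 and carry no weight.
    idf = {w: math.log(n_images / df) for w, df in doc_freq.items()}

    inverted = defaultdict(dict)  # word id -> {image id: tf-idf weight}
    norms = {}                    # image id -> L2 norm of its tf-idf vector
    for image_id, words in database.items():
        counts = Counter(words)
        weights = {w: (c / len(words)) * idf[w] for w, c in counts.items()}
        norms[image_id] = math.sqrt(sum(v * v for v in weights.values()))
        for w, v in weights.items():
            if v > 0:
                inverted[w][image_id] = v
    return idf, inverted, norms


def search(query_words, idf, inverted, norms):
    """Return (image id, cosine similarity) pairs, best match first."""
    counts = Counter(w for w in query_words if w in idf)
    total = sum(counts.values())
    if total == 0:
        return []
    scores = defaultdict(float)
    q_norm_sq = 0.0
    for w, c in counts.items():
        q = (c / total) * idf[w]
        q_norm_sq += q * q
        # Only images that contain word w are touched, via the inverted index.
        for image_id, v in inverted.get(w, {}).items():
            scores[image_id] += q * v
    if q_norm_sq == 0.0:
        return []
    q_norm = math.sqrt(q_norm_sq)
    ranked = [(i, s / (q_norm * norms[i])) for i, s in scores.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical toy collection: word 7 occurs in every image.
database = {
    "facade": [1, 1, 2, 7],
    "bridge": [3, 4, 4, 7],
    "park": [5, 5, 6, 7],
}
idf, inverted, norms = build_index(database)
results = search([1, 2, 7, 7], idf, inverted, norms)
print(results)
```

In this toy collection the query matches only the "facade" image: visual word 7 appears in every image, so its idf weight is zero and it contributes nothing to any score. The inverted index is what keeps this kind of search tractable for large collections, since a query only visits images that share at least one informative visual word with it.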