Section: Overall Objectives
The goal of the TEMICS project-team is the design and development of algorithms and practical solutions in the areas of analysis, modelling, coding, communication and watermarking of images and video signals. The TEMICS project-team's activities are structured and organized around the following research directions:
3D modelling and representations of multi-view video sequences.
The emergence of new video formats allowing panoramic viewing, free viewpoint video (FTV) and three-dimensional TV (3DTV) on immersive displays is creating new scientific and technological problems in the area of video content modelling and representation. Omni-directional video, free viewpoint video and stereoscopic or multi-view video are the formats envisaged for interactive TV and 3DTV. Omni-directional video refers to a 360-degree view from a single viewpoint, or a spherical video. The notion of "free viewpoint video" refers to the possibility for the user to choose an arbitrary viewpoint and/or view direction within a visual scene, creating an immersive environment. A multi-view video together with depth information allows, using view synthesis techniques, the generation of virtual views of the scene from any viewpoint. This property can be used in a large diversity of applications, including 3DTV, FTV, and security monitoring and tracking. This type of 3D content representation is also known as MVD (Multi-View plus Depth). The TEMICS project-team focuses on several algorithmic problems raised by the analysis, representation, compression and rendering of multi-view video content. The team first addresses the problem of depth information extraction. The depth information is associated with each view as a depth map and transmitted in order to perform virtual view generation and to allow inter-operability between capture (with N cameras) and display (of P views) devices. The huge amount of data contained in multi-view sequences motivates the design of efficient representation and compression algorithms.
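The depth-based view synthesis mentioned above can be sketched in a few lines. The following is a minimal illustration, not the team's actual implementation: it assumes rectified cameras, so that each depth value converts to a purely horizontal disparity d = focal * baseline / depth, and all names and parameters are illustrative.

```python
import numpy as np

def synthesize_view(ref_image, depth, focal, baseline):
    """Warp a reference view to a virtual viewpoint using a per-pixel depth map.

    Toy sketch for rectified cameras: moving the virtual camera along the
    baseline shifts each pixel horizontally by its disparity
    d = focal * baseline / depth (depth is assumed strictly positive).
    """
    h, w = ref_image.shape
    virtual = np.zeros_like(ref_image)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            d = int(round(focal * baseline / depth[y, x]))
            xv = x - d  # target column in the virtual view
            if 0 <= xv < w:
                virtual[y, xv] = ref_image[y, x]
                filled[y, xv] = True
    # Pixels never written are disocclusions: areas visible from the virtual
    # viewpoint but hidden in the reference view, to be inpainted afterwards.
    return virtual, filled
```

The `filled` mask makes the disocclusion problem explicit: a real renderer would blend several reference views and inpaint the remaining holes.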
Sparse representations, compression, feature extraction and texture description.
Low-rate as well as scalable compression remains a widely sought capability. Scalable video compression is essential for optimal adaptation of compressed video streams to varying network characteristics (e.g. bandwidth variations) as well as to heterogeneous terminal capabilities, and wavelet-based signal representations are well suited to it. Special effort is thus dedicated to the study of motion-compensated spatio-temporal expansions making use of complete or overcomplete transforms (e.g. wavelets, curvelets and contourlets) and, more generally, of sparse signal approximation and representation techniques. The sparsity of the signal representation depends on how well the bases match the local signal characteristics. Anisotropic waveform bases, based on directional transforms or on sets of bases optimized in a sparsity-distortion sense, are studied. Methods for texture analysis and synthesis, for prediction and for inpainting, which are key components of image and video compression algorithms, based on sparse signal representations are also developed. The amenability of these representations to image texture description is also investigated, and measures of distance between sparse vectors are designed for approximate nearest-neighbour search and for image retrieval. Beyond sparse image and video signal representations, the quantization of the resulting representations, taking into account perceptual models and measures in order to optimize a trade-off between rate and perceptual quality, is studied.
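The sparse approximation techniques discussed above are often greedy in practice. As a minimal sketch in the spirit of the matching-pursuit family (the dictionary and all names are illustrative, and atoms are assumed unit-norm):

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse approximation: at each step, pick the dictionary atom
    (column, assumed unit-norm) most correlated with the current residual,
    and subtract its contribution."""
    residual = signal.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        correlations = dictionary.T @ residual
        k = np.argmax(np.abs(correlations))  # best-matching atom
        coeffs[k] += correlations[k]
        residual -= correlations[k] * dictionary[:, k]
    return coeffs, residual
```

How sparse `coeffs` can be for a given residual energy depends on how well the dictionary matches the signal, which is precisely why anisotropic, direction-adapted bases are of interest.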
Joint source-channel coding. The advent of the Internet and of wireless communications, often characterized by narrow-band, error- and/or loss-prone, heterogeneous and time-varying channels, is creating challenging problems in the area of source and channel coding. Design principles prevailing so far, stemming from Shannon's source and channel separation theorem, must be reconsidered: the separation theorem holds only under asymptotic conditions, where both codes are allowed infinite length and complexity. If the design of the system is heavily constrained in terms of complexity or delay, source and channel coders designed in isolation can be largely suboptimal. The project objective is to develop theoretical and practical solutions for image and video transmission over heterogeneous, time-varying wired and wireless networks. Many of the theoretical challenges are related to understanding the trade-offs between rate-distortion performance, delay and complexity in the code design. The issues addressed encompass the design of error-resilient source codes, joint source-channel codes and multiple description codes minimizing the impact of channel noise (packet losses, bit errors) on the quality of the reconstructed signal, as well as turbo and iterative decoding techniques.
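A toy illustration of why source and channel coding benefit from joint design: protecting the perceptually important bits more strongly (unequal error protection) reduces distortion. This sketch is purely illustrative, it uses a 3x repetition code for clarity and ignores the extra rate spent on the parity bits; none of it is the project's actual scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def transmit_bits(bits, p, repeat=1):
    """Send bits over a binary symmetric channel with crossover probability p.
    With repeat > 1, each bit is repeated and majority-decoded (a minimal
    repetition code)."""
    reps = np.repeat(bits, repeat).reshape(-1, repeat)
    flips = rng.random(reps.shape) < p
    received = reps ^ flips
    return (received.sum(axis=1) * 2 > repeat).astype(np.uint8)

def mse_after_channel(samples, p, protected_msbs=0):
    """Quantize uint8 samples to 8 bit-planes, optionally protect the top
    bit-planes with a 3x repetition code, and measure reconstruction MSE."""
    bits = np.unpackbits(samples[:, None], axis=1)  # MSB first
    out = bits.copy()
    for i in range(8):
        repeat = 3 if i < protected_msbs else 1
        out[:, i] = transmit_bits(bits[:, i], p, repeat)
    recon = np.packbits(out, axis=1).ravel()
    return float(np.mean((samples.astype(float) - recon.astype(float)) ** 2))
```

With a 5% crossover probability, shielding the four most significant bit-planes cuts the mean squared error sharply, because an MSB flip costs (128)^2 in squared error while an LSB flip costs only 1.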
Distributed source and joint source-channel coding. Current compression systems exploit correlation on the sender side, via the encoder, e.g. making use of motion-compensated prediction or filtering techniques. This results in asymmetric systems with a higher-complexity encoder and a lower-complexity decoder, suitable for applications such as digital TV or retrieval from servers with, e.g., mobile devices. However, there are numerous applications, such as multi-sensor and multi-camera vision systems or surveillance systems, with light-weight and low-power-consumption requirements, that would benefit from the dual model, in which correlated signals are coded separately and decoded jointly. This model, at the origin of distributed source coding, finds its foundations in the Slepian-Wolf and Wyner-Ziv theorems. Even though the first theoretical foundations date back to the early 1970s, it is only recently that concrete solutions have been introduced. In this context, the TEMICS project-team is working on the design of distributed prediction and coding strategies based on both source and channel codes. Although the problem is posed as a communication problem, classical channel decoders need to be modified. Distributed joint source-channel coding refers to the problem of sending correlated sources over a common noisy channel without communication between the senders. This problem occurs mostly in networks where communication between the nodes is not possible or not desired because of its high energy cost (networked video cameras, sensor networks...). For independent channels, source-channel separation holds, but for interfering channels joint (yet still distributed) source-channel schemes perform better than separated ones. In this area, we work on the design of such distributed source-channel schemes.
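The channel-code view of distributed source coding can be made concrete with a tiny syndrome-based sketch built on the (7,4) Hamming code: the encoder compresses a 7-bit block x to its 3-bit syndrome without ever seeing the side information y, and the decoder recovers x from y. The correlation model (x and y differ in at most one bit) is an assumption of this toy example, not a claim about real sources.

```python
import numpy as np

# Parity-check matrix of the (7,4) Hamming code; column j is the binary
# representation of j+1 (row 0 = least significant bit), so the syndrome of
# a single-bit error directly encodes its position.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

def sw_encode(x):
    """Slepian-Wolf style encoder: transmit only the 3-bit syndrome of x,
    compressing 7 bits to 3 with no access to the side information."""
    return H @ x % 2

def sw_decode(syndrome, y):
    """Decoder: y is side information assumed to differ from x in at most
    one bit; the syndrome difference H(x XOR y) locates that bit."""
    diff = (syndrome + H @ y) % 2
    pos = int(diff[0] + 2 * diff[1] + 4 * diff[2])  # 0 means x == y
    x_hat = y.copy()
    if pos:
        x_hat[pos - 1] ^= 1
    return x_hat
```

This is exactly the sense in which "classical channel decoders need to be modified": the Hamming decoder corrects the virtual noise between the two correlated sources rather than physical channel noise.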
Data hiding and watermarking.
The distribution and availability of digital multimedia documents in open environments, such as the Internet, have raised challenging issues regarding ownership, users' rights and piracy. With digital technologies, the copying and redistribution of digital data have become trivial and fast, whereas the tracing of illegal distribution is difficult. Consequently, content providers are increasingly reluctant to offer their multimedia content without a minimum level of protection against piracy. The problem of data hiding has thus gained considerable attention in recent years as a potential solution for a wide range of applications encompassing copyright protection, authentication, steganography, and the tracing of illegal usage of the content; this latter application is referred to as fingerprinting. Depending on the application (copyright protection, traitor tracing or fingerprinting, hidden communication), the embedded signal may need to be robust or fragile, and more or less imperceptible. One may need only to detect the presence of a mark (watermark detection) or to extract a message; the message may be unique for a given content or different for the different users of the content, etc. These different applications place various constraints in terms of capacity, robustness and security on the data hiding and watermarking algorithms. The robust watermarking problem can be formalized as a communication problem: the aim is to embed a given amount of information in a host signal, under a fixed distortion constraint between the original and the watermarked signal, while at the same time allowing reliable recovery of the embedded information subject to a fixed attack distortion. Applications such as copy protection, copyright enforcement and steganography also require a security analysis of the privacy of this communication channel hidden in the host signal.
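The communication-problem view of watermarking can be illustrated with a minimal additive spread-spectrum sketch: the embedding distortion is controlled by a strength parameter, and detection is a blind correlation test. The Gaussian pattern, the names and the threshold are illustrative choices, not the team's scheme.

```python
import numpy as np

def embed(host, key, strength=1.0):
    """Additive spread-spectrum embedding: add a key-seeded pseudo-random
    pattern to the host signal. `strength` sets the embedding distortion,
    i.e. the imperceptibility vs robustness trade-off."""
    w = np.random.default_rng(key).standard_normal(host.shape)
    return host + strength * w

def detect(signal, key, threshold=0.5):
    """Blind correlation detector: regenerate the pattern from the key and
    compare the normalized correlation against a threshold. Returns the
    decision and the correlation score."""
    w = np.random.default_rng(key).standard_normal(signal.shape)
    score = float(signal @ w) / len(w)  # ~ strength if marked, ~ 0 otherwise
    return score > threshold, score
```

Only the holder of the key can regenerate the pattern and run the test, which is where the security analysis of this hidden channel comes in: an attacker observing many marked signals may still try to estimate the pattern.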
Given the strong impact of standardization in the sector of networked multimedia, TEMICS, in partnership with industrial companies, seeks to promote its results in standardization bodies (JPEG, MPEG). While aiming at generic approaches, some of the solutions developed are applied to practical problems in partnership with industry (Thomson, France Télécom) or in the framework of national projects (RIAM ESTIVALE, ANR ESSOR, ANR ICOS-HD, ANR MEDIEVALS, ANR PERSEE, DGE/Region FUTURIMAGES) and European projects (IST-NEWCOM++). The application domains addressed by the project are networked multimedia applications (over wired or wireless Internet), with their various requirements in terms of compression, resilience to channel noise, and advanced functionalities such as navigation, protection and authentication.