Section: New Results
Keywords : mobile telecommunication, WCDMA, image indexing, intrusion detection in hardware, biomedical, speech processing.
Study of applications
Applications stemming from third-generation radio-communication systems are good candidates for the study of hardware systems that mix programmable parts executing software code with specialized modules dedicated to accelerating the time-consuming parts of applications.
Data filtering, cryptography, traffic filtering in high-speed networks, and speech processing are also under consideration.
Radio-communication systems
Mobile communication systems prototyping (3G, MIMO)
Participants : François Charot, Olivier Berder, Michel Guitton, Daniel Menard, Patrice Quinton, Taofik Saïdi, Pascal Scalart, Olivier Sentieys, Charles Wagner.
Our experiments rely on the SignalMaster prototyping platform from Lyrtech Inc. (http://www.lyrtech.com/DSP-development/dsp_fpga/signalmaster.php), which allows applications described in Simulink to be executed on a special-purpose board that includes a DSP processor and an FPGA chip. Several implementations of the WCDMA emitter-receiver have been realized. This research is a preliminary step in the study of fast estimation techniques for SoC design and is described in full detail in the Ph.D. thesis of Madeleine Nyamsi (2005). Arithmetic aspects have been studied for the WCDMA receiver implementation. The aim is to find the fixed-point specification that maintains the performance of the application while minimizing the operator word-length. This specification is obtained with our approach for floating-point to fixed-point conversion. The actual performance is measured through fixed-point simulations.
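The word-length minimization described above can be illustrated with a small simulation-based search. This is only a minimal sketch under stated assumptions, not the conversion method used in this work: the 3-tap FIR filter, the signed format with word_length - 1 fractional bits, the SQNR criterion, the 60 dB target, and all function names are hypothetical choices made for the example.

```python
import math
import random

def quantize(x, word_length):
    """Round x to signed fixed point with 1 sign bit and word_length - 1
    fractional bits (values in [-1, 1)), saturating on overflow."""
    scale = 1 << (word_length - 1)
    code = round(x * scale)
    code = max(-scale, min(scale - 1, code))
    return code / scale

def fir(samples, coeffs, q=lambda v: v):
    """Direct-form FIR filter; q quantizes coefficients, inputs and outputs."""
    qc = [q(c) for c in coeffs]
    qs = [q(s) for s in samples]
    out = []
    for n in range(len(qs)):
        acc = sum(qc[k] * qs[n - k] for k in range(len(qc)) if n - k >= 0)
        out.append(q(acc))
    return out

def sqnr_db(reference, approx):
    """Signal-to-quantization-noise ratio in dB."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approx))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)

def min_word_length(samples, coeffs, target_db, max_bits=24):
    """Smallest word length whose simulated SQNR reaches target_db, or None."""
    reference = fir(samples, coeffs)  # floating-point reference run
    for wl in range(2, max_bits + 1):
        fixed = fir(samples, coeffs, lambda v: quantize(v, wl))
        if sqnr_db(reference, fixed) >= target_db:
            return wl
    return None

# Hypothetical test signal and filter (not taken from the WCDMA receiver).
random.seed(1)
samples = [random.uniform(-0.9, 0.9) for _ in range(2000)]
print(min_word_length(samples, [0.25, 0.5, 0.25], target_db=60.0))
```

A single word length is searched here for the whole data path. A realistic conversion would assign a separate format to each operator and use an application-level metric (for example the bit error rate of the receiver) instead of the SQNR.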
In wireless communications, using more than one antenna at both the transmitter and the receiver improves the spectral efficiency of data transmission. The high complexity of MIMO (Multiple Input Multiple Output) and multi-antenna algorithms calls for dedicated real-time, high-performance architectures. A flexible real-time MIMO prototype has been designed to operate under the WCDMA (Wideband Code Division Multiple Access) third-generation cellular standard. It can be used for uplink (HSUPA) and downlink (HSDPA) communications. The circuit is characterized by a scalable and flexible parallel-pipeline architecture. The system is implemented on the SignalMaster rapid prototyping platforms from Lyrtech Inc. for real-time measurements. This work is done in collaboration with Lyrtech Inc. and with the LRTS laboratory of Laval University in Québec, Canada.
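As a reminder of why several antennas improve spectral efficiency, the standard textbook capacity of a flat-fading MIMO channel can be written as follows. The symbols are introduced here for illustration and do not appear in the original text: N_t and N_r are the numbers of transmit and receive antennas, H is the N_r x N_t channel matrix, and rho is the average signal-to-noise ratio per receive antenna. The expression assumes equal power on all transmit antennas and a channel known only at the receiver.

```latex
C = \log_2 \det\left( \mathbf{I}_{N_r} + \frac{\rho}{N_t}\, \mathbf{H}\mathbf{H}^{H} \right) \quad \text{bit/s/Hz}
```

At high signal-to-noise ratio and for a full-rank channel, C grows approximately as \min(N_t, N_r) \log_2 \rho, compared with \log_2 \rho for a single-antenna link.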
Parallel reconfigurable architectures for LDPC decoding
Participants : François Charot, Christophe Wolinski.
LDPC codes are a class of error-correcting codes introduced by Gallager [42], together with an iterative probability-based decoding algorithm. Their performance, combined with their relatively simple decoding algorithm, makes these codes very attractive for the next generations of satellite and radio digital transmission systems. LDPC codes were chosen for the DVB-S2, 802.11n, 802.16e and 802.3an standards.
The decoding of LDPC codes is an iterative process. For the 802.16e standard, about 3,000 messages are processed and reordered in each of the 30 iterations. The number of messages is much higher for DVB-S2 (of the order of 300,000 messages). These huge data processing and storage requirements are a real challenge for the hardware realization of the decoder, which has to meet a specified throughput (30 Mbit/s for 802.16e and 255 Mbit/s for base station applications in the case of DVB-S2).
One major problem is the huge design space, composed of many interrelated parameters, which forces drastic design trade-offs. Another important issue is the need for flexibility in the hardware solutions. The aim of our research, which is carried out in collaboration with the R-Interface company, is the definition of a generic architecture template adapted to the 802.16e standard. The definition of the architecture relies on optimization tools able to schedule the computations and the memory accesses of the algorithm. In order to validate the architecture template, a performance estimation model of the proposed generic architecture, written in SystemC, is being developed.
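The iterative message exchange mentioned above can be sketched with a min-sum decoder, a common simplification of probability-based decoding. This is an illustrative sketch only: the tiny 4 x 6 parity-check matrix is a toy example (not an 802.16e or DVB-S2 matrix), the flooding schedule and the function name are assumptions, and nothing here reflects the architecture under study.

```python
def min_sum_decode(H, llr, max_iters=30):
    """Min-sum decoding with a flooding schedule.

    H is a list of parity-check rows (0/1), each row with at least two ones.
    llr holds channel log-likelihood ratios (positive means bit 0).
    Returns (bits, converged).
    """
    m, n = len(H), len(H[0])
    checks = [[j for j in range(n) if H[i][j]] for i in range(m)]
    # One variable-to-check and one check-to-variable message per 1 in H.
    v2c = {(i, j): llr[j] for i in range(m) for j in checks[i]}
    c2v = {}
    bits = [1 if value < 0 else 0 for value in llr]
    for _ in range(max_iters):
        # Check-node update: sign product and minimum magnitude of the others.
        for i in range(m):
            for j in checks[i]:
                others = [v2c[(i, k)] for k in checks[i] if k != j]
                sign = -1 if sum(x < 0 for x in others) % 2 else 1
                c2v[(i, j)] = sign * min(abs(x) for x in others)
        # Variable-node update: channel value plus all incoming messages.
        total = list(llr)
        for (i, j), r in c2v.items():
            total[j] += r
        for (i, j), r in c2v.items():
            v2c[(i, j)] = total[j] - r  # exclude the message from check i
        bits = [1 if t < 0 else 0 for t in total]
        # Stop as soon as every parity check is satisfied.
        if all(sum(bits[j] for j in checks[i]) % 2 == 0 for i in range(m)):
            return bits, True
    return bits, False

# Toy code: each bit takes part in exactly two of the four checks.
H = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]
# Codeword 110100 received with its last bit flipped (hard-decision LLRs).
received_llr = [-1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(min_sum_decode(H, received_llr))
```

Each iteration touches two messages per nonzero entry of H, which gives a feel for why the message count, and therefore the memory bandwidth, dominates the hardware cost for large matrices.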
Image and multimedia processing
Participants : François Charot, Charles Wagner, Christophe Wolinski.
Image and video processing have been significant application drivers for the reconfigurable computing community since the early 1990s. Reconfigurable computers have been used most widely, and successfully, for accelerating low-level image processing algorithms such as local neighborhood functions. These functions, also called sliding window functions or spatial filters, are used extensively in image processing and computer vision. A local neighborhood function is applied at a particular pixel location and its output depends on a finite spatial neighborhood. The function is applied independently at all pixel locations and is typically the same across all pixel locations.
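A local neighborhood function can be sketched in a few lines of software. The example below is only an illustration of the concept, assuming a square window and replicated image borders (both choices, like the function names, are made for the example and are not taken from the architecture described in this section).

```python
from statistics import median

def apply_window(image, fn, radius=1):
    """Apply fn to the (2*radius+1)^2 neighborhood of every pixel.

    image is a list of rows; pixels outside the image are replaced by the
    nearest border pixel (an assumption made for this sketch).
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = fn(window)  # same function at every pixel location
    return out

def weighted_sum(kernel):
    """Build a window function computing a weighted sum with a flat kernel
    (equal to a convolution when the kernel is symmetric)."""
    return lambda window: sum(k * p for k, p in zip(kernel, window))

# 5x5 constant image with one impulse in the middle.
image = [[10] * 5 for _ in range(5)]
image[2][2] = 255

denoised = apply_window(image, median)  # median filter removes the impulse
dilated = apply_window(image, max)      # maximum spreads it over 3x3
laplacian = apply_window(image, weighted_sum([0, -1, 0, -1, 4, -1, 0, -1, 0]))
print(denoised[2][2], dilated[1][1], laplacian[2][2])
```

Because every output pixel depends only on its own window, all pixel locations can be processed independently, which is what makes these functions a natural fit for parallel hardware.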
Our research concerns the study of a parametric run-time reconfigurable architecture model for local neighborhood image processing. The proposed architecture model is an example of a polymorphous fabric [11] that consists of simple, interconnected cells, each with an optional local memory. In general, the fabric may contain different groups of homogeneous cells. Each cell's data-path may have its own controller or, alternatively, a group of identical cells may share a single controller. A controller can thus drive one or more cells to perform computations, to read and write local memory, to synchronize with the processor, and to communicate with other cells. Cells require only local connections and there is very little overhead in terms of resource utilization. In some cases, it is possible to use 96% of the FPGA hardware resources. The fabric is parametric (size of the fabric, number of levels, size of the image, buses and data-paths, etc.) and is composed of optimized generic cells that have two important properties. The first is that they contain the sliding window modules that operate over a partial image stored in local memory. Access to the data inside the sliding window is transparent to the user: the user always addresses pixels relative to the sliding window. The second property is that different functions such as convolution, minimum, median, etc. can be translated into software micro-code for the cell's programmable controller if a multi-functional data-path is assembled, in which case run-time reconfiguration is possible. The developer can select the number and type of functions and connect them to build a more complex algorithm.
Experimental results [30], [31] show that, for a satellite image feature extraction application, the architecture, implemented on Stratix II and Virtex 2 Field Programmable Gate Arrays, achieves performance, hardware resource utilization, and throughput similar to those of a fully pipelined systolic array architecture, while offering improved flexibility to the developer.
Content-based image retrieval hardware acceleration
Participants : Steven Derrien, Auguste Noumsi, Patrice Quinton, Laurent Amsaleg [Texmex].
Content-Based Image Retrieval (CBIR) is a technique for retrieving, from a database, the images that are at least partly similar to a given reference image. CBIR is drawing increasing interest because of its potential application to problems such as image copyright enforcement. Indeed, the widespread use of the Internet has resulted in a huge increase in the multimedia content, especially images, available on the Web. Checking copyright is therefore a concern for image owners, who must be able to identify improper use of their images. This identification process relies on precise and fast image comparison algorithms: since the Internet is a rapidly changing medium, such algorithms need to be run on a daily basis.
Although accurate search techniques based on local image descriptors exist, they suffer from very long execution times (retrieving an image from a database of 30,000 images requires about 1,500 seconds on a standard workstation). To make these techniques attractive, we have been working on the acceleration of CBIR through dedicated hardware architectures, the target machines being the RDISK cluster [27] and the ReMIX machine [25].
Among other results, we have extended the results obtained in 2005 by showing that the encoding of the image descriptors (initially in single-precision floating point) can be reduced, through a non-linear transformation, to a 3-bit encoding while preserving search accuracy. Although this analysis was initially carried out in the context of a hardware implementation, the result is also of interest to the image processing community, since it allows the size of descriptor databases to be reduced by a factor of 10.
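The storage gain can be checked with a small sketch. The non-linear transformation actually used is not described in this section, so the example below substitutes a hypothetical one: thresholds placed at equally spaced quantiles of a training set. The 24-component descriptor, the training data and all function names are likewise assumptions made for illustration.

```python
import struct
from bisect import bisect_right

def fit_thresholds(values, bits=3):
    """Pick 2**bits - 1 thresholds at equally spaced quantiles of values
    (a hypothetical non-linear mapping, not the one used in this work)."""
    ordered = sorted(values)
    levels = 1 << bits
    return [ordered[(k * len(ordered)) // levels] for k in range(1, levels)]

def encode(value, thresholds):
    """Map a floating-point component to a code in 0 .. len(thresholds)."""
    return bisect_right(thresholds, value)

def pack(codes, bits=3):
    """Pack small integer codes into bytes, first code in the top bits."""
    acc = 0
    for code in codes:
        acc = (acc << bits) | code
    return acc.to_bytes((len(codes) * bits + 7) // 8, "big")

def unpack(data, count, bits=3):
    """Inverse of pack for a known number of codes."""
    acc = int.from_bytes(data, "big")
    mask = (1 << bits) - 1
    return [(acc >> (bits * (count - 1 - i))) & mask for i in range(count)]

# Skewed synthetic training values, so the thresholds are unevenly spaced.
training = [(i / 1000) ** 3 for i in range(1000)]
thresholds = fit_thresholds(training)

descriptor = [training[(7 * i) % 1000] for i in range(24)]  # hypothetical size
codes = [encode(v, thresholds) for v in descriptor]
packed = pack(codes)
float_bytes = len(struct.pack("24f", *descriptor))  # single precision
print(float_bytes, len(packed), float_bytes / len(packed))
```

Going from 32 bits to 3 bits per component gives a ratio of 32/3, slightly above 10, which is consistent with the reduction factor quoted above.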
Intrusion detection system in hardware
Participants : Georges Adouko, François Charot, Christophe Wolinski.
To our knowledge, the dynamic nature of security systems, in which anti-intrusion mechanisms (filtering at the packet, connection, and application levels) evolve according to modes and levels of protection, is a challenge out of reach of classical technologies based on general-purpose or network processors. Security requirements in high-speed networks (from 10 to 40 Gbit/s) require the filtering rules to be implemented in appropriate hardware structures. The system must be able to handle a large variety of complex processing tasks while guaranteeing quality of service. Today, only dedicated solutions can overcome the bottleneck related to implementation complexity, at the price of an obvious lack of flexibility and no capacity to evolve.
The aim of our research (Fastnet PRIR Project) is the design of specialized hardware systems for high-speed filtering of network traffic. Although the work mainly concerns the study of efficient and predictable filtering techniques and their implementation on FPGA programmable components, our approach rests on a system view of the intrusion detection system and envisions specialized systems combining software and hardware modules. Different approaches to the pattern search problem used for traffic filtering have been carefully studied and compared in terms of the amount of hardware resources, the expressiveness of regular expressions, the throughput, and the number of patterns [20].
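To make the pattern search problem concrete, the sketch below implements the classical Aho-Corasick automaton, which finds all occurrences of a fixed set of byte patterns in a single pass over the payload. It is given purely as a software illustration of multi-pattern matching; it is not claimed to be one of the approaches compared in [20], and the sample patterns and payload are invented for the example.

```python
from collections import deque

def build_automaton(patterns):
    """Build goto, failure and output tables for a set of byte patterns."""
    goto, fail, out = [{}], [0], [[]]
    for pattern in patterns:
        state = 0
        for symbol in pattern:
            if symbol not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][symbol] = len(goto) - 1
            state = goto[state][symbol]
        out[state].append(pattern)
    # Breadth-first traversal so that shallower states are finished first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for symbol, target in goto[state].items():
            queue.append(target)
            f = fail[state]
            while f and symbol not in goto[f]:
                f = fail[f]
            fail[target] = goto[f].get(symbol, 0)
            out[target] = out[target] + out[fail[target]]
    return goto, fail, out

def search(data, automaton):
    """Return (start_offset, pattern) for every match in data."""
    goto, fail, out = automaton
    state, hits = 0, []
    for pos, symbol in enumerate(data):
        while state and symbol not in goto[state]:
            state = fail[state]
        state = goto[state].get(symbol, 0)
        for pattern in out[state]:
            hits.append((pos - len(pattern) + 1, pattern))
    return hits

# Invented signatures and payload, for illustration only.
automaton = build_automaton([b"he", b"she", b"his", b"hers"])
print(search(b"ushers", automaton))
```

The automaton consumes exactly one input symbol per step regardless of the number of patterns, a property that matters when the throughput must be predictable. The table sizes grow with the pattern set, which is the kind of resource versus pattern-count trade-off mentioned above.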
Accelerating Statistical Test for Real-Time Estimation of Randomness
Participants : Renaud Santoro, Olivier Sentieys.
Many applications need high-quality random numbers. In cryptography, most secure systems rely on the unpredictability of a single key, and therefore on the quality of the original random seed. Statistical theory is used to assess the randomness of a stream of numbers. Statistical tests have been developed in the literature, and test batteries such as Diehard, NIST and FIPS 140-2 are now recognized as references. If a random number generator (RNG) passes a number of statistical tests for randomness, it is considered to be random with some degree of confidence. However, this procedure is slow and the statistical tests are applied to only a few bytes. In the case of true random number generators, random sources can vary over time, and the statistical tests must be able to check the RNG quality continuously in real time. This year, we have investigated the hardware acceleration of some statistical tests in order to enhance their efficiency and to detect RNG failures. We have measured the performance and the cost of FPGA and VLSI hardware implementations of the FIPS 140-2 test suite and of autocorrelation and entropy tests. Results show that the system can be used at high rates to obtain higher-quality random sources.
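Two of the simplest tests of this kind, a monobit test and a long-run test over a block of 20,000 bits, can be sketched as follows. The acceptance bounds used as defaults (between 9,725 and 10,275 ones, and no run of 26 or more identical bits) are the values commonly quoted for FIPS 140-2; they are stated here as assumptions and should be checked against the standard and its change notices. The function names are illustrative.

```python
BLOCK_BITS = 20000

def monobit_test(bits, low=9725, high=10275):
    """Pass if the number of ones lies strictly between low and high."""
    if len(bits) != BLOCK_BITS:
        raise ValueError("expected a block of 20000 bits")
    ones = sum(bits)
    return low < ones < high

def long_run_test(bits, max_run=25):
    """Pass if no run of identical bits is longer than max_run."""
    if len(bits) != BLOCK_BITS:
        raise ValueError("expected a block of 20000 bits")
    longest = current = 0
    previous = None
    for bit in bits:
        current = current + 1 if bit == previous else 1
        previous = bit
        longest = max(longest, current)
    return longest <= max_run

# Degenerate blocks that a working source should never produce.
alternating = [i % 2 for i in range(BLOCK_BITS)]
two_halves = [1] * 10000 + [0] * 10000
stuck_at_zero = [0] * BLOCK_BITS

for name, block in [("alternating", alternating),
                    ("two_halves", two_halves),
                    ("stuck_at_zero", stuck_at_zero)]:
    print(name, monobit_test(block), long_run_test(block))
```

Both tests need only a few counters and comparators updated once per bit, which is why tests of this kind lend themselves to continuous on-line checking in hardware. Note that the alternating block passes both tests although it is clearly not random, which illustrates why several complementary tests are combined.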
Intelligent transport system (ITS)
Participants : Olivier Berder, Daniel Ménard, Olivier Sentieys, Tuan-Duc Nguyen.
Transportation systems play a critical role in virtually all facets of modern life, and significant challenges remain to further improve the efficiency and safety of current systems. The Brittany Region Council and the Côtes d'Armor Department Council are currently investing in this research area and recently created a Scientific Interest Group on Intelligent Transportation Systems (ITS), headquartered at ENSSAT, Lannion. Our research team actively participates in this new activity, and especially in projects concerning the deployment of new energy-efficient architectures for ITS.
R2D2 is the leader of the regional research program CAPTIV, which aims at proposing new low-cost and energy-efficient mobile communication solutions to make road traffic easier and safer. With "intelligent" road signs and vehicles, i.e. ones equipped with an autonomous radio communication system, drivers will be able to receive at any time various information about traffic flow or road sign identification. In order to reduce the deployment cost and increase the lifetime of the whole system, Multi-Input Multi-Output (MIMO) signal processing techniques are used. Such techniques make it possible to dramatically increase the capacity of mobile communication systems or the quality of transmission, thanks to the well-known space-time codes. From another point of view, MIMO systems make it possible to significantly reduce the energy consumed by communications in ad-hoc networks. If each crossroads is considered as a communication node, cooperation between road signs enables energy-efficient communications between crossroads. Supported by the Scientific Interest Group GIS ITS-Bretagne and by industrial leaders in the ITS domain, and bringing together major research laboratories in the region, CAPTIV is a strongly application-oriented program. A first prototype of such a communicating crossroads will be presented at the Route du Futur in Saint-Brieuc (a portion of road devoted to ITS experiments).