Section: New Results
Keywords : reconfigurable architecture, computation grain, low-power consumption, Network-on-Chip, NoC, CDMA, sensor network, multiple-valued logic, MVL, System-on-Chip, SoC.
New architectures and technologies
New organization of reconfigurable structures
Memory hierarchy in specialized SoC
Participants : Daniel Chillet, Olivier Sentieys, Erwan Grace.
Our research aims at defining a global memory hierarchy model suited to SoC and a methodology which allows the designer to explore the design space. The main objective consists in limiting the energy consumption of the circuit.
SoC architectures already offer large on-chip memories, with several memory banks and several levels of memory hierarchy. In these systems, the main problem is to explore the memory organization in relation to the needs of the application, and in particular to control the energy consumption of this part of the circuit. Several issues can be addressed in this context, such as caches, scratch-pad memories, and multi-bank memories. We focus our research on design methodologies for optimal memory hierarchies. A first model has been defined for dedicated SoC and for large reconfigurable architectures such as FPGA circuits. Another way to solve this problem is to extend the reconfigurable paradigm to the memory part. A recent ITRS report (http://www.itrs.net/Links/2005ITRS/Home2005.htm) indicates that static power consumption will grow significantly as transistor size shrinks. In order to limit it, we propose to dynamically alter the voltage of some areas of the memory. We have demonstrated that this new concept, referred to as virtual memory hierarchy, could result in a significant reduction in consumption. We are currently investigating and evaluating different low-power solutions for our future implementation. This work is done in collaboration with the Ecole Nationale Polytechnique d'Alger (Ph.D. of L. Abdelouel) and with the CEA.
ReMiX: Reconfigurable Memory for Indexing Huge Amount of Data
Participants : Gilles Georges [Symbiose], Steven Derrien.
Indexing is a well-known technique that accelerates searches within large volumes of data, such as those needed by applications in genomics and in content-based image or text retrieval.
The ReMiX project proposes the design of a dedicated and very large index memory (several hundred gigabytes, distributed among a cluster of nodes), big enough to store huge indexes entirely and to avoid the use of any disk.
In addition, the index memory uses reconfigurable hardware resources to tailor – at the hardware level – the memory management for optimizing the support of the specific properties of the indexing schemes. It also offers the opportunity to implement parallel algorithms.
A hardware platform based on Flash memory technology has been developed by the R2D2 team. It comprises four RMeM nodes, each in the form of a standard PCI board. Each board integrates a high-end Xilinx Virtex-II FPGA coupled to 64 GBytes of Flash memory. The prototype is fully functional and several applications have already been ported to the architecture.
The strength of this approach is that the system combines the benefits of hard-drive storage (non-volatility, density) with those of memory (bandwidth, access time). In particular, large indexed databases, which usually suffer from prohibitive random access time when stored on a standard HDD, will largely benefit from an implementation on ReMIX (the random access latency of ReMIX is three orders of magnitude lower than that of a commodity HDD).
This three-year project (October 2003 - September 2006), coordinated by the Symbiose project, was funded by the French ministry (ACI Data Mass program). The R2D2 team has been strongly involved in both the design of the hardware platform and the porting of a Content Based Image Retrieval application.
Efficient Coding or Modulation Schemes for On-Chip Interconnection Networks
Participants : Sébastien Pillement, Olivier Sentieys.
We have worked on new crosstalk avoidance coding schemes for on-chip buses (crosstalk is a disturbance, caused by electromagnetic interference, along a circuit or a cable pair). These schemes encode the sequence of bits sent on each line of a bus transferring a packet in order to eliminate worst-case crosstalk patterns. They improve the delay on the link at the cost of doubling the number of transmitted bits. The advantage of the presented solutions is that they have no wiring overhead, so they are independent of the bus bit-width. The coding schemes allow a 50% increase in data rate for a 1-mm bus. Moreover, the proposed solutions induce a direction in deep-submicron noise that can be used to implement a noise-tolerant system.
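To make the notion of a worst-case crosstalk pattern concrete, the following minimal sketch flags the wires of a bus that switch while both of their neighbours switch in the opposite direction. This is the usual textbook definition of the worst case, taken here as an assumption; it is not the coding scheme of the cited work, and the function names are hypothetical.

```python
def transition(prev_bit, next_bit):
    """Return +1 for a rising edge, -1 for a falling edge, 0 for no change."""
    return next_bit - prev_bit


def worst_case_victims(prev_word, next_word):
    """Indices of inner wires that switch against both of their neighbours.

    prev_word and next_word are equal-length sequences of 0/1 values
    observed on the bus during two consecutive cycles.
    """
    if len(prev_word) != len(next_word):
        raise ValueError("both bus words must have the same width")
    deltas = [transition(p, n) for p, n in zip(prev_word, next_word)]
    victims = []
    for i in range(1, len(deltas) - 1):
        d = deltas[i]
        # Worst case (assumed model): the wire toggles and both adjacent
        # wires toggle in the opposite direction at the same time.
        if d != 0 and deltas[i - 1] == -d and deltas[i + 1] == -d:
            victims.append(i)
    return victims
```

A crosstalk avoidance code can then be checked against such a detector: a sequence of encoded words is acceptable under this model if no pair of consecutive words yields a victim.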
We introduced a new ternary link including a binary-to-ternary encoder and a ternary-to-binary decoder in voltage-mode multiple-valued logic (MVL). This link improves on the transistor count of existing designs and has no DC current path. The complete link was simulated with SPICE in a 0.13 µm CMOS technology. It also shows interesting advantages in power consumption for global interconnects compared to full-swing binary signaling (up to 56.4% less energy consumption). Its low propagation delay is a further advantage in the design of high-speed on-chip links for asynchronous systems.
We have also introduced a new coding scheme that simultaneously addresses several issues of interconnect design. It accelerates data transfer on a bus or on a network-on-chip by removing the worst-case patterns that cause crosstalk issues. This is achieved by skewing odd and even signals on the link. The implementation of this system is very simple and area-efficient. It improves bandwidth by a factor greater than 2.3 on a metal-2 bus in UMC 0.13 µm CMOS technology, with the same number of wires as a shielded bus. Furthermore, the propagation delay is well controlled, since the solution used to counter crosstalk removes all transition patterns but two. It also greatly improves noise tolerance by combining two error-detecting codes, at the expense of a small number of additional wires. The first code uses temporal redundancy and the second code is a parity-based scheme. This property enables us to lower the power supply voltage in order to reduce power consumption.
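The report does not state which mapping the encoder and decoder use. As an illustration only, the sketch below assumes plain radix conversion, packing three bits into two trits (possible because 2^3 = 8 is at most 3^2 = 9). The function names and the grouping are hypothetical.

```python
def bits_to_trits(bits):
    """Encode exactly three bits (MSB first) as two trits (MSB first)."""
    if len(bits) != 3 or any(b not in (0, 1) for b in bits):
        raise ValueError("expected three binary digits")
    value = bits[0] * 4 + bits[1] * 2 + bits[2]
    return [value // 3, value % 3]


def trits_to_bits(trits):
    """Decode two trits (MSB first) back into three bits (MSB first)."""
    if len(trits) != 2 or any(t not in (0, 1, 2) for t in trits):
        raise ValueError("expected two ternary digits")
    value = trits[0] * 3 + trits[1]
    if value > 7:
        # The pair (2, 2) is never produced by the encoder in this mapping.
        raise ValueError("unused code word")
    return [(value >> 2) & 1, (value >> 1) & 1, value & 1]
```

Under this assumed mapping, two ternary wires carry the payload of three binary wires, which is the kind of interconnect reduction a ternary link aims at.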
Wireless sensor networks
Participants : Olivier Berder, Mickaël Cartron, François Charot, Ludovic L'Hours, Patrice Quinton, Olivier Sentieys, Charles Wagner, Tuan-Duc Nguyen.
In 2006, wireless sensor networks became an important research domain for R2D2 because of the great potential of this technology and the difficult challenges it poses.
Based on the prior activity and experience of Mickaël Cartron and Olivier Sentieys on energy efficiency for sensor nodes, R2D2 initiated an RNRT project named SVP (for SurVeiller et Prévenir) together with several companies and teams: CEA LETI, Thales, INRIA, LPBEM, AphyCare, ANACT, Lip6 and Institut Maupertuis. This project aims at developing platforms for sensor network applications. One of these applications is the monitoring of children's physical activity. For this application, R2D2 will be responsible for developing the hardware and software infrastructure of the prototype. This development will build on the experience accumulated by Mickaël Cartron on the prototype already developed in the team. Moreover, R2D2 is also involved in a region-sponsored research project named CAPTIV, which studies sensor networks for automotive applications.
Beyond these applications – which are per se interesting research, as few such experimental platforms exist – the aim of our research is to study the relationship between algorithms and energy efficiency in such distributed environments.
In his thesis, Mickaël Cartron presents results on the optimization of energy efficiency for low-level communications, obtained by looking for trade-offs between the transmission power and the error rate of the transmission. He has been able to show that an optimal trade-off exists where 75% of the power can be saved compared to the worst-case transmission. This study could be the basis of hardware implementations of low-level communications, using either ASIC technology or reconfigurable hardware.
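The existence of such an optimum can be illustrated with a simple model, which is an assumption of this sketch and not the model used in the thesis: uncoded BPSK over an AWGN channel, a packet retransmitted until it is received without error, and a fixed circuit energy per bit. Energies are normalized by the noise density, and all numerical values are hypothetical.

```python
import math


def bit_error_rate(snr_per_bit):
    """BER of uncoded BPSK over an AWGN channel, for a linear Eb/N0."""
    return 0.5 * math.erfc(math.sqrt(snr_per_bit))


def energy_per_delivered_packet(snr_per_bit, packet_bits=256, circuit_energy=2.0):
    """Expected normalized energy spent until one packet is received intact.

    The transmit energy per bit equals snr_per_bit (in units of N0) and
    circuit_energy is a fixed per-bit overhead in the same units.
    """
    success = (1.0 - bit_error_rate(snr_per_bit)) ** packet_bits
    # On average 1 / success transmissions are needed per delivered packet.
    return packet_bits * (snr_per_bit + circuit_energy) / success


def best_operating_point(candidates):
    """Return the candidate Eb/N0 that minimizes the expected energy."""
    return min(candidates, key=energy_per_delivered_packet)


# Linear Eb/N0 values from 1.0 to 20.0 in steps of 0.5 (hypothetical range).
candidates = [0.5 * k for k in range(2, 41)]
best = best_operating_point(candidates)
```

With too little power the retransmissions dominate the cost, and with too much power the energy is wasted on an already reliable link, so the minimum lies strictly inside the range.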
Optimization of energy efficiency will be a research subject in the SVP and CAPTIV projects. In SVP, the problem will be considered at the application level, in order to find trade-offs between local storage, compression, and transmission of data. In CAPTIV, the problem will be considered at the low-level transmission layer, by trying to use MIMO (Multiple-Input Multiple-Output) systems to improve the energy efficiency of communications. In such a scheme, some nodes can cooperate at both the transmission and the reception sides in order to create a distributed MIMO system. By using cooperative MIMO transmission instead of SISO, it is shown that the distance between nodes can be increased and that a large part of the total energy can be saved for medium- and long-distance transmission. Considering Alamouti and Tarokh space-time block codes (STBC), we proposed an optimal selection of the number of antennas at both the transmission and the reception sides with respect to the transmission distance. The energy efficiency of cooperative MIMO compared to SISO and multi-hop SISO was demonstrated by simulation, and we proposed a multi-hop technique for cooperative MIMO that represents a very interesting compromise when the number of available cooperative nodes is limited.
Note that all this research activity is driven by the strong links of R2D2 with the Aphycare company (http://www.aphycare.com/), a spin-off of the R2D2 team, whose activity aims at developing wireless sensor nodes for the care of elderly people.
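As a reminder of how the Alamouti code works, the sketch below encodes two symbols over two transmitters and two time slots, then recovers them with linear combining at a single receiver. In a cooperative setting the two transmitters would be two cooperating nodes. The channel is assumed flat, noise-free, constant over both slots and perfectly known; these are simplifications of this sketch, and the function names are hypothetical.

```python
def alamouti_encode(s1, s2):
    """Return the (transmitter 1, transmitter 2) symbols for two time slots."""
    return [(s1, s2), (-s2.conjugate(), s1.conjugate())]


def transmit(slots, h1, h2):
    """Noise-free flat channel with gains h1 and h2, constant over both slots."""
    return [h1 * a + h2 * b for a, b in slots]


def alamouti_combine(r1, r2, h1, h2):
    """Linear combining at one receiver with known channel gains."""
    s1_hat = h1.conjugate() * r1 + h2 * r2.conjugate()
    s2_hat = h2.conjugate() * r1 - h1 * r2.conjugate()
    # Both estimates are scaled by the total channel power |h1|^2 + |h2|^2.
    gain = abs(h1) ** 2 + abs(h2) ** 2
    return s1_hat / gain, s2_hat / gain


# Hypothetical symbols and channel gains.
s1, s2 = complex(1, 1), complex(-1, 1)
h1, h2 = complex(0.8, -0.3), complex(-0.2, 0.6)
r1, r2 = transmit(alamouti_encode(s1, s2), h1, h2)
s1_hat, s2_hat = alamouti_combine(r1, r2, h1, h2)
```

The combining step sums the power of both channels, which is the source of the diversity gain that lets cooperative nodes lower their transmit energy for a given error rate.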
Multiple-Valued Logic (MVL) architectures and circuits
Participants : Daniel Chillet, Ekué Kinvi-Boh, Olivier Sentieys.
In this work, we focus on the design of new architectures based on the principle of Multiple-Valued Logic (MVL). Potential advantages of such architectures include a reduction in the number of on-chip interconnections and reduced packaging. The aim is thus to validate, by testing fabricated ternary circuits, the concepts known as SUS-LOC (SUpplementary Symmetrical LOgic Circuit), which allow circuits to be designed in ternary logic and which are based on the use of depletion-mode and enhancement-mode MOSFET transistors. This requires setting up a methodology and design tools suited to ternary logic and SUS-LOC concepts. This research work is organized around three main points, developed in the thesis of Ekué Kinvi-Boh.
The first aspect of our work consisted in designing and characterizing SUS-LOC circuits. A characterization flow was developed to extract circuit performance figures such as delay and the energy consumed by transitions. An experimental library of transistor models enabled us to quickly define a design methodology based on the method of the switches. Although more complex, the SUS-LOC circuits have delay and average energy consumption close to those of their binary equivalents.
The second aspect relates to the modeling and performance estimation of SUS-LOC circuits. A performance estimation flow was developed, starting from VHDL models of the designed circuits. The delay and energy consumption data obtained from the characterization of the basic structures of the modelled circuit are integrated into this flow. A VHDL package dedicated to ternary logic was also developed. The performance figure estimated by this flow is the total energy consumed by a circuit over a given duration.
The last aspect of this research relates to the fabrication of integrated circuits, including a four-trit (TeRnary digIT) adder and a 64-cell ternary SRAM memory. The fabrication flow is based on a ternary design-kit developed in collaboration with the UCL in Louvain-la-Neuve. The SUS-LOC circuits, fabricated in a 2 µm SOI technology, required the addition of new masks to the fabrication process. Static and dynamic tests were carried out on the fabricated SUS-LOC chips.
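The principle of such an estimate can be sketched as follows: each change of logic level on a ternary signal is charged the energy measured for that transition during characterization, and the charges are summed over the observed duration. The actual flow works on VHDL models; Python is used here only for illustration, and the table values are hypothetical figures in arbitrary units.

```python
def total_transition_energy(trace, energy_table):
    """Sum the characterized energy of every level change in a signal trace.

    trace is a sequence of ternary levels (0, 1 or 2) sampled over time;
    energy_table maps (previous level, next level) to an energy value.
    """
    total = 0.0
    for previous, current in zip(trace, trace[1:]):
        if previous != current:
            total += energy_table[(previous, current)]
    return total


# Hypothetical per-transition energies (arbitrary units), not measured data.
EXAMPLE_ENERGY = {
    (0, 1): 1.0, (1, 0): 1.0,
    (1, 2): 1.5, (2, 1): 1.5,
    (0, 2): 2.0, (2, 0): 2.0,
}
```

A full estimate would repeat this sum for every characterized basic structure of the circuit and add the results.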
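The behaviour expected from a four-trit adder can be described by a short reference model, useful when checking test vectors. The sketch assumes unsigned ternary digits 0, 1 and 2 with the least significant trit first; the report does not state the digit encoding of the fabricated adder, and the function name is hypothetical.

```python
def add_trits(a, b, carry_in=0):
    """Ripple-carry addition of two equal-length trit vectors (LSB first).

    Returns the sum trits (LSB first) and the carry out of the last stage.
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of trits")
    result = []
    carry = carry_in
    for x, y in zip(a, b):
        total = x + y + carry
        result.append(total % 3)  # sum trit of this stage
        carry = total // 3        # carry into the next stage (0 or 1)
    return result, carry
```

For example, adding 7 and 5, written LSB first as [1, 2, 0, 0] and [2, 1, 0, 0], gives 12 with no carry out, while adding 1 to the largest four-trit value 80 wraps to zero and sets the carry.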