
Section: Scientific Foundations


The development of complex embedded systems platforms requires assembling many hardware components: processor cores, application-specific co-processors, bus architectures, peripherals, etc. The hardware platform of a project is seldom entirely new. In fact, in most cases, 80 percent of the hardware components are re-used from previous projects or are simply COTS (Commercial Off-The-Shelf) components. There is no need to simulate these already proven components in great detail, whereas there is a need to run fast simulations of the software that uses them.

These requirements call for an integrated, modular simulation environment in which already proven components can be simulated quickly (possibly with real hardware in the loop), new components under design can be tested more thoroughly, and the software can be tested on the complete platform at reasonable speed.

Modularity and fast prototyping have also become important aspects of simulation frameworks, allowing alternative designs to be investigated with easier re-use and integration of third-party components.

The project aims at developing such a rapid-prototyping, modular simulation platform: it combines the modeling of new hardware components, verification techniques, and fast software simulation of proven components, and it is capable of running the real embedded software application without any change.

To fully simulate a complete hardware platform, one must simulate the processors and co-processors together with the peripherals, such as network controllers, graphics controllers, USB controllers, etc. A commonly used solution is an ISS (Instruction Set Simulator) connected to a Hardware Description Language (HDL) simulator, which can be implemented in software or on an FPGA [42]. These solutions tend to have slow design iteration cycles (programming the FPGA means the hardware has already been designed at a low level) and become very costly when large FPGA platforms are used. Others have implemented co-simulation environments using two separate technologies, typically one based on an HDL and the other on an ISS [33], [35], [50]. Communication and synchronization must then be designed and maintained between the two using some inter-process communication (IPC) mechanism, which slows down the simulation.

The idea we pursue is to combine hardware modeling and fast simulation into a fully integrated, software-based (no FPGA) simulation environment named SimSoC, which uses a single simulation loop thanks to Transaction Level Modeling (TLM) [29], [22] combined with a new ISS technology designed specifically to fit within the TLM environment.

The most challenging part of enhancing simulation speed is simulating the processors. Processor simulation is achieved with an Instruction Set Simulator (ISS), and there are several alternative techniques. In interpretive simulation, each instruction of the target program is fetched from memory, decoded, and executed. This method is flexible and easy to implement, but simulation is slow because much time is wasted in decoding; it is the technique used in SimpleScalar [28]. Another technique for implementing a fast ISS is dynamic translation [30], [49], [32], which has been favored by many [46], [32], [48], [49] in the past decade. With dynamic translation, the binary target instructions are fetched from memory at run-time, as in interpretive simulation. They are decoded on first execution, and the simulator translates them into another representation, which is stored in a cache. On subsequent executions of the same instructions, the translated, cached version is used. If the code is modified at run-time, the simulator invalidates the cached representation. Dynamic translation thus provides much faster simulation while keeping an advantage of interpretive simulation: it supports programs that use dynamic loading or self-modifying code.

There are typically two variants of dynamic translation: the target code is translated either directly into machine code for the simulation host, or into an intermediate representation that can be executed at high speed. Dynamic translation introduces a compilation phase as part of the overall simulation time, but because the resulting cached code is re-used, this compilation time is amortized.

Processor simulation is also achieved in virtual machines such as QEMU [24] and GXEMUL [34], which emulate to a large extent the behavior of a particular hardware platform. The technique used in QEMU is a form of dynamic translation: target code is translated directly into host machine code using pre-determined code patterns that have been pre-compiled with the C compiler. Both QEMU and GXEMUL include many open-source C device models, but this code is hard to re-use: the functions that emulate device accesses do not share a common prototype, and the scheduling of the parallel hardware entities is not specified precisely enough to guarantee compatibility between emulators or re-usability of third-party models following the standards of the electronics industry (e.g. IEEE 1666).

A challenge in the development of simulators is to maintain fast speed and simulation accuracy simultaneously. In the FORMES project, we expect to develop a dynamic translation technology that satisfies both of these additional objectives.

The SimSoC simulator is based on the TLM standard from OSCI [47]. The hardware components are modeled as TLM models and, since TLM is itself based on SystemC, the simulation is driven by the SystemC [38] kernel. We use standard, unmodified SystemC (version 2.2), hence the simulator has a single simulation loop. The interconnection between components is an abstract bus similar to the TLM TAC abstract bus open-sourced by STMicroelectronics [43]. Each processor simulated on the platform is abstracted as a particular TLM class. This class is both an initiator (it can initiate transactions) and a target (it can process transactions): it acts as an initiator to issue I/Os, and it behaves as a target essentially to receive boot or halt signals and interrupt notifications from the interrupt controller. Memory and I/O controllers are also modeled as TLM classes. The simulated platform can include multiple heterogeneous processors, for example a general-purpose CPU and a DSP; each processor is then abstracted by a TLM class, and the processors communicate among themselves and with the I/O controllers via TLM transactions. Related research on TLM models includes [44], [52], [45].

