## Section: Scientific Foundations

### Models of Computation and Communication (MoCCs)

Participants: Charles André, Julien Boucaron, Anthony Coadou, Liliana Cucu [EPI TRIO], Robert de Simone, Jean-Vivien Millo, Dumitru Potop-Butucaru, Yves Sorel.

Because of their formal semantics, the various Models of Computation and Communication (MoCCs) considered in our team can be used in an effective design-flow methodology based on model transformation, representing the compilation, synthesis, analysis, and optimization of concurrent embedded applications onto parallel and multicore embedded architectures. Allocation, in that sense, comprises both a physical distribution/placement aspect and a temporal scheduling aspect. Timing constraints and requirements may be expressed, and they have to be checked and preserved throughout the process.

This type of incremental design flow may be applied to represent a number of existing, theoretical or practical approaches to the design of embedded systems.

#### Synchronous reactive formalisms

In synchronous reactive models, the various concurrent processes all run at the speed of a common global logical clock, which defines the instantaneous reaction step. Synchronous formalisms provide an accurate representation of both hardware and scheduled embedded concurrent software; in both cases, simultaneous behaviors within a single global instant are allowed, and often even required.
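
As an illustration, the reaction step can be pictured as every process reading the same input snapshot and reacting within one logical instant. The sketch below is our own minimal Python rendering of this idea, not the semantics of any particular synchronous language:

```python
# Minimal sketch of a synchronous reaction step (illustrative only):
# all processes react to the same input snapshot within one logical
# instant of the global clock; outputs are merged afterwards.

def make_counter():
    """A tiny 'process': counts the instants where its input is True."""
    state = {"count": 0}
    def step(inputs):
        if inputs.get("tick"):
            state["count"] += 1
        return {"count": state["count"]}
    return step

def reaction(processes, inputs):
    """One instant of the global logical clock."""
    outputs = {}
    for p in processes:               # all processes see the same inputs
        outputs.update(p(inputs))
    return outputs

procs = [make_counter()]
out1 = reaction(procs, {"tick": True})
out2 = reaction(procs, {"tick": False})
out3 = reaction(procs, {"tick": True})
print(out1, out2, out3)  # counts 1, 1, 2 across the three instants
```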

Examples include Esterel/SyncCharts, Lustre/Scade, and Signal/PolyChrony. Esterel and SyncCharts are control-oriented, state-based formalisms [46], [43], while *Lustre* and *Signal* are declarative, data-flow-based formalisms. Synchronous formalisms were discussed in many articles and book chapters, amongst which [50], [52], [45], [4], [2], [8].

The INRIA spin-off Esterel-EDA now develops and markets the industrial versions Esterel Studio and SCADE together with their programming environments.

#### GALS and multiclock extensions

Purely single-clock synchronous formalisms often prove excessively demanding when users write large system descriptions involving different clock domains. Independent logical clocks may be used to represent (total or partial) asynchrony amongst concurrent processes. Globally-Asynchronous/Locally-Synchronous (GALS) and polychronous/multiclock models are handy extensions that provide flexibility and modularity in system design. The recently proposed theory of latency-insensitive design (LID) with elastic time is a good example of such an approach: specific protocol elements may be inserted between existing “black-box” IP block components, at a later design stage, to make them comply with imperative latencies on the global communications.
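
Such protocol elements can be pictured as small elastic buffers with back-pressure. The sketch below is our own simplified rendering of this mechanism, not an actual LID implementation: a two-place relay station inserted between two black-box blocks stalls the producer when full and emits void when empty.

```python
# Hedged sketch of latency-insensitive "elastic" communication: a relay
# station buffers data between two black-box IP blocks and applies
# back-pressure (stall) so that the producer only fires when there is room.
from collections import deque

class RelayStation:
    """Two-place elastic buffer: the extra place lets it absorb one
    in-flight token at the instant it raises the stall signal."""
    def __init__(self, capacity=2):
        self.buf = deque()
        self.capacity = capacity
    def can_accept(self):
        return len(self.buf) < self.capacity
    def push(self, token):
        assert self.can_accept()      # producer must respect the stall
        self.buf.append(token)
    def pop(self):
        return self.buf.popleft() if self.buf else None  # None = void token

rs = RelayStation()
rs.push("a"); rs.push("b")
assert not rs.can_accept()            # back-pressure: producer stalls
assert rs.pop() == "a"                # tokens exit in order
assert rs.can_accept()                # room again after one pop
```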

In any case, the basic synchronous model remains the underlying semantic level for
behaviors, where the reaction step is defined. Natural properties
(such as endochrony/asynchrony) allow GALS and multiclock
descriptions to be viewed as higher-level versions with a natural synchronous interpretation
provided by simple scheduling. The monoclock version is then obtained by
dedicated scheduling techniques known as *clock calculus*.

#### Process Networks

The previous model extensions often call for general results from branches of Theoretical Computer Science and Concurrency Theory, such as Process Networks and Process Calculi. Process Networks comprise Petri Nets and Kahn Networks, as well as various specializations and generalizations, such as Event/Marked Graphs and Data-Flow Domains (Synchronous, Boolean, CycloStatic, CycloDynamic, ...), while Process Algebras such as CCS and CSP gave rise to extensions with simultaneous events in SCCS and Meije. We have a long-standing background in this field, and more generally in the use of formal operational semantics in the design of verification and analysis techniques for such systems. We bridged the gap by using such models to study techniques for optimized placement and (static) scheduling. The specific features of hardware targets let us phrase these questions in a specific context, where ad-hoc ultimately periodic regimes can be established.
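
The firing rule of Marked/Event Graphs, central to these models, can be sketched as follows (an illustrative toy encoding of ours, not any particular tool's representation): a node fires when every incoming edge holds a token, consuming one per input edge and producing one per output edge.

```python
# Illustrative marked (event) graph: each edge carries a token count;
# a node fires when every incoming edge holds at least one token.

def enabled(node, marking, in_edges):
    return all(marking[e] >= 1 for e in in_edges[node])

def fire(node, marking, in_edges, out_edges):
    assert enabled(node, marking, in_edges)
    for e in in_edges[node]:          # consume one token per input edge
        marking[e] -= 1
    for e in out_edges[node]:         # produce one token per output edge
        marking[e] += 1

# Two nodes in a cycle: edge BA initially holds one token, AB holds none.
in_edges  = {"A": ["BA"], "B": ["AB"]}
out_edges = {"A": ["AB"], "B": ["BA"]}
marking   = {"AB": 0, "BA": 1}

assert enabled("A", marking, in_edges) and not enabled("B", marking, in_edges)
fire("A", marking, in_edges, out_edges)
assert marking == {"AB": 1, "BA": 0}  # the token moved; now B is enabled
```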

#### Static k-periodic scheduling and routing

Following our *time refinement* approach, early untimed causal
models may be transformed into multiclock or GALS ones, then precisely
scheduled onto a uniform single time. This type of approach is used, for instance,
in the static k-periodic scheduling of dataflow process networks such as
Event Graphs [47], [44],
or Synchronous DataFlow graphs [54] and their various extensions
in UC Berkeley's Ptolemy.
We extended the approach by providing means for
designers to state their own time constraints in a given modeling
framework, and to express the actual refinement from a given (abstract) time
frame to another, more concrete one.
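
For Synchronous DataFlow graphs, the entry point of such static scheduling is solving the balance equations r[u]·prod = r[v]·cons on every edge. The sketch below is a straightforward textbook computation of the minimal repetition vector, not the team's own tooling:

```python
# Sketch: repetition vector of a Synchronous DataFlow graph, obtained by
# propagating rational firing rates along edges and scaling to least
# integers. Assumes a connected, consistent graph (no consistency check).
from fractions import Fraction
from math import lcm

def repetition_vector(edges, actors):
    """edges: list of (src, prod_rate, dst, cons_rate)."""
    r = {actors[0]: Fraction(1)}
    changed = True
    while changed:                    # propagate rates until all actors set
        changed = False
        for u, p, v, c in edges:
            if u in r and v not in r:
                r[v] = r[u] * p / c; changed = True
            elif v in r and u not in r:
                r[u] = r[v] * c / p; changed = True
    scale = lcm(*(f.denominator for f in r.values()))
    return {a: int(r[a] * scale) for a in actors}

# A produces 2 tokens per firing, B consumes 3: balance needs 3 A's per 2 B's.
print(repetition_vector([("A", 2, "B", 3)], ["A", "B"]))  # {'A': 3, 'B': 2}
```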

This theory of modulo and k-periodic static scheduling for process networks (mostly Marked/Event Graphs) has recently seen renewed interest due to its application in the context of Latency-Insensitive Design [48] of SoCs. The nature of the communication channels used there as interconnect fabric demands optimal buffer/place sizing, with corresponding flow control. We contributed several results in this direction, with a fine characterization of optimal algorithmic techniques to provide such ultimately k-periodic schedules. They are progressively implemented in the K-Passa prototype software, described in 5.2.
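
The throughput that such k-periodic schedules can achieve is bounded by the critical cycle of the marked graph: the minimum, over all cycles, of (initial tokens on the cycle) / (cycle latency). The brute-force sketch below (ours, workable only for tiny graphs; real tools use efficient minimum-cycle-ratio algorithms) illustrates the computation:

```python
# Hedged illustration: throughput bound of a marked graph under ASAP
# firing = min over cycles of (tokens on cycle) / (cycle latency).
# Brute-force cycle enumeration, for demonstration only.
import itertools

def cycle_ratio(nodes, edges):
    """edges: dict (u, v) -> (tokens, latency). Returns min cycle ratio."""
    best = None
    for k in range(1, len(nodes) + 1):
        for cyc in itertools.permutations(nodes, k):
            pairs = list(zip(cyc, cyc[1:] + cyc[:1]))  # close the cycle
            if all(p in edges for p in pairs):
                toks = sum(edges[p][0] for p in pairs)
                lat = sum(edges[p][1] for p in pairs)
                ratio = toks / lat
                best = ratio if best is None else min(best, ratio)
    return best

# Two-node loop with 1 token and total latency 2 -> throughput 1/2:
edges = {("A", "B"): (1, 1), ("B", "A"): (0, 1)}
print(cycle_ratio(["A", "B"], edges))  # 0.5
```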

#### AAA models

The AAA (*Algorithm-Architecture Adequation*) methodology, which is intended
for optimizing distributed real-time embedded systems, relies on three models.

The Algorithm model is an extension of the well-known data-flow model of
Dennis [49]. It is a directed acyclic hyper-graph (DAG) that we
call a “conditioned factorized data dependence graph”, whose vertices are
“operations” and whose hyper-edges are directed “data or control dependences”
between operations. The data dependences define a partial order on the
execution of the operations. The basic data-flow model was extended in three
directions. First, infinite (resp. finite) repetition of a sub-graph pattern,
in order to specify the reactive aspect of real-time systems (resp. in order to
specify the finite repetition of a sub-graph consuming different data, similar
to a loop in imperative languages). Second, “state”, when data dependences are
necessary between different infinite repetitions of the sub-graph pattern;
this introduces cycles, which must be avoided by introducing specific vertices
called “delays” (similar to z^{-n} in automatic control). Third,
“conditioning” of an operation by a control dependence, similar to a conditional
control structure in imperative languages, allowing the execution of
alternative subgraphs. Delays combined with conditioning allow the programmer
to specify the automata necessary for describing “mode changes”.
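
The role of a delay vertex can be illustrated on a tiny repeated pattern (our own toy rendering, not SynDEx syntax): the delay carries a value from one repetition of the pattern to the next, like z^{-1}, breaking what would otherwise be a dependence cycle.

```python
# Sketch of the "delay" idea in an infinitely repeated data-flow pattern
# (illustrative names, ours): the delay vertex holds a value between two
# successive repetitions, so the pattern can depend on its own past output.

def run_repetitions(inputs, initial_state=0):
    """Each repetition computes out = in + previous_out; the delay
    vertex holds previous_out between repetitions."""
    delayed = initial_state           # token stored in the delay vertex
    outputs = []
    for x in inputs:                  # infinite repetition, truncated here
        out = x + delayed             # 'adder' operation of the pattern
        outputs.append(out)
        delayed = out                 # delay forwards out to the next repetition
    return outputs

print(run_repetitions([1, 2, 3]))     # running sum: [1, 3, 6]
```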

The Architecture model is a directed graph, whose vertices are of two types: “processor” (one sequencer of operations and possibly several sequencers of communications) and “medium” (support of communications), and whose edges are directed connections.

The implementation model [5] is also a directed graph, obtained through an external compositional law by which an architecture graph operates on an algorithm graph to give, as a result, a new algorithm graph, corresponding to the initial algorithm graph distributed and scheduled according to the architecture graph.

#### Distributed Real-Time Scheduling and Optimization

We address two main issues: monoprocessor and multiprocessor real-time scheduling, where constraints must mandatorily be met, otherwise dramatic consequences may occur (hard real-time), and where resources must be minimized because the systems are embedded.

In our monoprocessor real-time scheduling work, besides the classical deadline constraint, often equal to a period, we take into consideration dependences between tasks and several, possibly related, latencies. A latency is a generalization [3] of the typical “end-to-end” constraint. Dealing with multiple real-time constraints raises the complexity of the issue. Moreover, because preemption leads to a waste of resources due to its approximation in the WCET (Worst-Case Execution Time) of every task, as proposed by Liu and Layland [55], we first studied non-preemptive real-time scheduling with dependence, periodicity, and latency constraints. Although a bad approximation may have dramatic consequences on real-time scheduling, there is little research on this topic. We have been investigating preemptive real-time scheduling for a few years, seeking the exact cost of preemption so that it can be integrated into schedulability conditions and the corresponding scheduling algorithms. More generally, we are interested in integrating into the schedulability analyses the cost of the RTOS (Real-Time Operating System), for which the exact cost of preemption is the most difficult part, because it varies with each task instance [7]. Finally, we also investigate the problem of mixing hard and soft real-time constraints, which arises in the most complex applications.
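
To illustrate how a preemption cost can enter a schedulability condition, the sketch below runs a textbook fixed-priority response-time analysis in which a per-preemption penalty `cp` (our simplification; as noted above, the exact cost actually varies per task instance) is added to each higher-priority interference term:

```python
# Hedged sketch: fixed-priority response-time analysis with an explicit
# per-preemption cost 'cp' added to each interfering execution, instead
# of folding it approximately into the WCET.
from math import ceil

def response_time(tasks, i, cp):
    """tasks: list of (C, T) pairs sorted by priority (index 0 highest),
    deadline = period. Fixed-point iteration on
    R = C + sum over higher-priority j of ceil(R/Tj) * (Cj + cp)."""
    C, T = tasks[i]
    R = C
    while True:
        R_next = C + sum(ceil(R / Tj) * (Cj + cp) for Cj, Tj in tasks[:i])
        if R_next > T:
            return None               # deadline (= period) missed
        if R_next == R:
            return R                  # fixed point reached
        R = R_next

tasks = [(1, 4), (2, 8)]              # (WCET, period), highest priority first
print(response_time(tasks, 1, cp=0))  # 3: schedulable with free preemption
print(response_time(tasks, 1, cp=1))  # 4: the preemption cost inflates R
```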

The second research area is devoted to distributed real-time scheduling with embedding constraints. We use the results obtained in the monoprocessor case to derive solutions for the problem of multiprocessor (distributed) real-time scheduling. In addition to satisfying the multiple real-time constraints mentioned in the monoprocessor case, we have to minimize the total execution time (makespan), since we deal with automatic control applications involving feedback. Furthermore, the domain of embedded systems leads to resource minimization problems. Since these optimization problems are NP-hard, we develop exact algorithms (B & B, B & C), which are optimal for simple problems, and heuristics, which are sub-optimal for the realistic problems corresponding to industrial needs. Some time ago we proposed a very fast “greedy” heuristic [51] whose results have been regularly improved and extended with local neighborhood heuristics [6], or used as initial solutions for metaheuristics such as variants of “simulated annealing”.
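
A generic greedy list-scheduling heuristic of this flavor (our illustrative sketch, not the heuristic of [51]; communication costs are ignored for brevity) repeatedly picks the ready task and processor pair that finishes earliest:

```python
# Illustrative greedy heuristic for mapping a task DAG onto processors,
# approximating makespan minimization (communication costs ignored).

def greedy_schedule(tasks, deps, cost, n_procs):
    """tasks: list of names; deps: dict task -> set of predecessors;
    cost: dict task -> duration. Returns (assignment, makespan)."""
    finish = {}                       # task -> finish time
    proc_free = [0.0] * n_procs       # next free instant of each processor
    assign = {}
    while len(finish) < len(tasks):
        ready = [t for t in tasks if t not in finish
                 and all(p in finish for p in deps.get(t, ()))]
        best = None
        for t in ready:               # earliest-finish-time greedy choice
            est = max([finish[p] for p in deps.get(t, ())], default=0.0)
            for q in range(n_procs):
                f = max(est, proc_free[q]) + cost[t]
                if best is None or f < best[0]:
                    best = (f, t, q)
        f, t, q = best
        finish[t] = f; proc_free[q] = f; assign[t] = q
    return assign, max(finish.values())

deps = {"C": {"A", "B"}}
assign, makespan = greedy_schedule(["A", "B", "C"], deps,
                                   {"A": 2, "B": 2, "C": 1}, 2)
print(makespan)  # A and B run in parallel (2), then C: makespan 3.0
```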

Finally, since real-time distributed architectures are prone to failures, we study the possibility of tolerating faults in such systems. We focus on software redundancy rather than hardware redundancy to guarantee the same real-time behaviour of the system in the presence of a number of faulty processors and communication media specified by the designer. We investigate fail-silent, transient, intermittent, and Byzantine faults.