Activity Report 2013

Project-Team AOSTE

Models and methods of analysis and optimization for systems with real-time and embedding constraints

IN COLLABORATION WITH: Laboratoire informatique, signaux systèmes de Sophia Antipolis (I3S)

RESEARCH CENTERS
Sophia Antipolis - Méditerranée
Paris - Rocquencourt

THEME
Embedded and Real-time Systems

Table of contents

1. Members
2. Overall Objectives
   2.1. Embedded System Design
   2.2. Highlights of the Year
3. Research Program
   3.1. Models of Computation and Communication (MoCCs)
      3.1.1. K-periodic static scheduling and routing in Process Networks
      3.1.2. Endochrony and GALS implementation of conflict-free polychronous programs
   3.2. Logical Time in Model-Driven Embedded System Design
   3.3. The AAA (Algorithm-Architecture Adequation) methodology and Real-Time Scheduling
      3.3.1. Algorithm-Architecture Adequation
      3.3.2. Distributed Real-Time Scheduling and Optimization
4. Application Domains
   4.1. Multicore System-on-Chip design
   4.2. Automotive and avionic embedded systems
5. Software and Platforms
   5.1. TimeSquare
   5.2. K-Passa
   5.3. SynDEx
   5.4. Lopht
   5.5. SAS
6. New Results
   6.1. Process Networks with routing for parallel architectures
   6.2. Formal analysis of MARTE Time Model and CCSL
   6.3. Logical time in Model-Driven Engineering of embedded systems
   6.4. Multiview modeling and power intent in Systems-on-chip
   6.5. Performance variability analysis on manycore architectures
   6.6. Off-line (static) mapping of real-time applications onto NoC-based many-cores
   6.7. WCET estimation for parallel code
   6.8. Real-time scheduling and code generation for time-triggered platforms
   6.9. Uniprocessor Real-Time Scheduling
      6.9.2. Real-Time Scheduling with Exact Preemption Cost
   6.10. Multiprocessor Real-Time Scheduling
      6.10.1. Multiprocessor Partitioned Scheduling with Exact Preemption Cost
      6.10.2. Multiprocessor Semi-Partitioned Mixed Criticality Scheduling
      6.10.3. Gateway with Modeling Languages for Certified Code Generation
      6.10.4. SynDEx updates with new results
   6.11. Probabilistic Real-Time Systems
7. Bilateral Contracts and Grants with Industry
   7.1. Kalray MPPA256 experiments
   7.2. Astrium/CNES PostDoc
   7.3. Kontron CIFRE
8. Partnerships and Cooperations
   8.1. Regional Initiatives
   8.2. National Initiatives
      8.2.1. ANR
         8.2.1.1. HeLP
         8.2.1.2. HOPE
         8.2.1.3. GeMoC
      8.2.2. FUI
         8.2.2.1. FUI P
         8.2.2.2. FUI PARSEC
         8.2.2.3. FUI CLISTINE
      8.2.3. Investissements d’Avenir
   8.3. European Initiatives
      8.3.1. FP7 Projects
      8.3.2. Collaborations in European Programs, except FP7
   8.4. International Initiatives
      8.4.1. Inria Associate Teams
      8.4.2. Inria International Labs
   8.5. International Research Visitors
9. Dissemination
   9.1. Scientific Animation
   9.2. Teaching - Supervision - Juries
      9.2.1. Teaching
      9.2.2. Supervision
      9.2.3. Juries
   9.3. Popularization
10. Bibliography

Keywords: Embedded Systems, Synchronous Languages, Model-Driven Engineering, Real-Time, Scheduling, Concurrency, Model Of Computation

Aoste is a joint team with the University of Nice/Sophia-Antipolis (UNS) and the UMR CNRS 7270: I3S. It is also co-located between the two Inria centers of Sophia-Antipolis and Rocquencourt.

Creation of the Project-Team: 2004 July 01.

1. Members

Research Scientists
- Robert de Simone [Team leader, Inria, Senior Researcher, Sophia Antipolis - Méditerranée, HdR]
- Yves Sorel [Team leader, Inria, Senior Researcher, Paris - Rocquencourt]
- Liliana Cucu [Inria, Researcher, since Sept 2013, Paris - Rocquencourt]
- Dumitru Potop Butucaru [Inria, Researcher, Paris - Rocquencourt]

Faculty Members
- Charles André [UNS, Professor, Emeritus until Aug 2013, Sophia Antipolis - Méditerranée, HdR]
- Julien Deantoni [UNS, Associate Professor, Sophia Antipolis - Méditerranée]
- Frédéric Mallet [UNS, Associate Professor, Sophia Antipolis - Méditerranée, HdR]
- Marie-Agnès Peraldi Frati [UNS, Associate Professor, Sophia Antipolis - Méditerranée]
- Sid Touati [UNS, Professor, Sophia Antipolis - Méditerranée, HdR]

External Collaborator
- Laurent George [Univ. Paris XII, Associate Professor, Paris - Rocquencourt, HdR]

Engineers
- Abderraouf Benyahia [Inria, FUI P project, Paris - Rocquencourt]
- Daniel de Rauglaudre [Inria, Paris - Rocquencourt]
- Cécile Stentzel [Inria, Paris - Rocquencourt]
- Adriana Gogonel [Inria, DEPARTS/PROXIMA projects, since May 2013, Paris - Rocquencourt]
- Arda Goknil [Inria, ARTEMIS PRESTO contract, until Oct 2013, Sophia Antipolis - Méditerranée]
- Luc Hogie [CNRS, Sophia Antipolis - Méditerranée]
- Cadé Lo [Inria, FP7 PROXIMA project, until Jul 2013, Paris - Rocquencourt]
- Cristian Maxim [Inria, DEPARTS/PROXIMA projects, since Aug 2013, Paris - Rocquencourt]
- Amin Oueslati [Inria, Kalray contract, since Mar 2013, Sophia Antipolis - Méditerranée]
- Meriem Zidouni [Inria, Paris - Rocquencourt]

PhD Students
- Mohamed Bergach [CIFRE contract with Kontron company, Sophia Antipolis - Méditerranée]
- Thomas Carle [Inria, Paris - Rocquencourt]
- Manel Djemal [Inria, Paris - Rocquencourt]
- Carlos Gomez Cardenas [UNS, Moniteur, Sophia Antipolis - Méditerranée]
- Amani Khecharem [Inria, ANR HOPE project, Sophia Antipolis - Méditerranée]
- Émilien Kofman [UNS, LabEx UCN@sophia, from Oct 2013, Sophia Antipolis - Méditerranée]
- Dorin Maxim [Inria, DEPARTS project, since Dec 2013, Paris - Rocquencourt]
- Falou Ndyoe [Inria, Paris - Rocquencourt]
- Matias Vara Larsen [CNRS, ANR GeMoC project, Sophia Antipolis - Méditerranée]

Post-Doctoral Fellows
- Jean-Vivien Millo [until Sept 2013, then ATER UNS, Sophia Antipolis - Méditerranée]
- Raúl Gorcitz [CNES, from Sep 2013, Paris - Rocquencourt]
- Zhen Zhang [Inria, until Jul 2013, Paris - Rocquencourt]

2. Overall Objectives

2.1. Embedded System Design

Typical embedded software applications mix multimedia signal/data processing with modal interfaces, resulting in heterogeneous concurrent data-flow streaming models, often under stringent real-time constraints. Similarly, embedded architectural platforms are becoming increasingly parallel, with dedicated hardware accelerators and manycore processors. The optimized compilation of such applications onto such execution platforms involves complex mapping issues, both in terms of spatial distribution and in terms of temporal scheduling. Currently, it is far from being a fully automatic compilation process, unlike the case of commodity PC applications. Models are thus needed, both as formal mathematical objects from theoretical computer science to provide foundations for embedded system design, and as engineering models to support an effective design flow.

Our general approach is directly inspired by the theories of synchronous languages, process networks, and real-time distributed scheduling. We insist on introducing logical time as a functional design ingredient, to be explicitly considered as a first-class modeling element of systems. Logical time is based on logical clocks, where a clock can be defined as any meaningful sequence of event occurrences, usually meant as activation/triggering conditions for actions and operations in the system. Logical time can thus be multiform: a global partial order built from the local total orders of clocks. In the course of the design flow, time refinement takes place as decisions are made on the placement and timing of the various tasks and operations. This partly resolves the constraints between clocks by committing to schedule and placement decisions. The final version should be totally ordered, and then subjected to physical timing verification against physical constraints.

The general (logical) Time Model has been standardized as part of the OMG profile for Modeling and Analysis of Real-Time Embedded systems (MARTE).

Work on polychronous formalisms (descending from ESTEREL), on a Clock Constraint Specification Language handling logical time, on the Application-Architecture Adequation approach, and on real-time scheduling results has progressed over the years, resulting in software environments such as SYNDEX and TimeSquare.

2.2. Highlights of the Year

The 2013 edition of the RTNS conference was organized in Sophia-Antipolis, with Robert de Simone as General Chair and Liliana Cucu-Grosjean as keynote speaker. Rob Davis, from the University of York, was granted a five-year Inria International Chair position in our team at Rocquencourt.

3. Research Program

3.1. Models of Computation and Communication (MoCCs)

Participants: Charles André, Robert de Simone, Jean-Vivien Millo, Dumitru Potop Butucaru.
Keywords: Esterel, SyncCharts, synchronous formalisms, Process Networks, Marked Graphs, Kahn networks, compilation, synthesis, formal verification, optimization, allocation, refinement, scheduling

Formal Models of Computation form the basis of our approach to Embedded System Design. Because of the growing importance of communication handling, communication is now included in the name, giving MoCC for short. The appeal of MoCCs comes from the fact that they combine features of mathematical models (formal analysis, transformation, and verification) with those of executable specifications (close to code level, simulation, and implementation). Examples of MoCCs in our case are mainly synchronous reactive formalisms and dataflow process networks. Various extensions provide greater expressivity, while specific restrictions yield more focused, decidable analysis results.

DataFlow Process Networks and Synchronous Reactive Languages such as Esterel/SyncCharts and SIGNAL/POLYCHRONY [64], [65], [59], [15], [4], [13] share one main characteristic: they are specified in a self-timed or loosely timed fashion, in the asynchronous data-flow style. But formal criteria in their semantics ensure that, under good correctness conditions, a sound synchronous interpretation can be provided, in which all treatments (computations, signaling communications) are precisely temporally mapped. This is referred to as clock calculus in synchronous reactive systems, and leads to a large body of theoretical studies and deep results in the case of DataFlow Process Networks [60], [58] (consider SDF balance equations for instance [67]).
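As an illustration of the SDF balance equations mentioned above, the following minimal sketch computes the repetition vector of a small graph, that is, the smallest positive integer solution of q[src] * produced = q[dst] * consumed on every edge. The function name `repetition_vector`, the edge encoding, and the three-actor example are assumptions made for this illustration; they are not taken from any tool of the team.

```python
from fractions import Fraction
from math import gcd, lcm  # multi-argument lcm needs Python 3.9+


def repetition_vector(actors, edges):
    """Solve q[src] * produced == q[dst] * consumed for every edge.

    edges: list of (src, dst, produced, consumed) tuples.
    Returns the smallest positive integer solution, or None when the
    rates are inconsistent. Assumes the graph is connected.
    """
    rate = {actors[0]: Fraction(1)}
    pending = [actors[0]]
    while pending:
        actor = pending.pop()
        for src, dst, produced, consumed in edges:
            if src == actor:
                other, value = dst, rate[actor] * produced / consumed
            elif dst == actor:
                other, value = src, rate[actor] * consumed / produced
            else:
                continue
            if other not in rate:
                rate[other] = value
                pending.append(other)
            elif rate[other] != value:
                return None  # no non-zero solution: unbalanced graph
    scale = lcm(*(r.denominator for r in rate.values()))
    counts = {a: int(r * scale) for a, r in rate.items()}
    common = gcd(*counts.values())
    return {a: n // common for a, n in counts.items()}


# Hypothetical chain: A produces 2 tokens per firing, B consumes 3, and so on.
example = repetition_vector(
    ["A", "B", "C"],
    [("A", "B", 2, 3), ("B", "C", 1, 2)],
)
```

Here A must fire three times for every two firings of B and one firing of C, so that each channel returns to its initial token count after one iteration.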

As a result, explicit schedules become an important ingredient of design, which can ultimately be considered and handled by the designers themselves. In practice such schedules are sought to optimize other parts of the design, mainly buffering queues: the production and consumption of data can be regulated in their relative speeds. This was especially taken into account in the recent theories of Latency-Insensitive Design [61] and N-synchronous processes [62], with some of our contributions [6].

Explicit schedule patterns should be pictured in the framework of low-power distributed mapping of embedded applications onto manycore architectures, where they could play an important role as theoretical formal models on which to compute and optimize allocations and performances. We describe below two lines of research in this direction. Striking in these techniques is the fact that they include time and timing as integral parts of early functional design. But this original time is logical, multiform, and only partially ordering the various functional computations and communications. This approach was radically generalized in our team to a methodology for logical time based design, described next (see 3.2).

3.1.1. K-periodic static scheduling and routing in Process Networks

In recent years we focused on the algorithmic treatment of ultimately k-periodic schedule regimes, which are the class of schedules obtained by many of the theories described above. An important breakthrough occurred when we realized that the ultimately periodic binary words used for reporting static scheduling results could also be employed to record a completely distinct notion, that of ultimately k-periodic route switching patterns, and furthermore that this common representation could make it easier to combine the two. A new model, named K-periodical Routed marked Graphs (KRG), was introduced and extensively studied for its algebraic and algorithmic properties [5].
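The dual use of ultimately periodic binary words can be sketched as follows. The same word, written as a finite prefix followed by a period repeated forever, is read once as a firing schedule (1 means "fire at this instant") and once as a routing pattern (the letter selects the output channel). The helper names `letters`, `rate` and `route` and the sample words are hypothetical, and this is not the KRG formalization of [5].

```python
from fractions import Fraction
from itertools import islice


def letters(prefix, period):
    """Yield the letters of the infinite word prefix.(period)^omega.

    `period` must be non-empty.
    """
    yield from prefix
    while True:
        yield from period


def rate(period):
    """Long-run fraction of 1s, which depends only on the periodic part."""
    return Fraction(period.count("1"), len(period))


def route(tokens, prefix, period):
    """Dispatch tokens to output 0 or 1 following the routing word."""
    outputs = ([], [])
    for token, letter in zip(tokens, letters(prefix, period)):
        outputs[int(letter)].append(token)
    return outputs


# Read as a schedule: idle once, then fire twice every three instants.
schedule = "".join(islice(letters("0", "101"), 7))

# Read as a routing pattern: first token to output 0, then alternate.
left, right = route(range(6), "0", "10")
```

Because both readings share one representation, a schedule and a routing pattern can be stored and transformed with the same operations on words.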

The computation of optimized static schedules and of optimal buffering configurations in the context of latency-insensitive design led to the development of the K-Passa software tool [5].

3.1.2. Endochrony and GALS implementation of conflict-free polychronous programs

The possibility of exploring various schedules for a given application comes from the fact that some behaviors are truly concurrent and mutually conflict-free (so they can be executed independently, in any order). Discovering potential asynchrony inside synchronous reactive specifications then becomes highly desirable. It can benefit distributed implementations, where signal communications are restricted to a minimum, as they usually incur a loss in performance and higher power consumption. This general line of research has come to be known as Endochrony, with some of our contributions [11].

3.2. Logical Time in Model-Driven Embedded System Design

Participants: Charles André, Julien Deantoni, Frédéric Mallet, Marie-Agnès Peraldi Frati, Robert de Simone.

Starting from specific needs and opportunities for formal design of embedded systems as learned from our work on MoCCs (see 3.1), we developed a Logical Time Model as part of the official OMG UML profile MARTE for Modeling and Analysis of Real-Time Embedded systems. With this model is associated a Clock Constraint Specification Language (CCSL), which makes it possible to state loose or strict logical time constraints between design ingredients, be they computations, communications, or any kind of events whose repetitions can be conceived as generating a logical conceptual clock (or activation condition). The definition of CCSL is provided in [1].

Our vision is that many (if not all) of the timing constraints generally expressed as physical prescriptions in real-time embedded design (such as periodicity, sporadicity) could be expressed in a logical setting, while actually many physical timing values are still unknown or unspecified at this stage. On the other hand, our logical view may express much more, such as loosely stated timing relations based on partial orderings or partial constraints.
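To give a concrete feel for logical clock constraints, the sketch below checks two relations on a finite trace, where each step is the set of clocks that tick together. It uses a simplified reading of "sub-clock" and "strict precedence"; the function names, the trace encoding and the example clocks are assumptions for this illustration and do not reproduce the CCSL syntax or semantics defined in [1].

```python
def is_subclock(trace, sub, sup):
    """True if `sub` only ticks in steps where `sup` also ticks."""
    return all(sup in step for step in trace if sub in step)


def strictly_precedes(trace, a, b):
    """True if the k-th tick of `a` occurs in an earlier step than
    the k-th tick of `b`, for every k observed in the trace."""
    seen_a = seen_b = 0
    for step in trace:
        # b may tick only if a is already strictly ahead before this step.
        if b in step and seen_a <= seen_b:
            return False
        seen_a += (a in step)
        seen_b += (b in step)
    return True


# Hypothetical trace: no physical dates, only the order and coincidence of ticks.
trace = [{"send"}, {"send", "tick"}, {"recv"}, {"recv", "tick"}]

ok_order = strictly_precedes(trace, "send", "recv")
ok_sub = is_subclock(trace, "tick", "send")
```

No physical duration appears anywhere: the constraints only relate occurrences of events, which is what allows them to be stated before timing values are known.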

So far we have used CCSL to express important phenomena as present in several formalisms: AADL (used in avionics domain), EAST-ADL2 (proposed for the AutoSar automotive electronic design approach), IP-Xact (for System-on-Chip (SoC) design). The difference here comes from the fact that these formalisms were formerly describing such issues in informal terms, while CCSL provides a dedicated formal mathematical notation. Close connections with synchronous and polychronous languages, especially Signal, were also established; so was the ability of CCSL to model dataflow process network static scheduling.

In principle the MARTE profile and its Logical Time Model can be used with any UML editor supporting profiles. In practice we focused on the Papyrus open-source editor, mainly from CEA LIST. We developed under Eclipse the TIMESQUARE solver and emulator for CCSL constraints (see 5.1), with its own graphical interface, as a stand-alone software module, while strongly coupled with MARTE and Papyrus.

While CCSL constraints may be introduced as part of the intended functionality, some may also be extracted from requirements imposed either by real-time user demands, or by the resource limitations and features of the intended execution platform. Sophisticated detailed descriptions of platform architectures are allowed using MARTE, as well as formal allocations of application operations (computations and communications) onto platform resources (processors and interconnects). This is of course of great value at a time when embedded architectures are becoming more and more heterogeneous and parallel or distributed, so that application mapping in terms of spatial allocation and temporal scheduling becomes harder and harder. This approach is extensively supported by the MARTE profile and its various models. As such it originates from the Application-Architecture-Adequation (AAA) methodology, first proposed by Yves Sorel, member of Aoste.

AAA aims at specific distributed real-time algorithmic methods, described next in 3.3.

Of course, while logical time in design is promoted here, and our works show how many current notions used in real-time and embedded systems synthesis can naturally be phrased in this model, there will be in the end a phase of validation of the logical time assumptions (as is the case in synchronous circuits and SoC design with timing closure issues). This validation is usually conducted from Worst-Case Execution Time (WCET) analysis on individual components, which are then used in further analysis techniques to establish the validity of logical time assumptions (as partial constraints) asserted during the design.

3.3. The AAA (Algorithm-Architecture Adequation) methodology and Real-Time Scheduling

Participants: Laurent George, Dumitru Potop Butucaru, Yves Sorel.

Note: The AAA methodology and the SynDEx environment are fully described at http://www.syndex.org/, together with relevant publications.

3.3.1. Algorithm-Architecture Adequation

The AAA methodology relies on distributed real-time scheduling and relevant optimization to connect an Algorithm/Application model to an Architectural one. We now describe its premises and benefits.

The Algorithm model is an extension of the well-known data-flow model from Dennis [63]. It is a directed acyclic hyper-graph (DAG) that we call “conditioned factorized data dependence graph”, whose vertices are “operations” and whose hyper-edges are directed “data or control dependences” between operations. The data dependences define a partial order on the execution of operations. The basic data-flow model was extended in three directions. First, repetition of a sub-graph pattern: infinite repetition specifies the reactive aspect of real-time systems, while finite repetition specifies a sub-graph that consumes different data at each repetition, similar to a loop in imperative languages. Second, “state”: when data dependences are necessary between different infinite repetitions of the sub-graph pattern, they introduce cycles, which must be avoided by introducing specific vertices called “delays” (similar to $z^{-n}$ in automatic control). Third, “conditioning” of an operation by a control dependence, similar to a conditional control structure in imperative languages, which allows the execution of alternative subgraphs. Delays combined with conditioning allow the programmer to specify the automata necessary for describing “mode changes”.
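The role of delays in keeping the dependence graph acyclic can be sketched with the standard-library `graphlib` module. In this simplified illustration a delay vertex is split into a "read" side, available at the start of an iteration, and a "write" side, updated at its end; the function `one_iteration_order`, the `.read`/`.write` naming and the example operations are assumptions, not the SynDEx data model.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def one_iteration_order(dependences, delays=()):
    """Return one execution order compatible with the data dependences.

    dependences: iterable of (producer, consumer) pairs of operation names.
    delays: names of delay vertices; each is split so that its output
    belongs to the current iteration and its input to the next one.
    Raises CycleError if a dependence cycle is not broken by a delay.
    """
    preds = {}
    for producer, consumer in dependences:
        if producer in delays:
            producer = producer + ".read"
        if consumer in delays:
            consumer = consumer + ".write"
        preds.setdefault(producer, set())
        preds.setdefault(consumer, set()).add(producer)
    return list(TopologicalSorter(preds).static_order())


# Hypothetical control loop: the filter reads and updates a state.
deps = [
    ("sensor", "filter"),
    ("filter", "state"),
    ("state", "filter"),
    ("filter", "actuator"),
]
order = one_iteration_order(deps, delays={"state"})
```

Calling `one_iteration_order(deps)` without declaring `state` as a delay raises `CycleError`, which mirrors the requirement that inter-iteration dependences go through delay vertices.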

The Architecture model is a directed graph, whose vertices are of two types: “processor” (one sequencer of operations and possibly several sequencers of communications) and “medium” (support of communications), and whose edges are directed connections.

The resulting implementation model [9] is obtained by an external compositional law, by which the architecture graph operates on the algorithm graph. Thus, the result of such a compositional law is an “architecture-aware” algorithm graph, corresponding to refinements of the initial algorithm graph obtained by computing spatial (distribution) and timing (scheduling) allocations of the operations onto the architecture graph resources. In that context “Adequation” refers to a search, within the solution space of resulting algorithm graphs labelled by timing characteristics, for one algorithm graph which satisfies the timing constraints and optimizes some criteria, usually the total execution time and the number of computing resources (but other criteria may exist). The next section describes distributed real-time schedulability analysis and optimization techniques for that purpose.

3.3.2. Distributed Real-Time Scheduling and Optimization

We address two main issues: uniprocessor and multiprocessor real-time scheduling, where constraints must be met, since missing them may have dramatic consequences (hard real-time), and where resources must be minimized because of the embedded nature of the systems.

In the case of uniprocessor real-time scheduling, besides the classical deadline constraint, often equal to the period, we take into consideration dependences between tasks and several latencies. The latter are complex, related “end-to-end” constraints. Dealing with multiple real-time constraints raises the complexity of the scheduling problems. Moreover, because preemption leads, at least, to a waste of resources due to its approximation in the WCET (Worst-Case Execution Time) of every task, as proposed by Liu and Layland [68], we first studied non-preemptive real-time scheduling with dependence, periodicity, and latency constraints. Although a bad approximation of the preemption cost may have dramatic consequences on real-time scheduling, there is little research on this topic. We have been investigating preemptive real-time scheduling for a few years, and we focus on the exact cost of preemption. We have integrated this cost in the schedulability conditions that we propose, and in the corresponding scheduling algorithms. More generally, we are interested in integrating in the schedulability analyses the cost of the RTOS (Real-Time Operating System), for which the cost of preemption is the most difficult part because it varies according to the instance (job) of each task.
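The effect of approximating the preemption cost inside the WCET can be illustrated with the classical sufficient utilization test for rate-monotonic scheduling with deadlines equal to periods, U <= n(2^(1/n) - 1). The sketch below is not the exact-cost schedulability condition developed by the team; the function name, the constant `overhead` parameter and the task set are assumptions chosen for the illustration.

```python
def rm_utilization_test(tasks, overhead=0.0):
    """Sufficient (not necessary) rate-monotonic schedulability test.

    tasks: list of (wcet, period) pairs, deadlines equal to periods.
    overhead: constant margin added to every WCET to approximate the
    cost of preemption (an assumption of this sketch).
    Returns (passes, utilization, bound).
    """
    n = len(tasks)
    utilization = sum((wcet + overhead) / period for wcet, period in tasks)
    bound = n * (2 ** (1 / n) - 1)
    return utilization <= bound, utilization, bound


# Hypothetical task set: (WCET, period) in the same time unit.
tasks = [(1, 4), (2, 6), (1, 10)]

without_margin = rm_utilization_test(tasks)
with_margin = rm_utilization_test(tasks, overhead=0.3)
```

With these figures the task set passes the test without margin and fails it once a fixed margin of 0.3 is added to each WCET, which shows how a coarse approximation can waste resources or reject a feasible system.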

In the case of multiprocessor real-time scheduling, we initially chose the partitioned approach rather than the global approach, since the latter allows task migrations whose cost is prohibitive for current commercial processors. The partitioned approach enables us to reuse the results obtained in the uniprocessor case in order to derive solutions for the multiprocessor case. We also consider the semi-partitioned approach, which allows only some migrations in order to minimize the overhead they involve. In addition to satisfying the multiple real-time constraints mentioned in the uniprocessor case, we have to minimize the total execution time (makespan), since we deal with automatic control applications involving feedback loops. Furthermore, the domain of embedded systems leads to resource minimization problems. Since these optimization problems are NP-hard, we develop exact algorithms (B & B, B & C), which are optimal for simple problems, and heuristics, which are sub-optimal, for realistic problems corresponding to industrial needs. Long ago we proposed a very fast “greedy” heuristic [8], whose results were regularly improved, and which was extended with local neighborhood heuristics or used to provide initial solutions for metaheuristics.
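A generic greedy list-scheduling heuristic gives an idea of how operations can be distributed and scheduled in one pass. This sketch is not the heuristic of [8]: it takes operations in a topological order and places each one on the processor where it finishes earliest, charging a fixed communication cost when a predecessor ran elsewhere. The function name, the uniform `comm_cost` and the example graph are assumptions.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def greedy_schedule(durations, preds, n_procs, comm_cost=0):
    """Place each operation on the processor giving the earliest finish.

    durations: operation -> execution time.
    preds: operation -> iterable of predecessor operations.
    Returns (placement, makespan), where placement maps each operation
    to a (processor, start, end) triple.
    """
    graph = {op: set(preds.get(op, ())) for op in durations}
    free_at = [0] * n_procs
    placed = {}
    for op in TopologicalSorter(graph).static_order():
        best = None
        for proc in range(n_procs):
            # Data from another processor arrives comm_cost later.
            ready = max(
                (placed[p][2] + (comm_cost if placed[p][0] != proc else 0)
                 for p in graph[op]),
                default=0,
            )
            start = max(ready, free_at[proc])
            end = start + durations[op]
            if best is None or end < best[2]:
                best = (proc, start, end)
        placed[op] = best
        free_at[best[0]] = best[2]
    makespan = max(end for _, _, end in placed.values())
    return placed, makespan


# Hypothetical fork-join algorithm graph.
durations = {"a": 2, "b": 3, "c": 3, "d": 1}
preds = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}

placement, makespan = greedy_schedule(durations, preds, n_procs=2, comm_cost=1)
_, sequential = greedy_schedule(durations, preds, n_procs=1)
```

Such a result is only a starting point: being greedy, it never revisits a placement, which is why local search or metaheuristics are typically applied afterwards.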

In addition to the spatial dimension (distributed) of the real-time scheduling problem, other important dimensions are the type of communication mechanisms (shared memory vs. message passing), or the source of control and synchronization (event-driven vs. time-triggered). We explore real-time scheduling on architectures corresponding to all combinations of the above dimensions. This is of particular impact in application domains such as automotive and avionics (see 4.2).

The arrival of complex hardware responding to the increasing demand for computing power in next generation systems exacerbates the limitations of the current worst-case real-time reasoning. Our solution to overcome these limitations is based on the fact that worst-case situations may have an extremely low probability of appearance within one hour of functioning ($10^{-45}$ [17]), compared for instance to the certification requirements ($10^{-9}$ for the highest level of certification in avionics). Thus we model and analyze the real-time systems using probabilistic models and we propose results that are fundamental for the probabilistic worst-case reasoning over a given time window.
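The intuition that worst-case combinations are rare can be illustrated with discrete execution-time distributions. The sketch below convolves the distributions of two jobs, assuming they are independent (a strong assumption that real analyses must justify), and computes the probability of exceeding a time budget. The function names and the numeric distributions are invented for the illustration and are not results from [17].

```python
from collections import defaultdict
from fractions import Fraction


def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent execution times.

    Each distribution maps an execution time to its probability.
    """
    out = defaultdict(Fraction)
    for time_a, prob_a in dist_a.items():
        for time_b, prob_b in dist_b.items():
            out[time_a + time_b] += prob_a * prob_b
    return dict(out)


def exceedance(dist, budget):
    """Probability that the execution time is strictly above `budget`."""
    return sum(p for t, p in dist.items() if t > budget)


# Hypothetical jobs: each has a rare, long execution path.
job_a = {2: Fraction(9, 10), 5: Fraction(1, 10)}
job_b = {3: Fraction(99, 100), 8: Fraction(1, 100)}

total = convolve(job_a, job_b)
p_over_10 = exceedance(total, 10)
```

In this example the deterministic worst case is 13 time units, yet a budget of 10 is exceeded only when both rare paths occur together, with probability 1/1000.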

4. Application Domains

4.1. Multicore System-on-Chip design

Synchronous formalisms and GALS or multicycle extensions are natural model representations of hardware circuits at various abstraction levels. They may compete with HDLs (Hardware Description Languages) at RTL and even TLM levels. The main originality of languages built upon these models is to be based on formal synthesis semantics, rather than mere simulation forms.

The flexibility in formal Models of Computation and Communication allows specification of modular Latency-Insensitive Designs, where the interconnect structure is built up and optimized around existing IP components, respecting some mandatory computation and communication latencies prescribed by the system architect. This allows a real platform view development, with component reuse and timing-closure analysis. The design and optimization of the interconnect fabric around IP blocks transform, at the modeling level, an (untimed) asynchronous version into a (scheduled) multicycle timed one.

Also, Network on Chip (NoC) design may call for computable switching patterns, just like computable scheduling patterns were used in (predictable) Latency-Insensitive Design. Here again formal models, such as Cyclo-static dataflow graphs and extended Kahn networks with explicit routing schemes, are modeling elements of choice for a real synthesis/optimization approach to the design of systems. New parallel architecture paradigms, such as GPU co-processors or Massive Parallel Processor Arrays (MPPA) form natural targets as NoC-based platforms.

Multicore embedded architecture platforms may be represented as Marte UML component diagrams. The semantics of concurrent applications may also be represented as Marte behavior diagrams embodying precise MoCCs. Optimized compilations/syntheses rely on specific algorithms, and are represented as model transformations and allocation (of application onto architecture).

Our current work thus aims primarily at providing Theoretical Computer Science foundations to this domain of multicore embedded SoCs, with efficient application to modeling, analysis and compilation wherever natural assumptions make it possible. We also deal with a comparative view of Esterel and SystemC TLM for more practical modeling, and with the relation between the Spirit IP-Xact interface standard in the SoC domain and its Marte counterpart.

4.2. Automotive and avionic embedded systems

Model-Driven Engineering is in general well accepted in the transportation domains, where design of digital software and electronic parts is usually tightly coupled with larger aspects of system design, where models from physics are being used already. The formalisms AADL (for avionics) and AutoSar [66] (for automotive) are providing support for this, unfortunately not always with a clean and formal semantics. Thus there is a strong need here for approaches that bring closer together formal methods and tools on the one hand, engineering best practices on the other hand.

From a structural point of view AUTOSAR succeeded in establishing a framework that provides significant confidence in the proper integration of software components from a variety of distinct suppliers. But beyond those structural (interface) aspects, dynamic and temporal views are becoming more of a concern, so that AUTOSAR has introduced the AUTOSAR Specification of Timing Extension. AUTOSAR (discrete) timing models consist of timing descriptions, expressed by events and event chains, and timing constraints that are imposed on these events and event chains.

An important issue in all such formalisms is to mix, in a single design framework, heterogeneous time models and tasks: tasks based on different timebases, with different triggering policies (event-triggered and time-triggered), periodic and/or aperiodic, and with distinct periods where applicable. Adequate modeling is a prerequisite to the process of scheduling and allocating such tasks onto complex embedded architectural platforms (see the AAA approach in foundation section 3.3). Only then can one devise powerful synthesis/analysis/verification techniques to guide designers towards optimized solutions.

Traceability is also an important concern, to close the gap between early requirements and constraints modelling on the one hand, verification and correct implementation of these constraints at the different levels of the development on the other hand.

5. Software and Platforms

5.1. TimeSquare

Participants: Charles André, Nicolas Chleq, Julien Deantoni, Frédéric Mallet [correspondent].

TimeSquare is a software environment for the modeling and analysis of timing constraints in embedded systems. It relies specifically on the Time Model of the MARTE UML profile (see section 3.2), and more accurately on the associated Clock Constraint Specification Language (CCSL) for the expression of timing constraints.

TimeSquare offers four main functionalities:

1. graphical and/or textual interactive specification of logical clocks and relative constraints between them;
2. definition and handling of user-defined clock constraint libraries;
3. automated simulation of concurrent behavior traces respecting such constraints, using a Boolean solver for consistent trace extraction;
4. call-back mechanisms for the traceability of results (animation of models, display and interaction with waveform representations, generation of sequence diagrams...).
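
As an illustration of the third functionality, the following minimal Python sketch enumerates, for a single simulation step, the sets of clocks that may tick together without violating a few stateless CCSL-style relations. It uses brute-force enumeration where TimeSquare uses a Boolean solver, and the clock names and helper functions (`subclock`, `excludes`, `admissible_steps`) are hypothetical: they are not part of the TimeSquare API.

```python
from itertools import product

def subclock(a, b):
    """'a isSubclockOf b': whenever a ticks, b ticks too."""
    return lambda step: (not step[a]) or step[b]

def excludes(a, b):
    """'a # b': a and b never tick in the same step."""
    return lambda step: not (step[a] and step[b])

def admissible_steps(clocks, constraints):
    """Return every non-empty set of clocks that may tick together."""
    result = []
    for values in product([False, True], repeat=len(clocks)):
        step = dict(zip(clocks, values))
        # Keep the step only if at least one clock ticks and no constraint is violated.
        if any(values) and all(check(step) for check in constraints):
            result.append({c for c in clocks if step[c]})
    return result

# Hypothetical example: three clocks and two constraints.
clocks = ["sensor", "filter", "actuator"]
constraints = [subclock("filter", "sensor"), excludes("filter", "actuator")]
steps = admissible_steps(clocks, constraints)
```

A simulator would pick one admissible step per instant according to some policy; stateful relations such as precedence would additionally carry a state from one step to the next.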

In practice TimeSquare is a plug-in developed with Eclipse modeling tools. The software is registered by the Agence pour la Protection des Programmes, under number IDDN.FR.001.170007.000.S.P.2009.001.10600. It can be downloaded from the site http://timesquare.inria.fr/. It has been integrated in the OpenEmbeDD ANR RNTL platform, and other such actions are under way.

5.2. K-Passa

Participants: Jean-Vivien Millo [correspondant], Robert de Simone.

This software is dedicated to the simulation, analysis, and static scheduling of Event/Marked Graphs, SDF and KRG extensions. A graphical interface allows the user to edit the Process Networks and their time annotations (latency, ...). Symbolic simulation and graph-theoretic analysis methods make it possible to compute and optimize static schedules, with best throughputs and minimal buffer sizes. In the case of KRG, the (ultimately k-periodic) routing patterns can also be provided and transformed for an optimal combination of switching and scheduling when channels are shared. K-Passa also allows for import/export of specific description formats such as UML-MARTE, to and from our other tool, TimeSquare.
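
To make the notion of best throughput concrete, here is a minimal Python sketch of the classical cycle-ratio bound for a timed marked graph: the firing rate cannot exceed, over every cycle, the number of tokens on the cycle divided by the sum of the latencies of its nodes. This is a textbook formulation under simplifying assumptions (small graphs, exhaustive cycle enumeration, strictly positive latencies); it is not K-Passa's actual algorithm, and the function names and the example graph are hypothetical.

```python
from fractions import Fraction

def simple_cycles(edges):
    """Enumerate simple cycles as lists of edge indices (small graphs only)."""
    nodes = sorted({e[0] for e in edges} | {e[1] for e in edges})
    out = {n: [] for n in nodes}
    for i, (src, _dst, _tokens) in enumerate(edges):
        out[src].append(i)
    cycles = []

    def dfs(start, node, path, seen):
        for i in out[node]:
            nxt = edges[i][1]
            if nxt == start:
                cycles.append(path + [i])
            elif nxt not in seen and nxt > start:
                # Only visit nodes larger than the start, so each cycle is found once.
                dfs(start, nxt, path + [i], seen | {nxt})

    for s in nodes:
        dfs(s, s, [], {s})
    return cycles

def throughput(latency, edges):
    """Minimum over all cycles of tokens / total latency; None if the graph is acyclic."""
    ratios = []
    for cycle in simple_cycles(edges):
        tokens = sum(edges[i][2] for i in cycle)
        time = sum(latency[edges[i][0]] for i in cycle)
        ratios.append(Fraction(tokens, time))
    return min(ratios) if ratios else None

# Hypothetical graph: edges are (source, destination, initial tokens).
latency = {"A": 1, "B": 2, "C": 3}
edges = [("A", "B", 0), ("B", "A", 2), ("B", "C", 0), ("C", "B", 1)]
rate = throughput(latency, edges)
```

In this example the cycle through B and C (one token, total latency 5) is the bottleneck, so adding tokens to the A-B cycle would not improve the rate.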

The tool was originally developed mainly to support experiments following our research results on the topic of Latency-Insensitive Design. This research was conducted and funded in part in the context of the CIM PACA initiative, with initial support from ST Microelectronics and Texas Instruments.

K-Passa is registered by the Agence pour la Protection des Programmes, under the number IDDN.FR.001.310003.000.S.P.2009.000.20700. It can be downloaded from the site http://www-sop.inria.fr/aoste/index.php?page=software/kpassa.

5.3. SynDEx

Participants: Maxence Guesdon, Yves Sorel [correspondant], Cécile Stentzel, Meriem Zidouni.

SynDEx is a system-level CAD software implementing the AAA methodology for rapid prototyping and for optimizing distributed real-time embedded applications. Developed in OCaml, it can be downloaded free of charge, under Inria copyright, from the general SynDEx site http://www.syndex.org.

The AAA methodology is described in section 3.3. Accordingly, SynDEx explores the space of possible allocations (spatial distribution and temporal scheduling), from application elements to architecture resources and services, in order to match real-time requirements; it does so by using schedulability analyses and heuristic techniques. Ultimately, it automatically generates distributed real-time code running on real embedded platforms. The last major release of SynDEx (V7) allows the specification of multi-periodic applications.

Application algorithms can be edited graphically as directed acyclic task graphs (DAG) where each edge represents a data dependence between tasks, or they may be obtained by translations from several formalisms such as Scicos (http://www.scicos.org), Signal/Polychrony (http://www.irisa.fr/espresso/Polychrony/download.php), or UML2/MARTE models (http://www.omg.org/technology/documents/profile_catalog.htm).

Architectures are represented as graphical block diagrams composed of programmable (processors) and non-programmable (ASIC, FPGA) computing components, interconnected by communication media (shared memories, links and busses for message passing). In order to deal with heterogeneous architectures, a diagram may feature several components of the same kind but with different characteristics.

Two types of non-functional properties can be specified for each task of the algorithm graph. First, a period that does not depend on the hardware architecture. Second, real-time features that depend on the different types of hardware components, such as execution and data transfer times, memory, etc. Requirements are generally constraints on deadline equal to period, latency between any pair of tasks in the algorithm graph, dependence between tasks, etc.

Exploration of alternative allocations of the algorithm onto the architecture may be performed manually and/or automatically. The latter is achieved by performing real-time multiprocessor schedulability analyses and optimization heuristics based on the minimization of temporal or resource criteria. For example, while satisfying deadline and latency constraints, they can minimize the total execution time (makespan) of the application on the given architecture, as well as the amount of memory. The results of each exploration are visualized as timing diagrams simulating the distributed real-time implementation.
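
The kind of automatic exploration described above can be illustrated by a generic greedy list-scheduling heuristic, sketched below in Python. It is only an assumption-laden illustration: it is not the SynDEx adequation heuristic, it ignores communication times, periods and memory, and the function name `list_schedule`, the task graph and the execution times are all hypothetical.

```python
def list_schedule(tasks, deps, wcet, procs):
    """Greedy list scheduling of a task DAG onto heterogeneous processors.

    tasks: task names in topological order
    deps:  task -> list of predecessor tasks
    wcet:  (task, processor) -> execution time on that processor
    Returns (schedule, makespan), with schedule[task] = (processor, start, finish).
    """
    free_at = {p: 0 for p in procs}   # date at which each processor becomes idle
    finish = {}
    schedule = {}
    for t in tasks:
        # A task may start only when all its predecessors have finished.
        ready = max((finish[d] for d in deps.get(t, [])), default=0)
        # Choose the processor giving the earliest completion date (first one on ties).
        best = min(procs, key=lambda p: max(ready, free_at[p]) + wcet[(t, p)])
        start = max(ready, free_at[best])
        finish[t] = start + wcet[(t, best)]
        free_at[best] = finish[t]
        schedule[t] = (best, start, finish[t])
    return schedule, max(finish.values())

# Hypothetical diamond-shaped algorithm graph on two processors.
tasks = ["a", "b", "c", "d"]
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
procs = ["P1", "P2"]
wcet = {("a", "P1"): 2, ("a", "P2"): 3,
        ("b", "P1"): 3, ("b", "P2"): 3,
        ("c", "P1"): 2, ("c", "P2"): 4,
        ("d", "P1"): 1, ("d", "P2"): 1}
schedule, makespan = list_schedule(tasks, deps, wcet, procs)
```

The resulting `schedule` dictionary contains exactly the information that a timing diagram would display: one (processor, start, finish) triple per task.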

Finally, real-time distributed embedded code can be automatically generated for dedicated distributed real-time executives, possibly calling services of resident real-time operating systems such as Linux/RTAI or Osek. These executives are deadlock-free and based on off-line scheduling policies. Dedicated executives induce minimal overhead, and are built from processor-dependent executive kernels. To date, executive kernels are provided for: TMS320C40, PIC18F2680, i80386, MC68332, MPC555, i80C196 and Unix/Linux workstations. Executive kernels for other processors can be developed at reasonable cost by following these examples as patterns.

5.4. Lopht

Participants: Thomas Carle, Manel Djemal, Zhen Zhang, Dumitru Potop Butucaru [correspondant].

Lopht (Logical to Physical Time Compiler) has been designed as an implementation of the AAA methodology. Like SynDEx, Lopht relies on off-line allocation and scheduling techniques to allow real-time implementation of dataflow synchronous specifications onto multiprocessor systems. But it has two significant original features: a stronger focus on efficiency (without compromising correctness), and a focus on novel target architectures (many-core chips and time-triggered embedded systems).

Improved efficiency is attained through the use of classical and novel data structures and optimization algorithms pertaining to three fields: synchronous language compilation, classical compiler theory, and real-time scheduling. A finer representation of execution conditions allows us to make better use of double resource reservation and thus improve latency and throughput. The use of software pipelining improves computation throughput. The use of post-scheduling optimizations reduces the number of preemptions. The focus on novel architectures means that architecture descriptions need to define novel communication media such as networks-on-chips (NoCs), and that real-time characteristics must include those specific to a time-triggered execution model, such as the Major Time Frame (MTF).

Significant contributions to the Lopht tool have been brought by T. Carle (the extensions concerning time-triggered platforms), M. Djemal (the extensions concerning many-core platforms), and Zhen Zhang under the supervision of D. Potop Butucaru. The tool has been used and extended during the PARSEC project. It is currently used in the IRT SystemX/FSF project, in the collaboration with Astrium Space Transportation (Airbus Defence and Space), and in the collaboration with Kalray SA. It has been developed in OCaml.

5.5. SAS

Participants: Daniel de Rauglaudre [correspondant], Yves Sorel.

The SAS (Simulation and Analysis of Scheduling) software allows the user to perform the schedulability analysis of periodic task systems in the monoprocessor case.

The main contribution of SAS, when compared to other commercial and academic software of the same kind, is that it takes into account the exact preemption cost between tasks during the schedulability analysis. Besides the usual real-time constraints (precedence, strict periodicity, latency, etc.) and fixed-priority scheduling policies (Rate Monotonic, Deadline Monotonic, Audsley++, User priorities), SAS additionally allows the selection of dynamic scheduling policies such as Earliest Deadline First (EDF). The resulting schedule is displayed as a typical Gantt chart with a transient and a permanent phase, or as a disk shape called "dameid", which clearly highlights the idle slots of the processor in the permanent phase.

For a schedulable task system under EDF, when the exact preemption cost is considered, the period of the permanent phase may be much longer than the least common multiple (LCM) of the periods of all tasks, which is the value usually found in traditional scheduling theory. A specific effort has been made to improve the display in this case. The classical utilization factor, the permanent exact utilization factor, the preemption cost in the permanent phase, and the worst response time for each task are all displayed when the system is schedulable. The response times of each task over time can also be displayed (separately).
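
For reference, the classical quantities mentioned above can be computed as in the following Python sketch. It implements the textbook analysis for fixed-priority preemptive tasks with deadlines equal to periods and zero preemption cost, which is precisely the simplification that SAS removes; it is therefore not the SAS analysis, and the function names and the task set are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def hyperperiod(tasks):
    """Least common multiple of the periods; tasks are (execution time, period) pairs."""
    return reduce(lambda a, b: a * b // gcd(a, b), (t for _, t in tasks))

def utilization(tasks):
    """Classical utilization factor: sum of C_i / T_i."""
    return sum(Fraction(c, t) for c, t in tasks)

def response_time(tasks, i):
    """Worst response time of tasks[i], with tasks sorted by decreasing priority.

    Fixed-point iteration R = C_i + sum over higher-priority j of ceil(R / T_j) * C_j.
    Returns None if the response time exceeds the period (deadline miss).
    """
    c_i, t_i = tasks[i]
    r = c_i
    while r <= t_i:
        nxt = c_i + sum(-(-r // t) * c for c, t in tasks[:i])  # -(-r // t) is ceil(r / t)
        if nxt == r:
            return r
        r = nxt
    return None

# Hypothetical task set, highest priority first (Rate Monotonic order).
tasks = [(1, 4), (2, 6), (3, 12)]
responses = [response_time(tasks, i) for i in range(len(tasks))]
```

When a nonzero cost is charged for each preemption, neither the utilization factor nor the LCM-based permanent phase computed this way remains exact, which is the case SAS is designed to handle.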

SAS is written in OCaml, using Camlp5 (a syntactic preprocessor) and OLIBRT (a graphic toolkit under X). Both are written by Daniel de Rauglaudre. It can be downloaded from the site http://pauillac.inria.fr/~ddr/sas-dameid/.

6. New Results

6.1. Process Networks with routing for parallel architectures

Participants: Robert de Simone, Emilien Kofman, Jean-Vivien Millo.

In the past we developed a dedicated Process Network (PN) formalism with explicit static switching/routing schemes for data flow. This year we considered the practical use of our formalism to model data-streams in specific applicative contexts.

In a first direction we considered the case of stencil algorithms, usually modeled with cellular automata (CA) (as in heat or gas propagation models for instance). In that case, the application itself is modeled in a way strongly similar to a physical architecture consisting of a regular mesh/array of parallel processors (MPPA). Mapping can then seem straightforward, except that the neighborhood and connection topology may differ between the CA model and the MPPA. Our results consider efficient routing and propagation schemes on a given MPPA interconnect fabric, so as to match all-to-all broadcast patterns up to a given distance (on the CA topology). They are described in [20], and were implemented on the Kalray MPPA256 prototype architecture.

A similar modeling effort was conducted, this time on FFT algorithm models (again described as parallel pipelined tasks). Again, switching/routing schemes were provided in our formal PN model to map the virtual logical dependences onto concrete connection patterns in an MPPA256 model. This was the subject of Emilien Kofman's internship, of which preliminary results were presented in a junior workshop [36].

6.2. Formal analysis of MARTE Time Model and CCSL

Participants: Frédéric Mallet, Robert de Simone, Yuliia Romenska, Jean-Vivien Millo, Ling Yin.

We have worked on building analysis methods and tools for running exhaustive analyses on MARTE/CCSL specifications. This was done by endowing CCSL with a state-based semantics [51]. Each operator is described as a Boolean state machine; some operators require an infinite number of states. When this is the case, we rely on a lazy representation technique to capture symbolically the infinite number of states [45]. The semantics of a CCSL specification is then expressed as the synchronized product of the (possibly infinite) state machines for each operator. Even though some operator state machines are infinite, their composition can sometimes be bounded. When the synchronized product has only a finite number of reachable states, it is said to be safe. We have identified a set of representative and frequently used examples where this is the case [38]. When the product is not finite, our (semi-)algorithm to build the product does not terminate; it is therefore important to know in advance whether or not the product is safe. We have thus proposed an algorithm to decide whether a CCSL specification is safe [37]. It relies on an intermediate representation called Clock Causality Graph and uses results from marked graph theory.
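
The difference between an unbounded operator and a bounded product can be sketched in a few lines of Python. In this hedged illustration, each constraint is a function from (state, step) to the next state, or `None` when the step is forbidden; states are created lazily during exploration, and a cap on the number of states stands in for non-termination. The encoding, the function names and the cap are assumptions made for the example; they do not reproduce the algorithms of [45] or [37].

```python
from itertools import product

def a_precedes_b(n, step):
    """Strict precedence 'a < b': n counts ticks of a not yet matched by b."""
    if step["b"] and n == 0:
        return None                      # b may not tick before a
    return n + step["a"] - step["b"]

def b_precedes_next_a(m, step):
    """'b < a delayed by one': a may tick again only after b has caught up."""
    if step["a"] and m == 0:
        return None
    return m + step["b"] - step["a"]

def explore(clocks, constraints, init, max_states=1000):
    """Reachable states of the synchronized product, or None if the cap is hit."""
    seen, frontier = {init}, [init]
    while frontier:
        state = frontier.pop()
        for values in product([False, True], repeat=len(clocks)):
            if not any(values):
                continue                 # ignore the empty step
            step = dict(zip(clocks, values))
            nxt = []
            for constraint, local in zip(constraints, state):
                result = constraint(local, step)
                if result is None:
                    break                # step rejected by this constraint
                nxt.append(result)
            else:
                nxt = tuple(nxt)
                if nxt not in seen:
                    if len(seen) >= max_states:
                        return None      # too many states: safety not established
                    seen.add(nxt)
                    frontier.append(nxt)
    return seen

alone = explore(["a", "b"], [a_precedes_b], (0,), max_states=50)
alternation = explore(["a", "b"], [a_precedes_b, b_precedes_next_a], (0, 1))
```

Precedence alone lets a run arbitrarily far ahead of b, so the exploration hits the cap, whereas the product of the two precedences (an alternation of a and b) has only two reachable states.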

Building the product for a CCSL specification is exponential in the number of clocks and is not practical for large specifications. So, to avoid building explicitly the product we have proposed another technique to explore symbolically the state-space of a CCSL specification [49]. This relies on a liveness condition where no conflict may prevent an infinite clock from ticking infinitely often. Branches that may lead to states where an infinite clock dies are pruned by a fix-point algorithm.

These two solutions focus on the logical and discrete aspects of MARTE/CCSL, which was devised to unify logical and physical time constraints. An attempt to support verification of the physical time constraints of MARTE/CCSL was conducted through the use of UppAal timed automata and model-checker [46]. The proposed technique combines the logical clocks of CCSL with the real-valued clocks of timed automata. Synchronous/polychronous aspects are solved with TimeSquare (section 5.1), while the UppAal model-checker is used to explore the space derived from the real-valued clocks.

6.3. Logical time in Model-Driven Engineering of embedded systems

Participants: Frédéric Mallet, Julien Deantoni, Robert de Simone, Marie-Agnès Peraldi Frati, Matias Varal Larsen, Arda Goknil.

In the context of our approach based on logical time to specify causalities and synchronizations on models (see section 3.2), we developed an extension of the OMG Object Constraint Language (OCL). Named ECL (Event Constraint Language), it provides such specifications of causality and synchronization at the syntactic language level, which then enables the automatic generation of semantic logical time constraints for any model that conforms to the language. This year, we addressed a new challenge: using logical time constraints to coordinate models of several distinct languages used jointly to describe a large heterogeneous system. This work is reported in [25], [52].

It was illustrated in practice in the automotive domain by coordinating together the Timed Augmented Description Language (TADL2) and the EAST-ADL language [34], [32] (the formalisms are rather similar, but still with clear distinctions at places).

Finally, we proposed a pattern to assemble the (possibly concurrent) semantics of a language, associating our logical time constraints (based on pure clocks) with a syntactic action language (providing behavior content). By reifying events and constraints, this specification of the semantics is amenable to composition [25]. Such an approach has recently been used again, in a first attempt to coordinate distinct behavioral models [47].

As part of our collaboration in the DAESD associated-team with ECNU Shone-SEI in Shanghai we studied the coupling of discrete-logical with continuous-physical time models, ending with a proposal of Hybrid MARTE statecharts [19] specified in a style much like a combination of MARTE state diagrams and timed automata.

In another setting we presented a new model of scenarios [21], dedicated to the specification and verification of system behaviours in the context of software product lines (SPL). The formalism uses the logical time modeling approach, with a strong link to synchronous semantics. We draw our inspiration from some techniques that are mostly used in the hardware community, and we show how they could be applied to the verification of software components and product line variability. We point out the benefits of synchronous languages and models to bridge the gap between both worlds.

6.4. Multiview modeling and power intent in Systems-on-chip

Participants: Carlos Gomez Cardenas, Ameni Khecharem, Emilien Kofman, Frédéric Mallet, Julien Deantoni, Robert de Simone.

Power models for embedded architectures (where power consumption is highly constrained) provide an ideal example of a non-functional modeling framework with strong interactions with the functional and performance models: more speed in computation comes at the cost of larger energy consumption. There was also a demand for a framework allowing the combination of models, each representing a distinct view of the system. We demonstrated as part of the HeLP ANR project (see 8.2.1.1), followed by the newly started HOPE ANR project (see 8.2.1.2), how such multiview modeling could be done, and how it could be connected down to more concrete simulation code or models, as in SystemC, Docea Power AcePlorer, or Scilab code. The multiview modeling applied to power intent and power managers was described in [35], and led to the PhD defense of Carlos Gomez Cardenas in December 2013 [16].

6.5. Performance variability analysis on manycore architectures

Participants: Sid Touati, Amin Oueslati, Franco Pestarini, Robert de Simone, Emilien Kofman.

In the context of the collaboration with Kalray (see 7.1.1), we conducted a systematic benchmarking campaign to test the stability (or low variability) of the performance of the MPPA256 prototype manycore processor. We first addressed issues of memory access and network latency, then programmed a distributed version of the classical ALL_PAIRS_SHORTEST_PATH parallel algorithm in a hybrid OpenMP/MPI style. This was the object of Amin Oueslati's Master 2 internship. Results were encouraging, and showed stability of performance over a large set of runs.

This work is currently being extended during the International Internship grant of Franco Pescarini. Specific on-chip communication modes offered by the MPPA256 processor (namely portal and channel communication modes) are being extensively benchmarked. Results show time predictability in the case of light on-chip communication traffic, but stability degrades, along with performance, in the presence of heavy traffic and congestion (different runs show quite different execution times).

In another effort, conducted during the internship of Emilien Kofman, we carried out an experiment on the MPPA256 quite similar to the work conducted as part of the collaboration with Kontron (see 7.1.3), exploring various mapping options for FFT algorithm variants, with the goal of determining how best to map (in the future) several such algorithms onto the computation fabric of the available many-cores.

6.6. Off-line (static) mapping of real-time applications onto NoC-based many-cores

Participants: Thomas Carle, Manel Djemal, Dumitru Potop Butucaru, Robert de Simone, Zhen Zhang.

Modern computer architectures are increasingly relying on multi-processor systems-on-chips (MPSoCs, also called chip-multiprocessors), with data transfers between cores and RAM banks managed by on-chip networks (NoCs). This reflects in part a convergence between embedded, general-purpose PC, and high-performance computing (HPC) architecture designs. In past years we have identified and compared the hardware mechanisms supporting precise timing analysis and efficient resource allocation in existing NoCs. We determined that the NoC should ideally provide the means of enforcing a global communications schedule that is computed off-line and which is synchronized with the scheduling of computations on CPU cores (and we have built such a NoC).

This year we have focused on the problem of mapping applications onto NoC-based MPSoCs (discussed in this section) and on the associated problem of timing analysis of the resulting parallel implementations (discussed in section 6.7). On-chip networks used in MPSoCs pose significant challenges to both on-line and off-line real-time scheduling approaches. They have large numbers of potential contention points, have limited internal buffering capabilities, and network control operates at the scale of small data packets. Therefore, precise schedulability analysis requires scalable algorithms working on hardware models with a level of detail that is unprecedented in real-time scheduling.

We considered an off-line scheduling approach, and we targeted massively parallel processor arrays (MPPAs), which are MPSoCs with large numbers (hundreds) of processing cores. We proposed a novel allocation and scheduling method capable of synthesizing such global computation and communication schedules covering all the execution, communication, and memory resources in an MPPA. To allow an efficient use of the hardware resources, our method takes into account the specificities of MPPA hardware and implements advanced scheduling techniques such as pre-computed preemption of data transmissions and pipelined scheduling.

Our method has been implemented within the Lopht tool presented in section 5.4, and first results are presented in [54]. One of the objectives of the collaboration with Kalray SA is the evaluation of the possibility of porting Lopht onto the Kalray MPPA platform.

6.7. WCET estimation for parallel code

Participant: Dumitru Potop Butucaru.

This is joint work with Isabelle Puaut, Inria, EPI ALF.

Classical timing analysis techniques for parallel code isolate micro-architecture analysis from the analysis of synchronizations between cores by performing them in two separate analysis phases (WCET, worst-case execution time, and WCRT, worst-case response time, analyses). This isolation has its advantages, such as a reduction of the complexity of each analysis phase, and a separation of concerns that facilitates the development of analysis tools. But isolation also has a major drawback: a loss in precision which can be significant. To consider only one aspect: to be safe, the WCET analysis of each synchronization-free sequential code region has to consider an undetermined micro-architecture state. This may result in overestimated WCETs, and consequently in pessimistic execution time bounds for the whole parallel application.

The contribution of this work [56], [44] is an integrated WCET analysis approach that considers at the same time micro-architectural information and the synchronizations between cores. This is achieved by extending a state-of-the-art WCET estimation technique and tool to manage synchronizations and communications between the sequential threads running on the different cores. The benefits of the proposed method are twofold. On the one hand, the micro-architectural state is not lost between synchronization-free code regions running on the same core, which results in tighter execution time estimates. On the other hand, only one tool is required for the temporal validation of the parallel application, which reduces the complexity of the timing validation toolchain.

Such a holistic approach is made possible by the use of deterministic and composable software and hardware architectures (many-cores with no cache sharing and time-predictable interconnect, static assignment of the code and data to the memory banks). Such code can be written by hand or automatically synthesized using the Lopht tool (section 5.4) or other automatic parallelization techniques.

6.8. Real-time scheduling and code generation for time-triggered platforms

Participants: Thomas Carle, Raul Gorcitz, Dumitru Potop Butucaru, Yves Sorel.

We have continued this year the work on real-time scheduling and code generation for time-triggered platforms. This work was mainly carried out as part of a bilateral collaboration with Astrium Space Transportation (now part of Airbus Defence and Space), which co-funded with the CNES the post-doctorate of Raul Gorcitz (started in September).

The work focused this year on the improvement of the real-time scheduling and code generation (the PhD work of T. Carle), and on determining their adequacy to Astrium’s industrial needs (the post-doc of Raul Gorcitz). We have improved our specification, mapping, and code generation technique at all levels. We have extended the Lopht tool to allow automatic mapping and code generation for single-processor and multi-processor partitioned targets (using an ARINC 653-compliant OS).

6.9. Uniprocessor Real-Time Scheduling

Participants: Yves Sorel, Falou Ndoye, Daniel de Rauglaudre.

6.9.1. Formal Proofs of Uniprocessor Real-Time Scheduling Theorems

We continued writing a monograph about three formal proofs, done in 2011/2012, in Coq on scheduling of fixed priority real-time preemptive tasks: one about the scheduling conditions of strict periodicity and two about the worst response time in the case of preemptive deadline monotonic scheduling. This document contains about 120 pages for the moment.

6.9.2. Real-Time Scheduling with Exact Preemption Cost

We proposed a new schedulability condition for dependent tasks executed on a uniprocessor which takes into account the exact preemption cost. Unlike the work presented in [10], which achieves that goal only for fixed-priority tasks, our schedulability condition considers fixed-priority as well as dynamic-priority tasks. Thus, we can overcome the priority inversions caused by data-dependent tasks. The schedulability analysis based on this schedulability condition led to an off-line scheduler [42] described by a scheduling table. We have therefore proposed an on-line time-triggered scheduler which implements this scheduling table. Compared to classical on-line schedulers, the proposed approach has two benefits. On the one hand, the cost of the task selection amounts only to reading, in the scheduling table built off-line, the task to be executed, rather than running a scheduling algorithm such as RM, DM, EDF, etc. on-line. On the other hand, this cost is fixed since it does not depend on the number of ready tasks. In addition, with our on-line scheduler we do not need to synchronize, on-line, the use of the shared memory data due to dependences, because this synchronization is performed during the off-line schedulability analysis.
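
The table-driven selection described above can be sketched as follows in Python. This is a minimal illustration assuming discrete time units and a table that repeats with a fixed cycle; the table contents and the names `build_slots` and `dispatch` are hypothetical, and the sketch does not reproduce the scheduler of [42].

```python
def build_slots(table, cycle):
    """Expand an off-line scheduling table into one entry per time unit.

    table: list of (start, end, task) with 0 <= start < end <= cycle.
    A preempted task simply appears in several entries.
    """
    slots = [None] * cycle               # None marks an idle time unit
    for start, end, task in table:
        for t in range(start, end):
            if slots[t] is not None:
                raise ValueError(f"two tasks scheduled at time {t}")
            slots[t] = task
    return slots

def dispatch(slots, now):
    """Task selection is a single array read, whatever the number of ready tasks."""
    return slots[now % len(slots)]

# Hypothetical table over a cycle of 8 time units.
table = [(0, 1, "t1"), (1, 3, "t2"), (4, 5, "t1"), (5, 8, "t3")]
slots = build_slots(table, cycle=8)
```

All scheduling decisions, including preemptions, are taken when the table is built; the on-line part only reads it.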

6.10. Multiprocessor Real-Time Scheduling

Participants: Yves Sorel, Laurent George, Dumitru Potop-Butucaru, Falou Ndoye, Aderraouf Benyahia, Cécile Stentzel, Meriem Zidouni.

6.10.1. Multiprocessor Partitioned Scheduling with Exact Preemption Cost

We finalized the work started in previous years on multiprocessor scheduling of preemptive independent real-time tasks with exact preemption cost [43].

This year we proposed a heuristic for the multiprocessor scheduling of preemptive dependent real-time tasks with exact preemption cost. We chose the partitioned approach, which avoids migration of tasks and allows the use of the previously proposed uniprocessor schedulability condition that takes into account the exact preemption cost. In addition, this schedulability condition takes into account the inter-processor communications and guarantees that no data is lost. The result of the off-line scheduling provided by the heuristic is a scheduling table for every processor, which also includes inter-processor communication tasks. We compared our multiprocessor scheduling heuristic with an exact Branch & Bound algorithm using the same schedulability condition. Our heuristic provides similar results and is much faster.
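
The partitioned approach can be illustrated by a generic first-fit heuristic parameterized by a uniprocessor schedulability test, as in the Python sketch below. It is not the heuristic proposed here: the test used in the example is the plain EDF utilization bound for independent tasks, standing in (as an assumption) for the condition with exact preemption cost and communications, and all names and task parameters are hypothetical.

```python
from fractions import Fraction

def edf_utilization_test(tasks):
    """Stand-in uniprocessor test: total utilization at most 1 (independent tasks, EDF)."""
    return sum(Fraction(c, t) for _name, c, t in tasks) <= 1

def first_fit(tasks, n_procs, schedulable):
    """Assign each task to the first processor that remains schedulable.

    tasks: list of (name, execution time, period).
    Returns one task list per processor, or None if some task cannot be placed.
    """
    partition = [[] for _ in range(n_procs)]
    # Place the most demanding tasks first (decreasing utilization).
    for task in sorted(tasks, key=lambda t: Fraction(t[1], t[2]), reverse=True):
        for assigned in partition:
            if schedulable(assigned + [task]):
                assigned.append(task)    # no migration: the task stays on this processor
                break
        else:
            return None
    return partition

tasks = [("t1", 2, 4), ("t2", 3, 6), ("t3", 1, 4), ("t4", 2, 8)]
partition = first_fit(tasks, 2, edf_utilization_test)
names = [[name for name, _c, _t in proc] for proc in partition]
```

Because the schedulability test is a parameter, a more precise uniprocessor condition can be substituted without changing the partitioning loop.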

6.10.2. Multiprocessor Semi-Partitioned Mixed Criticality Scheduling

We mainly focused on the mixed criticality scheduling problem applied to semi-partitioned scheduling, considering a static pattern of migration for jobs. We have studied this problem in the context of Mixed Criticality (MC) scheduling, a promising approach that can be used to take into account applications of different criticality levels on the same platform. The goal of the MC approach is to better utilize computing resources by allowing low criticality tasks to execute in conjunction with high criticality tasks when the system criticality is not high.

6.10.3. Gateway with Modeling Languages for Certified Code Generation

This work was carried out in the P FUI project (see 8.2.2). We defined a SynDEx UML profile for functional specifications. We developed a gateway between the P pivot formalism and SynDEx. This gateway deals with the data-flow modeling part of the P formalism, which is compliant with the subset of Simulink blocks supported by the P project, except for the IF, FOR, MERGE and MUX blocks. We are currently enhancing the gateway to include these blocks, and we are collaborating with the other partners to define the architectural part of the P formalism. This part is intended to replace the non-functional specifications, presently described with the UML profile MARTE (Modeling and Analysis of Real-Time Embedded Systems).

6.10.4. SynDEx updates with new results

We released an alpha version of SynDEx V8. This version is based on a new textual language whose compiler may be launched from the command line with various options. In SynDEx V8, the adequation heuristic, which performs the multiprocessor real-time schedulability analysis on multi-periodic applications, is based on the theorems and algorithms provided in Mohamed Marouf's thesis, defended last year in the team. These algorithms have been substantially improved for better consideration of data dependencies in the case of multiprocessor architectures. In addition, the new heuristic generates a scheduling table composed of the usual permanent phase and a transient phase that takes into account the distribution constraints defined by the user, in multi-periodic as well as in mono-periodic applications.

6.11. Probabilistic Real-Time Systems

Participants: Liliana Cucu-Grosjean, Adriana Gogonel, Codé Lo, Dorin Maxim, Cristian Maxim.

The advent of complex hardware, in response to the increasing demand for computing power in next-generation systems, exacerbates some of the limitations of static timing analysis for the estimation of the worst-case execution time (WCET). In particular, it increases the effort of acquiring (1) detailed information on the hardware to develop an accurate model of its execution latency, as well as (2) knowledge of the timing behaviour of the program in the presence of varying hardware conditions, such as those dependent on the history of previously executed instructions. These problems are also known as the timing analysis walls. Probabilistic timing analysis, a novel approach to the analysis of the timing behaviour of next-generation real-time embedded systems, provides answers to the timing analysis walls; the work in [17], [48], [31] uses it to attack these walls. We have also presented experimental evidence that shows how probabilistic timing analysis reduces the extent of knowledge about the execution platform required to produce probabilistically safe and tight WCET estimations. Based on existing estimations of WCET or minimal inter-arrival time, one may propose different probabilistic schedulability analyses [39]. These results were reported in the PhD thesis of Dorin Maxim, mostly conducted in the Inria TRIO team (before its completion and the move to Aoste in Sept 2013). 2013 was also the year when, through several invited talks [26], [28], [27], we had the opportunity to underline historical misunderstandings on probabilistic real-time systems. The most common one relates to the notion of independence, which is used with a wrong meaning in various papers.
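
The role played by independence can be made concrete with the following Python sketch, which combines two discrete execution time distributions by convolution and computes an exceedance probability. The convolution is valid only if the two execution times are probabilistically independent, which is exactly the assumption that must be justified; the distributions and function names are hypothetical, and the sketch does not reproduce the analyses of [39].

```python
from collections import defaultdict
from fractions import Fraction

def convolve(x, y):
    """Distribution of X + Y for two INDEPENDENT discrete random variables.

    Each distribution maps a value (execution time) to its probability.
    """
    out = defaultdict(Fraction)          # missing values start at probability 0
    for vx, px in x.items():
        for vy, py in y.items():
            out[vx + vy] += px * py      # product rule: requires independence
    return dict(out)

def exceedance(dist, bound):
    """Probability that the value exceeds the given bound."""
    return sum(p for v, p in dist.items() if v > bound)

# Hypothetical execution time distributions of two jobs run back to back.
job_a = {2: Fraction(9, 10), 5: Fraction(1, 10)}
job_b = {1: Fraction(1, 2), 3: Fraction(1, 2)}
total = convolve(job_a, job_b)
risk = exceedance(total, 5)
```

If the execution time of the second job depended on that of the first (for instance through the cache state), the product rule in `convolve` would no longer apply and the resulting distribution could be wrong.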

7. Bilateral Contracts and Grants with Industry

7.1. Bilateral Contracts with Industry

7.1.1. Kalray MPPA256 experiments

As part of a larger collaborative programme between Inria and this company, new experimental machines equipped with the Kalray MPPA256 manycore processor were provided to a small number of Inria teams. The processor itself consists of 16 processing clusters, each itself a 16-core processor (hence 256 cores altogether). The clusters are connected by an on-chip network, and the whole architecture (driven by a host, off-chip main CPU) may be programmed according to several computation models, some quite close to the MoCCs considered in our research. Part of this 10-month contract was meant to fund two internships, in our case on:

- The evaluation of performance (and most of all performance variability) of the various parts of the chip (in the Sophia Antipolis branch of the team). Results are discussed in section 6.5.
- The evaluation of the possibility of code generation for the MPPA256 platform using the Lopht tool described in sections 5.4, 6.6.

7.1.2. Astrium/CNES PostDoc

Astrium Space Transportation (now part of Airbus Defence and Space) asked us if we could provide automatic methods for the design and implementation of embedded software and system/network configuration in an aerospace context. The objective is to reduce the design and validation costs (especially in case of system evolutions), while preserving an assurance level superior to that of the Ariane 5 flight program. We are exploring automation of the real-time allocation, scheduling, and code generation using the novel algorithms developed and implemented in the Lopht tool. The post-doctoral position of Raul Gorcitz was funded on this contract.

7.1.3. Kontron CIFRE

This contract provides us with the means to partially support the PhD thesis of Mohamed Bergach (who is physically at Kontron Toulon most of the time). The topic is to study how to efficiently implement various sizes of the FFT (Fast Fourier Transform) algorithm on multicore and GP-GPU architectures from the range of processors used at Kontron, in order to understand, in a second phase, how to best allocate several such algorithms in parallel as part of a single application (regarding performance, but also power consumption and thermal constraints).

8. Partnerships and Cooperations

8.1. Regional Initiatives

8.1.1. CIM PACA Design Platform

Participants: Robert de Simone, Ameni Khecharem, Carlos Gomez Cardenas, Emilien Kofman.

This ambitious regional initiative is intended to foster collaborations between local PACA industry and academic partners on the topics of microelectronic design, through the pooling of equipment, resources and R&D concerns. We are active in the Design Platform (one of three platforms), of which Inria is a founding member. This provides opportunities for interactions with local companies, leading indirectly to more formal collaborations at times. Phase 3 of the CIM PACA programme should be launched in 2014, and was the subject of extensive preparation at the end of 2013.

The ANR HOPE project (see 8.2.1.2) is conducted under the auspices of the CIM PACA Design Platform, which also hosts prototype and commercial software products contributed by project members (Synopsys, Docea Power, and Magillem). Similarly, the CLISTINE FUI project was recently accepted and is supported by the platform.

8.2. National Initiatives

8.2.1. ANR

8.2.1.1. HeLP

Participants: Carlos Gomez Cardenas, Ameni Khecharem, Robert de Simone, Jean-Vivien Millo.

The ANR HeLP project dealt with joint modeling of functional behavior and energy consumption for the design of low-power heterogeneous SoCs. Industrial partners were ST Microelectronics and Docea Power (SME); academic partners were Inria, UNS (UMR LEAT), and VERIMAG (coordinator). Our goal in this project was twofold: first, to combine SoC modeling with temporal behavior and logical time with energy/power modeling as extra annotations on MARTE models; second, to link the modeling abilities of MARTE with those of the domain-specific standard IP-XACT.

The project ended in April 2013, with some of its findings taken up and extended in the more recent ANR project HOPE.

8.2.1.2. HOPE

Participants: Carlos Gomez Cardenas, Ameni Khecharem, Emilien Kofman, Robert de Simone.

The ANR HOPE project focuses on hierarchical aspects for the high-level modeling and early estimation of power management techniques, with potential synthesis in the end if feasible.

The PhD defense of Carlos Gomez Cardenas was held in Dec 2013 [16], in strong connection with the project (as a follow-up of HeLP).

Although this project was officially started in November, it was in part postponed due to the replacement of a major partner (Texas Instruments) by another one (Intel). Current partners are CNRS/UNS UMR LEAT, Intel, Synopsys, Docea Power, Magillem, and ourselves.

8.2.1.3. GeMoC

Participants: Matias Vara Larsen, Julien Deantoni, Frédéric Mallet.

This project is administratively handled by CNRS for our joint team, on the UMR I3S side. Partners are Inria (Triskell EPI), ENSTA-Bretagne, IRIT, Obéo, Thales TRT.

The project focuses on the modeling of heterogeneous systems using Models of Computation and Communication for embedded and real-time systems, described using generic means of MDE techniques (and in our case the MARTE profile, and most specifically its Time Model, which allows precise timing constraints to be specified for the definition of operational semantics).

8.2.2. FUI

8.2.2.1. FUI P

Participants: Abderraouf Benyahia, Dumitru Potop Butucaru, Yves Sorel.

The goal of project P is to support the model-driven engineering of high-integrity embedded real-time systems by providing an open code generation framework able to verify the semantic consistency of systems described using safe subsets of heterogeneous modeling languages, then to generate optimized source code for multiple programming (Ada, C/C++) and synthesis (VHDL, SystemC) languages, and finally to support a multi-domain (avionics, space, and automotive) certification process by providing open qualification material. Modeling languages range from behavioural to architectural languages and have synchronous and asynchronous semantics (Simulink/Matlab, Scicos, Xcos, SysML, MARTE, UML). See also: http://www.open-do.org/projects/p/

Partners of the project are: industrial partners (Airbus, Astrium, Continental, Rockwell Collins, Safran, Thales), SMEs (AdaCore, Altair, Scilab Enterprise, STI), service companies (ACG, Aboard Engineering, Atos Origin) and research centers (CNRS, ENPC, Inria, ONERA).

8.2.2.2. FUI PARSEC

Participants: Dumitru Potop Butucaru, Thomas Carle, Zhen Zhang, Yves Sorel.

The PARSEC project aims at providing development tools for critical real-time distributed systems requiring certification according to the most stringent standards such as DO-178B (avionics), IEC 61508 (transportation) or Common Criteria for Information Technology Security Evaluation. The approach proposed by PARSEC provides an integrated toolset that helps software engineers to meet the requirements associated with the certification of critical embedded software. Partners of the project are: Alstom, Thales, Ellidiss, OpenWide, Systereel, CEA, Inria, Telecom ParisTech. See also: http://www.systematic-paris-region.org/sites/default/files/exports/projets/fichiers/ProjetPARSEC_BookSystematic2012.pdf.

8.2.2.3. FUI CLISTINE

Participants: Robert de Simone, Amin Oueslati, Emilien Kofman.

This contract has just been accepted, with a kick-off meeting in Dec 2013. Partners are SynergieCAD (coordinator), Avantis, Optis, and the two EPIs Aoste and Nachos. The goal is to study the feasibility of building a low-cost, low-power "supercomputer", reusing ideas from SoC design, but this time with an off-chip, "on-board" network and off-the-shelf processor elements organized as an array. The network itself should be time-predictable and highly parallel (far more than PCI-e, for instance).

8.2.3. Investissements d’Avenir

8.2.3.1. DEPARTS

Participants: Liliana Cucu-Grosjean, Adriana Gogonel, Codé Lo, Cristian Maxim.

This project is funded by the BGLE call (Briques Logicielles pour le Logiciel Embarqué) of the national support programme Investissements d’Avenir. It formally started on October 1st, 2012, but the kick-off meeting was only held in April 2013 for administrative reasons. Initially this contract was handled by the TRIO team in Nancy, but at the end of TRIO it moved to Aoste Rocquencourt with the people involved. Research will target solutions for probabilistic component-based models, and a Ph.D. thesis will start in early 2014 on this topic. The goal is to allow designers to unify in a common framework probabilistic scheduling techniques with compositional assume/guarantee contracts that have different levels of criticality. Our contribution is based on the schedulability analysis presented in [39].

8.3. European Initiatives

8.3.1. FP7 Projects

8.3.1.1. PROXIMA

Participants: Liliana Cucu-Grosjean, Adriana Gogonel, Codé Lo, Cristian Maxim.
Type: COOPERATION
Challenge: Mixed-Criticality Systems
Instrument: Integrated Project
Objective: Development of probabilistic approaches for mixed-criticality systems on multi-core and many-core platforms
Duration: October 2013 - September 2016
Coordinator: Barcelona Supercomputing Center (Spain)
Inria contact: Liliana Cucu-Grosjean

PROXIMA started on October 1st, 2013, with a kick-off meeting in November 2013.

The project claims that probabilistic analysis techniques can provide efficient (tractable) and effective (tight) analysis of the temporal behaviour of complex mixed-criticality applications, while running on novel multicore and manycore platforms. Solid research results from the former FP7 STREP PROARTIS project sustain this claim. The concept is based on using probabilistic analysis techniques to derive safe and tight bounds on the temporal behaviour of applications. Such bounds should reflect requirements on failure rates commensurate with their criticality.

PROXIMA defines architectural paradigms that break causal dependence in the timing behaviour of execution components at hardware and software level that can give rise to pathological cases. The risk is then reduced to quantifiably small levels. The changes needed in the hardware and software components beneath the application (processing cores, interconnects, memory hierarchies and controllers, real-time operating system, middleware, compilers) remain modest.
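The idea stated above, that bounds should reflect failure-rate requirements commensurate with criticality, can be illustrated with a small sketch. It is not PROXIMA's method: the distribution, the target probabilities, and the function name are hypothetical. Given a discrete execution-time distribution, the sketch returns the smallest value whose exceedance probability does not surpass a target, so that a stricter target yields a larger bound.

```python
from fractions import Fraction


def bound_for_target(dist, target):
    """Smallest value v in the support of `dist` such that P(X > v) <= target.

    `dist` maps execution times to probabilities that sum to 1.
    """
    tail = Fraction(1)
    for value in sorted(dist):
        tail -= dist[value]  # tail is now P(X > value)
        if tail <= target:
            return value
    # Not reached when the probabilities sum to 1 and target >= 0.
    raise ValueError("probabilities do not sum to 1")


# Hypothetical execution-time distribution (time units are arbitrary).
execution_time = {
    10: Fraction(9000, 10000),
    12: Fraction(900, 10000),
    15: Fraction(99, 10000),
    20: Fraction(1, 10000),
}

low_criticality_bound = bound_for_target(execution_time, Fraction(1, 10))
high_criticality_bound = bound_for_target(execution_time, Fraction(1, 1000))
```

Here the less demanding target (exceedance at most 1/10) is met by a smaller bound than the stricter one (at most 1/1000), which is the trade-off between tightness and safety that the criticality level is meant to settle.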

8.3.2. Collaborations in European Programs, except FP7

8.3.2.1. ARTEMIS PRESTO

Participants: Frédéric Mallet, Arda Goknil, Julien Deantoni, Marie-Agnès Peraldi Frati, Robert de Simone, Jean-Vivien Millo.
Type: ARTEMIS
Project title: PRESTO
Duration: April 2011 - March 2014
Coordinator: Miltech (Greece)
Other partners: TELETEL S.A. (Greece), THALES Communications (France), Rapita Systems Ltd. (United Kingdom), VTT (Finland), Softeam (France), THALES (Italy), MetaCase (Finland), Inria (France), University of L'Aquila (Italy), MILTECH HELLAS S.A (Greece), PragmaDev (France), Prizmtech (United Kingdom), Sarokal Solutions (Finland).
See also: http://www.cesarproject.eu/

Abstract: The PRESTO project aims at improving test-based embedded systems development and validation, while considering the constraints of industrial development processes. This project is based on the integration of test trace exploitation, along with platform models and design space exploration techniques. Such traces are obtained by executing test patterns, during the software integration design phase, meant to validate system requirements. The expected result of the project is to establish functional and performance analysis and platform optimisation at an early stage of the design development. The approach of PRESTO is to model the software/hardware allocation using modelling frameworks such as the UML profile for model-driven development of Real Time and Embedded Systems (MARTE). The analysis tools (timing analysis including Worst Case Execution Time (WCET) analysis, scheduling analysis, and possibly more abstract system-level timing analysis techniques) will receive as inputs, on the one hand, information from the performance modelling of the HW/SW platform and, on the other hand, behavioural information on the software design from the results of the integration test execution.

8.4. International Initiatives

8.4.1. Inria Associate Teams

8.4.1.1. DAESD

Title: Distributed/Asynchronous and Embedded/synchronous Systems Development
Inria principal investigator: Robert de Simone (Aoste) / Eric Madelaine (Oasis)
International Partner (Institution - Laboratory - Researcher):
East China Normal University (China) - SEI-Shone - Robert De Simone
Duration: 2012 - 2014
See also: https://team.inria.fr/DAESD/

The development of concurrent and parallel systems has traditionally been clearly split in two different families: distributed and asynchronous systems on one hand, now growing very fast with the recent progress of the Internet towards large scale services and clouds; embedded, reactive, or hybrid systems on the other hand, mostly of synchronous behaviour. The frontier between these families has attracted less attention, but recent trends, e.g. in industrial systems, in Cyber-Physical systems (CPS), or in the emerging Internet of Things, give a new importance to research combining them.

The aim of the DAESD associate team is to combine the expertise of the Oasis and Aoste teams at Inria and of the SEI-Shone team at ECNU-Shanghai, and to build models, methods, and prototype software tools inheriting from synchronous and asynchronous models. We plan to address modelling formalisms and tools for this combined model; to establish a method to analyze the temporal and spatial consistency of embedded distributed real-time systems; and to develop scheduling strategies for multiple tasks in embedded and distributed systems with mixed constraints.

A dedicated Spring School was organized this year in Shanghai (April 27-30th), with participation of Robert de Simone and Frédéric Mallet from Aoste.

8.4.2. Inria International Labs

8.4.2.1. LIAMA

The DAESD associated-team goals have been extended to a LIAMA project named HADES (Heterogeneous Asynchronous Distributed / Embedded Synchronous), again with the SEI-Shone lab of ECNU Shanghai. The kick-off meeting was held next to the thematic Spring School (see 8.4.1.1), in presence of Chinese and French officials.

8.5. International Research Visitors

8.5.1. Visits of International Scientists

8.5.1.1. Internships

Franco Pestarini
Subject: Threads scheduling on multicore processors
Date: from Feb 2013 until Jul 2013
Institution: Universidad Nacional de Rosario (Argentina)

9. Dissemination

9.1. Scientific Animation

Robert de Simone
General Chair: RTNS 2013.
Board of Administrators: CIM PACA Design Platform association

Yves Sorel
Technical Program Committee: RTNS 2013, DASIP 2013
Editorial Board: Traitement du Signal Journal
Steering Committee: OCDS/SYSTEM@TIC Paris-Region Cluster

Liliana Cucu-Grosjean
Inria Evaluation Commission: elected member
Steering Committee and co-chair: RTSOPS2013, WMC2013 (workshops)

Dumitru Potop Butucaru
Technical Program Committee: Memocode 2013, ACSD 2013, EslSyn 2013, APRES 2013

Julien Deantoni
Organizer and Chair: First International Workshop on the Globalization of Modeling Languages (GeMoC)
Technical Program Committee: CIEL2013, GlobalDSL2013, Journal of Systems and Software

Laurent George
General Chair: ECRTS 2013
Program Committee Chair: ReTiMiCS 2013
Scientific Co-chair: ACTRISS group, supported by GDR ASR (CNRS, France) (http://www.actriss.org/).

9.2. Teaching - Supervision - Juries

9.2.1. Teaching
Licence: Julien Deantoni, Computer Environment, 30 h, L2 level, Polytech engineering school of University Nice/Sophia-Antipolis (UNS EPU), France.
Master: Julien Deantoni, Model Driven Engineering, 22 h, M2, UNS EPU.
Master: Julien Deantoni, C++ and Object Oriented Programming, 54 h, M1, UNS EPU.
Master: Julien Deantoni, Embedded Software and systems, 7 h, M2, UNS EPU.
Master: Julien Deantoni, VHDL, 40 h, M1, UNS EPU.
Master: Sid-Ahmed-Ali Touati, Programmation efficaces pour programmes embarqués et hautes performances, 16h, M1 Master ISI.
Master: Sid-Ahmed-Ali Touati, Systèmes d’exploitation avancés, 39h, M1, UNS Master ISI.
Master: Sid-Ahmed-Ali Touati, Programmation efficace et Optimisation de code, 16h, M1, UNS Master ISI.
Master: Sid-Ahmed-Ali Touati, Architecture des Processeurs, 15h, M1, UNS EPU.
Licence: Frédéric Mallet, Introduction à la Programmation Objet, 45h, L1, UNS.
Licence: Frédéric Mallet, Architecture des ordinateurs, 45h, L3, UNS.
Master: Frédéric Mallet, Programmation Avancée et Design Patterns, 93h, M1, UNS.
Master: Frédéric Mallet, Java pour l’Informatique Industrielle, 24h, M1, UNS.
Master: Frédéric Mallet, Architectures des ordinateurs, 12h, M1, UNS.
Master: Frédéric Mallet, Formal Models for Network-On-Chips, 3h, M2, UNS.
Licence: Marie-Agnes Peraldi-Frati, Algorithms and programming, 60h, L1, UNS Institute of technology.
Licence: Marie-Agnes Peraldi-Frati, System and Networks administration, 80h, L2, UNS Institute of technology.
Licence: Marie-Agnes Peraldi-Frati, Web Programming, 50h, L2, UNS Institute of technology.
Master: Robert de Simone, Formal Models for Networks-on-Chip, 24h, M2, UNS.
Master: Robert de Simone, Semantics of Embedded and Distributed Systems, 24h, M1, UNS.
Master: Yves Sorel, Distributed real-time systems, 26H, M2, University Paris Est
Master: Yves Sorel, Correct by construction design of reactive systems, 18H, M2, ESIEE Engineering School Noisy-Le-Grand
Master: Dumitru Potop and Thomas Carle, Programmation synchrone des systèmes temps-réel, 8h, M1, EPITA Engineering School Paris
Licence: Laurent George, Java and Shell programming 48h, L1, IUT RT UPEC
Master: Laurent George, Distributed Real-Time Systems, 24h, M2, UPEC

9.2.2. Supervision

PhD: Carlos Ernesto Gomez-Cardenas, Environnement multi-vues pour la métamodélisation sémantique formelle de systèmes embarqués, UNS, defended Dec 20th 2013, supervised by Frédéric Mallet, co-supervised by Julien Deantoni.
PhD in progress: Matias Vara-Larsen, Toward a formal and hierarchical timed model for concurrent heterogeneous model, ANR/CNRS, started November 2012, supervised by Frédéric Mallet, co-supervised by Julien Deantoni.
PhD in progress: Ameni Khecharem, High-Level modeling of hierarchical power management policies in SoCs, UNS, started October 2012, supervised by Robert de Simone.
PhD in progress: Ying Lin, Formal Analysis of polychronous models with MARTE/CCSL, East China Normal University, started September 2011, supervised by Jing Liu (ECNU), co-supervised by Frédéric Mallet.
PhD in progress: Emilien Kofman, Conception Haut Niveau Low Power d’objets mobiles communicants, UNS, started Oct 2013, supervised by Robert de Simone, co-supervised by François Verdier (UMR CNRS/UNS LEAT).
PhD in progress: Amin Oueslati, Modélisation conjointe d’applications et d’architectures parallèles embarqués en pratique, UNS, started Jan 2014, supervised by Robert de Simone.
PhD in progress: Falou Ndoye, Multiprocessor real-time scheduling taking into account preemption cost, started January 2011, supervised by Yves Sorel.
PhD in progress: Manel Djemal, Distributed real-time scheduling onto NoC architectures, EDITE/UPMC, started Nov. 2010, co-supervised by Alix Munier (UPMC/Lip6) and D. Potop-Butucaru.
PhD in progress: Thomas Carle, Real-time implementation of embedded control applications with conditional control onto time-triggered architectures, EDITE/UPMC, started Sep. 2011, supervised by D. Potop-Butucaru.
PhD: Dorin Maxim, Probabilistic Real-Time Systems, University of Lorraine, defended December 10th, 2013, co-supervised by Liliana Cucu-Grosjean and Françoise Simonot-Lion.

9.2.3. Juries

Robert de Simone
PhD reviewer: Adnan Bouakaz (U. Rennes 1), Jagadish Suryadevara (Malardalen U., Sweden)
HDR reviewer: Sébastien Gérard (U. Paris 11 Orsay).

Frédéric Mallet
PhD reviewer: Xin An (U. Grenoble 1), Clément Guy (U. Rennes 1)
PhD examiner: João Claudio Rodrigues Américo (U. Grenoble 1), Jair Gonzalez (Mines-Telecom)

Laurent George
PhD reviewer: Xiaoting Li (ENSEEIHT), Ahmed Daghsh (UTC Compiègne)

Yves Sorel
PhD reviewer: Yassine Ouhammou (ENSMA Poitiers)
PhD examiner: Pierre Courbin (U. Paris Est)

Dumitru Potop Butucaru
PhD reviewer: Léonard Gérard (U. Paris Sud)

9.3. Popularization

We held a thematic Spring School in late April in Shanghai, based on the topics of the DAESD associate team (see 8.4.1.1). It was open to students from all over China, and we also invited Chinese speakers (but the attendance of around sixty students was mostly from several universities in and around Shanghai).

10. Bibliography

Major publications by the team in recent years


Publications of the year

Doctoral Dissertations and Habilitation Theses


Articles in International Peer-Reviewed Journals


Articles in National Peer-Reviewed Journals


Invited Conferences

[26] L. CUCU. Independence - a misunderstood property of and for probabilistic real-time systems, in "Alan Burns 60th Anniversary", York, United Kingdom, N. AUDSLEY, S. BARUAH (editors), March 2013, http://hal.inria.fr/hal-00920504

[27] L. CUCU. Probabilistic real-time scheduling, in "ETR 2013 - Ecole d’été temps réel", Toulouse, France, August 2013, http://hal.inria.fr/hal-00920517

International Conferences with Proceedings


[47] M. VARA LARSEN, A. GOKNIL. Railroad Crossing Heterogeneous Model, in "GEMOC workshop 2013 - International Workshop on The Globalization of Modeling Languages", Miami, Florida, United States, September 2013, This research was supported by ANR GEMOC project, http://hal.inria.fr/hal-00867316


Scientific Books (or Scientific Book chapters)


Books or Proceedings Editing


Research Reports


Other Publications

[57] E. GROLLEAU, J. GOOSSENS, L. CUCU. On the periodic behavior of real-time schedulers on identical multiprocessor platforms, 2013, http://hal.inria.fr/hal-00920529

References in notes


