Section: New Results
Certified Real-Time Programming
Participants : Pascal Fradet, Alain Girault, Gregor Goessler, Xavier Nicollin, Sophie Quinton, Xiaojie Guo, Maxime Lesourd.
Time predictable programming languages and architectures
Time predictability (PRET) is a topic that emerged in 2007 as a solution to the ever-increasing unpredictability of today's embedded processors, which results from features such as multi-level caches or deep pipelines [46]. For many real-time systems, it is mandatory to compute a strict bound on the program's execution time. Yet, in general, computing a tight bound is extremely difficult [69]. The rationale of PRET is to simplify both the programming language and the execution platform to allow more precise execution times to be easily computed [35].
Within the Caphca project, we have proposed a new approach for predictable inter-core communication between tasks allocated on different cores. Our approach is based on the execution of synchronous programs written in the ForeC parallel programming language on PREcision Timed (hence deterministic) architectures [71], [72]. The originality resides in the time-triggered model of computation and communication, which allows for very precise control over thread execution. Synchronization is done via configurable Time Division Multiple Access (TDMA) arbitrations (either physical or conceptual), where the optimal size and offset of the time slots are computed to reduce the inter-core synchronization costs. Results show that our model guarantees time-predictable inter-core communication and the absence of concurrent accesses (without relying on hardware mechanisms), and allows for optimized execution throughput [17]. This is a collaboration with Nicolas Hili and Eric Jenn; Nicolas Hili's postdoc is funded by the Caphca project.
We have also proposed a multi-rate extension of ForeC [16]. Until now, ForeC programs were constrained to operate at a single rate, meaning that all the parallel threads had to share the same execution rate. While this simplified the semantics, it was also a significant limitation.
Finally, we have extended the compiler of the PretC programming language [33], [34] in order to make it energy aware. PretC is a parallel programming language in the same sense as Esterel [44], meaning that the parallelism is “compiled away”: the PretC compiler generates sequential code where the parallel threads from the source program are interleaved according to the synchronous semantics, and produces a classical Control Flow Graph (CFG). This CFG is then turned into a Timed Control Flow Graph (TCFG) by labeling each basic block with the number of clock cycles required to execute it on the chosen processor, based on its microarchitectural characteristics. From the TCFG, we use the method described in Section 6.2.5 to compute a Pareto front of non-dominated compromises between worst-case execution time (WCET) and worst-case energy consumption (WCEC).
Synthesis of switching controllers using approximately bisimilar multiscale abstractions
The use of discrete abstractions for continuous dynamics has become standard in hybrid systems design (see, e.g., [67] and the references therein). The main advantage of this approach is that it makes it possible to leverage controller synthesis techniques developed in the area of supervisory control of discrete-event systems [64]. The first attempts to compute discrete abstractions for hybrid systems were based on traditional behavioral relationships between systems, such as simulation or bisimulation, initially proposed for discrete systems, most notably in the area of formal methods. These notions require inclusion or equivalence of observed behaviors, which is often too restrictive when dealing with systems observed over metric spaces. For such systems, a more natural abstraction requirement is to ask for closeness of observed behaviors. This leads to the notions of approximate simulation and bisimulation introduced in [50]. These approaches are based on sampling of time and space, where the sampling parameters must satisfy some relation in order to obtain abstractions of a prescribed precision. In particular, the smaller the time sampling parameter, the finer the lattice used for approximating the state space; this may result in abstractions with a very large number of states when the sampling period is small. However, there are a number of applications where sampling has to be fast, even though this is generally necessary only on a small part of the state space.
In previous work we have proposed an approach using mode sequences as symbolic states for our abstractions [59]. By using mode sequences of variable length we are able to adapt the granularity of our abstraction to the dynamics of the system, so as to automatically trade off precision against controllability of the abstract states [12]. We have shown the effectiveness of the approach on examples inspired by road traffic regulation.
A Markov Decision Process approach for energy minimization policies
In the context of independent real-time sporadic jobs running on a single-core processor equipped with Dynamic Voltage and Frequency Scaling (DVFS), we have proposed a Markov Decision Process (MDP) approach to minimize the energy consumption while guaranteeing that each job meets its deadline. The idea is to leverage the statistical information on the jobs' characteristics available at design time: release time, worst-case execution time (WCET), and relative deadline. This is the topic of Stephan Plassart's PhD, funded by the Caserm Persyval project. We have considered several cases depending on the amount of information available at design time:
 Offline case:

In the offline case, all the information is known and we have proposed the first linear-complexity offline scheduling algorithm that minimizes the total energy consumption [15]: its complexity is $\mathcal{O}\left(n\right)$, where $n$ is the number of jobs to be scheduled, while the best previously known algorithms were in $\mathcal{O}\left({n}^{2}\right)$ and $\mathcal{O}\left(n \log n\right)$ [60].
 Clairvoyant case:

In the clairvoyant case, the characteristics of the jobs are only known statistically, and each job's WCET and relative deadline are only known at release time. We want to compute the optimal online scheduling speed policy that minimizes the expected energy consumption while guaranteeing that each job meets its deadline. This general constrained optimization problem can be modeled as an unconstrained MDP by choosing a proper state space that also encodes the constraints of the problem. In the finite horizon case we use a dynamic programming algorithm, while in the infinite horizon case we use a value iteration algorithm [25].
 Non-clairvoyant case:

In the non-clairvoyant case, the actual execution time (AET) of a job is known only when this job completes its execution. This AET is of course assumed not to exceed the WCET, which is known at the job's release time. Again, by building an MDP for the system with a well-chosen state, we compute the optimal online scheduling speed policy that minimizes the expected energy consumption [26].
 Learning case:

In the learning case, the only information known about the jobs is a bound on their WCETs and a bound on their deadlines. We have proposed two reinforcement learning algorithms: one that learns the optimal value of the expected energy (Q-learning), and another that learns the probability transition matrix of the system, from which we derive the optimal online speed policy.
This work led us to compare several existing speed policies with respect to their feasibility. Indeed, the policies (OA) [70], (AVR) [70], and (BKP) [37] all assume that the maximal speed ${S}_{max}$ available on the processor is infinite, which is an unrealistic assumption. For these three policies and for our (MDP) policy, we have established necessary and sufficient conditions on ${S}_{max}$ guaranteeing that no job will ever miss its deadline [27].
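To make the MDP formulation of the clairvoyant case more concrete, here is a minimal sketch of value iteration on a toy speed-selection problem. Everything in it is an assumption chosen for illustration rather than the model of [25]: time is slotted, every job has a relative deadline of two slots, job sizes follow the hypothetical distribution `ARRIVALS`, power is the cubic function `power`, and the cost is discounted by `GAMMA`. The point it illustrates is that the deadline constraint can be encoded in the set of admissible actions, so that the optimization itself is unconstrained.

```python
from itertools import product

S_MAX = 3                              # assumed maximal processor speed (work units per slot)
ARRIVALS = {0: 0.5, 1: 0.3, 2: 0.2}    # hypothetical job-size distribution per slot
GAMMA = 0.9                            # assumed discount factor


def power(speed):
    """Assumed convex power model: energy spent in one slot at the given speed."""
    return speed ** 3


def feasible_speeds(urgent, later):
    """Speeds that finish the urgent work now and never run without work.

    `urgent` is the work due at the end of the current slot, `later` the work
    due one slot after. The deadline constraint is encoded here, in the action set.
    """
    return range(urgent, min(S_MAX, urgent + later) + 1)


def q_value(urgent, later, speed, value):
    """Immediate energy plus discounted expected cost of the next state."""
    leftover = urgent + later - speed  # becomes urgent work in the next slot
    expected = sum(p * value[(leftover, size)] for size, p in ARRIVALS.items())
    return power(speed) + GAMMA * expected


def value_iteration(tol=1e-9, max_iter=10_000):
    """Return (value function, greedy speed policy) for the toy MDP."""
    max_size = max(ARRIVALS)
    states = list(product(range(max_size + 1), repeat=2))
    value = {state: 0.0 for state in states}
    for _ in range(max_iter):
        new_value = {
            (u, l): min(q_value(u, l, s, value) for s in feasible_speeds(u, l))
            for u, l in states
        }
        delta = max(abs(new_value[state] - value[state]) for state in states)
        value = new_value
        if delta < tol:
            break
    policy = {
        (u, l): min(feasible_speeds(u, l), key=lambda s: q_value(u, l, s, value))
        for u, l in states
    }
    return value, policy
```

Calling `value_iteration()` returns a table that maps each state `(urgent, later)` to a speed. By construction, every speed in that table completes the urgent work within the current slot and stays below `S_MAX`.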
Formal proofs for schedulability analysis of real-time systems
We contribute to Prosa [31], a Coq library of reusable concepts and proofs for real-time systems analysis. A key scientific challenge is to achieve a modular structure of proofs, e.g., for response time analysis. Our goal is to use this library for:

a better understanding of the role played by some assumptions in existing proofs;

a formal verification and comparison of different analysis techniques; and

the certification of results of existing (e.g., industrial) analysis tools.
We have further developed CertiCAN, a tool produced using Coq for the formal certification of CAN analysis results [14]. Result certification is a process that is lightweight and flexible compared to tool certification, which makes it a practical choice for industrial purposes. The analysis underlying CertiCAN is based on a combined use of two well-known CAN analysis techniques [68]. Additional optimizations have been implemented (and proved correct) to make CertiCAN computationally efficient. Experiments demonstrate that CertiCAN is able to certify the results of RTaW-Pegase, an industrial CAN analysis tool, even for large systems.
In addition, we have started investigating how to connect Prosa with implementations and less abstract models. Specifically, we have used Prosa to provide a schedulability analysis proof for RT-CertiKOS, a single-core sequential real-time OS kernel verified in Coq [20]. A connection with a timed-automata-based formalization of the CAN specification is also in progress. Our objective with this line of research is to understand and bridge the gap between the abstract models used for real-time systems analysis and actual real-time systems implementation.
Finally, we contributed to a major refactoring of the Prosa library to make it more easily extendable and usable.
Scheduling under multiple constraints and Pareto optimization
We have completed a major piece of work on embedded systems subject to multiple non-functional constraints, by proposing the first multicriteria scheduling heuristics of their kind for mapping a DAG of tasks onto a homogeneous multicore chip [9], [23]. Given an application modeled as a Directed Acyclic Graph (DAG) of tasks and a multicore architecture, we produce a set of non-dominated (in the Pareto sense) static schedules of this DAG onto this multicore. The criteria we address are the execution time, reliability, power consumption, and peak temperature. These criteria exhibit complex antagonistic relations, which make the problem challenging. For instance, improving the reliability requires adding some redundancy in the schedule, which penalizes the execution time. To produce Pareto fronts in this four-dimensional space, we transform three of the four criteria into constraints (the reliability, the power consumption, and the peak temperature), and we minimize the fourth one (the execution time of the schedule) under these three constraints. By varying the thresholds used for the three constraints, we are able to produce a Pareto front of non-dominated solutions. Each Pareto optimum is a static schedule of the DAG onto the multicore. We propose two algorithms to compute static schedules. The first is a ready list scheduling heuristic called ERPOT (Execution time, Reliability, POwer consumption and Temperature). ERPOT actively replicates the tasks to increase the reliability, uses Dynamic Voltage and Frequency Scaling to decrease the power consumption, and inserts cooling times to control the peak temperature. The second algorithm uses an Integer Linear Programming (ILP) program to compute an optimal schedule. However, because our multicriteria scheduling problem is NP-complete, the ILP algorithm is limited to very small problem instances, namely DAGs of at most 8 tasks.
Comparisons showed that the schedules produced by ERPOT are on average only 9% worse than the optimal schedules computed by the ILP program, and that ERPOT outperforms the PowerPerfPET heuristic from the literature on average by 33%. This is a joint work with Athena Abdi and Hamid Zarandi from Amirkabir University in Tehran, Iran.
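The threshold-based construction of the Pareto front described above can be sketched as follows. This is only an illustration under stated assumptions, not ERPOT or the ILP formulation: each candidate schedule is reduced to a hypothetical tuple (execution time, failure probability, power, peak temperature), with reliability expressed as a failure probability so that all four criteria are minimized, and the candidates in `CANDIDATES` are made-up values. The function `best_under_constraints` plays the role of one constrained minimization, and `pareto_front` filters out dominated points.

```python
def dominates(p, q):
    """True if p is at least as good as q on every criterion and strictly better on one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def best_under_constraints(candidates, max_failure, max_power, max_temperature):
    """Minimize execution time subject to thresholds on the three other criteria.

    Returns None when no candidate satisfies the thresholds.
    """
    feasible = [
        c for c in candidates
        if c[1] <= max_failure and c[2] <= max_power and c[3] <= max_temperature
    ]
    return min(feasible, key=lambda c: c[0]) if feasible else None


# Hypothetical candidates: (execution time, failure probability, power, peak temperature)
CANDIDATES = [
    (10, 0.02, 5.0, 70),
    (12, 0.01, 5.0, 70),
    (12, 0.02, 4.0, 65),
    (13, 0.02, 5.0, 72),   # dominated by the first candidate
    (15, 0.005, 3.5, 60),
]

FRONT = pareto_front(CANDIDATES)
CHOICE = best_under_constraints(CANDIDATES, max_failure=0.01, max_power=5.0,
                                max_temperature=70)
```

Sweeping the three thresholds and collecting the results of `best_under_constraints` gives a set of candidate optima, which `pareto_front` then reduces to the non-dominated ones.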
In a related line of work, we have considered the bicriteria minimization problem in the (WCET, WCEC) space, i.e., worst-case execution time versus worst-case energy consumption, for real-time programs. To the best of our knowledge, this is the first contribution of this kind in the literature.
A real-time program is abstracted as a Timed Control Flow Graph (TCFG), where each basic block is labeled with the number of clock cycles required to execute it on the chosen processor at the nominal frequency. This timing information can be obtained, for instance, with WCET analysis tools. The target processor is equipped with dynamic voltage and frequency scaling (DVFS) and offers several (frequency $f$, voltage $V$) operating points. The goal is to compute a set of points in the (WCET, WCEC) plane that are non-dominated in the Pareto sense. Each such point is an assignment from the set of basic blocks of the TCFG to the set of available $(f,V)$ pairs.
From the TCFG we extract the longest execution path, from which we derive the WCET and the WCEC for a chosen fixed $(f,V)$ pair. By construction, all the other execution paths are shorter, so this WCET and this WCEC hold for the whole program. This ensures that each single-frequency assignment is a non-dominated point. Then, we study two-frequency assignments, still for the longest execution path. When the frequency switching costs in time and in energy are assumed to be negligible, we demonstrate that each two-frequency assignment (say with ${f}_{i}$ and ${f}_{j}$) is a point on the segment between the single-frequency assignment at ${f}_{i}$ and the single-frequency assignment at ${f}_{j}$. We also propose a linear-time heuristic to assign an $(f,V)$ pair to all the other blocks (i.e., those not belonging to the longest path) such that all the other execution paths have a shorter WCET and a lower WCEC. A key result is that any two-frequency assignment where the two frequencies are not contiguous is dominated either by a single-frequency assignment or by a two-frequency assignment with contiguous frequencies. A corollary is that the Pareto front is a continuous piecewise affine function. Finally, we generalize these results to the case where the frequency switching costs are not negligible. This is the topic of Jia Jie Wang's postdoc.
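The segment property for two-frequency assignments can be checked numerically on a small example. The sketch below rests on assumptions that are not taken from the text: the two operating points `(F_I, V_I)` and `(F_J, V_J)` are hypothetical, units are arbitrary, switching costs are zero, and the energy per cycle follows the common dynamic-energy model $C V^2$. Under such a model both the time and the energy of the longest path are affine in the number of cycles run at each frequency, which is why the mixed assignment lands on the segment between the two single-frequency points.

```python
# Hypothetical (frequency, voltage) operating points; units are arbitrary.
F_I, V_I = 1.0, 1.0
F_J, V_J = 2.0, 1.3


def energy_per_cycle(voltage, capacitance=1.0):
    """Assumed dynamic-energy model (C * V^2 per cycle); not taken from the text."""
    return capacitance * voltage ** 2


def path_cost(cycles_at_i, cycles_at_j):
    """Return (time, energy) of a path whose cycles are split between the two points."""
    time = cycles_at_i / F_I + cycles_at_j / F_J
    energy = (cycles_at_i * energy_per_cycle(V_I)
              + cycles_at_j * energy_per_cycle(V_J))
    return time, energy


def on_segment(p, a, b, tol=1e-9):
    """True if p lies on the segment [a, b] in the (time, energy) plane."""
    cross = (p[0] - a[0]) * (b[1] - a[1]) - (p[1] - a[1]) * (b[0] - a[0])
    scale = max(1.0, abs(b[0] - a[0]) * abs(b[1] - a[1]))
    within = min(a[0], b[0]) - tol <= p[0] <= max(a[0], b[0]) + tol
    return abs(cross) <= tol * scale and within


TOTAL_CYCLES = 1000                    # hypothetical length of the longest path
SINGLE_I = path_cost(TOTAL_CYCLES, 0)  # whole path at (F_I, V_I)
SINGLE_J = path_cost(0, TOTAL_CYCLES)  # whole path at (F_J, V_J)
MIXED = path_cost(400, 600)            # 400 cycles at F_I, 600 cycles at F_J
```

Here `on_segment(MIXED, SINGLE_I, SINGLE_J)` holds for any split of the 1000 cycles, since the split ratio acts as the interpolation parameter along the segment.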
We evaluate our method and heuristic on a set of hard real-time benchmark programs and we show that they perform extremely well. Our DVFS assignment algorithm can also be used as a backend for the compiler of the PretC programming language [33], [34] in order to make it energy aware, thanks to the ability of this compiler to generate TCFGs (see Section 6.2.1).