Section: New Results
Dependability and Group Communication
Participants : Michel Hurfin, Jean-Pierre Le Narzul, Izabela Moise.
Agreement Problems and Group Communication Services
We consider an asynchronous distributed system that is prone to message losses and crash failures. Within a group, several important services, such as total order broadcast or group membership, can be implemented by relying on repeated calls to a Consensus service. The classical specification of the Consensus problem requires that each participant proposes an initial value during an invocation of the Propose primitive and that, despite failures, all the correct processes decide on a single value selected from these proposed values. In a purely asynchronous system, this problem is impossible to solve [29]. Yet, under some well-identified additional synchrony properties that can be indirectly exploited by a failure detector or a leader election service, several consensus protocols have been proposed. Among them, the Paxos protocol presented by Lamport [34], [32] is probably the most famous one. Lamport has identified four basic roles: proposer, learner, coordinator, and acceptor. Each participant may take on multiple roles or just a single one. Proposers are entities that may provide initial values. Learners are in charge of detecting that the protocol has successfully converged toward a decision value. Proposers and learners are not involved in the convergence procedure, which is driven only by the interactions between coordinators and acceptors. Coordinators and acceptors play a central role in ensuring that eventually a single value is selected to become the decision value. A leader election service is used to eventually grant a privilege to a single coordinator. If a correct coordinator becomes the unique leader forever (or at least until the current consensus instance ends), it is able to impose a selected value on a majority of acceptors and to detect the successful termination of its attempt. Acceptors are used to implement quorums as majority sets. Therefore, by assumption, a majority of acceptors must remain correct throughout the computation.
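The role of majority quorums can be illustrated with a minimal sketch. It is not the Paxos protocol itself: ballot numbers, message exchanges and failures are left out, and the names `is_quorum`, `chosen_value` and `ACCEPTORS` are hypothetical helpers introduced only for this illustration. It shows why a coordinator that gathers acknowledgements from a majority of acceptors can detect a successful attempt, and why two majorities always share at least one acceptor.

```python
from collections import Counter
from itertools import combinations


def is_quorum(members, acceptors):
    """True if `members` contains a strict majority of `acceptors`."""
    return len(set(members) & set(acceptors)) > len(acceptors) // 2


def chosen_value(accepted, acceptors):
    """Return the value accepted by a majority in one attempt, else None.

    `accepted` maps an acceptor name to the value it accepted
    (a simplification: real Paxos also tracks ballot numbers).
    """
    counts = Counter(v for a, v in accepted.items() if a in acceptors)
    for value, votes in counts.items():
        if votes > len(acceptors) // 2:
            return value
    return None


# Hypothetical group of five acceptors: any three of them form a quorum.
ACCEPTORS = ["a1", "a2", "a3", "a4", "a5"]

# Any two majority quorums share at least one acceptor, which is what
# prevents two different values from being chosen in the same instance.
ALL_QUORUMS_INTERSECT = all(
    set(q1) & set(q2)
    for q1 in combinations(ACCEPTORS, 3)
    for q2 in combinations(ACCEPTORS, 3)
)
```

With five acceptors, up to two crashes can be tolerated under this rule, since the three remaining acceptors still form a majority.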
In [20], we revisit the interaction scheme between proposers, learners, coordinators, and acceptors. We formally define the Multiple Integrated Consensus problem and consider a protocol in charge of the whole sequence of consensus instances. Consensus instances are still executed sequentially but not in complete isolation from each other. We extend the remit of the sub-group of coordinators and acceptors so that they also have to ensure the availability of the past decisions and control when a new consensus instance can start. In the context of a long-lasting computation performed by a (potentially large) collection of (possibly ephemeral) processes, the core of dedicated processes formed by the coordinators and acceptors is able, on the one hand, to provide all the decision values already computed (or only the most recent ones) to any current member of the collection and, on the other hand, to ensure the progress of the successive consensus instances while regulating the activity of the proposers that may dynamically join and leave the collection. By definition, the kth decision value corresponds to the outcome of the kth consensus instance, which selects an initial value v proposed by at least one member and generates a decision <v, k>. A member of the collection may ignore this outcome but, instead of this decision, it cannot consider another pair <v', k> with v' ≠ v. To regulate the rate of consensus instances, a classical constraint is used: a participant is not allowed to act as a proposer during consensus instance k if it is not able to access the k-1 previous decisions.
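The two constraints stated above (a single decision per instance, and no proposal for instance k without access to the k-1 previous decisions) can be sketched as follows. This is only an illustration under stated assumptions, not the Paxos-MIC protocol of [20]: instances are assumed to be numbered from 1, a participant's knowledge is modelled as a plain dictionary, and the function names `learn` and `may_propose` are hypothetical.

```python
def learn(known_decisions, k, v):
    """Record decision <v, k>; reject a conflicting pair <v', k> with v' != v."""
    previous = known_decisions.setdefault(k, v)
    if previous != v:
        raise ValueError(f"instance {k} already decided {previous!r}, not {v!r}")


def may_propose(known_decisions, k):
    """True if the participant can access decisions 1 .. k-1.

    Assumption: instances are numbered from 1, so instance 1 has no
    prerequisite and any participant may propose for it.
    """
    return all(i in known_decisions for i in range(1, k))
```

For example, a participant that has learned only the first decision may act as a proposer for instance 2 but not for instance 3; it must first obtain the second decision, for instance from the core of coordinators and acceptors.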
The repeated and intensive use of a consensus building block calls for optimizing the performance of this basic agreement protocol. Recently, two different protocols, namely FastPaxos (without a space) by Boichat et al. [25] and Fast Paxos (with a space) by Lamport [33], have been designed to reduce the latency of learning a decision value to three and two communication steps, respectively, in favorable circumstances. The first strategy, which is also adopted in [35] (where the notion of view is proposed) and in [36] (where the concept of regency is introduced), tries to benefit from the stability of an elected leader during long-lasting failure-free synchronous periods. The second strategy tries to take advantage of a low throughput in the flow of initial values provided by the proposers. To solve the Multiple Integrated Consensus problem efficiently, we present in [20] a protocol called Paxos-MIC that integrates, for the first time to our knowledge, within a single simple framework the two best-known methods for reducing decision latency in Paxos-like protocols. Our protocol unifies these two strategies in order to obtain the best performance gain.
Group Communication Services to Secure a Web Access
In addition to the classical prevention security tools, Intrusion Detection Systems (IDS) are nowadays widely used by security administrators to detect attack occurrences against their systems. Anomaly detection is often viewed as the only approach to detect new forms of attack. The main principle of this approach consists in building a reference model of the behavior for a given entity (user, machine, service, or application) in order to compare it with the current observed behavior. If the observed behavior diverges from the model, an alert is raised to report the anomaly.
Intrusion detection is traditionally based on the definition of an explicit reference model. In the context of a joint work with Supelec, we consider an implicit model. We propose a solution to protect Web applications which is based on the concepts of diversity and redundancy. A set of COTS (Commercial Off-The-Shelf) servers executed on different nodes and different operating systems constitutes the core of the generic architecture: they simultaneously provide the service to the client. Design diversity is used to build the reference model at runtime. As an attack takes advantage of a vulnerability which is specific to either an operating system or a running software (i.e., a web server), an attack is expected to succeed on at most one node. If at least three nodes are used, the normal behavior is the one adopted by a majority of servers. To ensure integrity and confidentiality, any request is forwarded to the different servers, which implement the same functionality but through diverse designs. Any difference between the returned results can be interpreted as a possible attack and a possible corruption of one node. This approach can detect even previously unknown attacks.
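The majority-based comparison described above can be sketched as follows. This is a simplified illustration and not the deployed architecture: responses are assumed to be already normalized so that correct diversified servers return identical bytes, and the function name `compare_responses` and the server names used in the example are hypothetical.

```python
from collections import Counter


def compare_responses(responses):
    """Compare the answers of diversified servers to one request.

    `responses` maps a server name to its (normalized) response.
    Returns (majority_response, suspects): the response adopted by a
    strict majority of servers and the sorted names of the servers
    that diverge from it. If no strict majority exists, returns
    (None, all server names).
    """
    counts = Counter(responses.values())
    value, votes = counts.most_common(1)[0]
    if votes <= len(responses) // 2:
        return None, sorted(responses)
    suspects = sorted(name for name, r in responses.items() if r != value)
    return value, suspects
```

With three servers, `compare_responses({"linux": b"ok", "windows": b"ok", "macos": b"pwned"})` returns `(b"ok", ["macos"])`: the majority answer is forwarded to the client and an alert can be raised for the diverging node. In practice, legitimate differences between servers (headers, dates, and so on) would have to be masked before comparison to limit false positives.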
Furthermore, to also ensure availability, replication techniques implemented on top of agreement services are used to avoid any single point of failure. Secured and robust group communication mechanisms (see the Prometeus software description) are used to maintain consistency at various stages of the architecture. Our system has been deployed on an intrusion detection platform that is based on a set of diversified Web servers running on top of three different operating systems (Windows, Linux, Mac OS X). Performance evaluations are currently being conducted. We aim at evaluating the relevance of our solution along two axes. On the one hand, we have to show that diversification of COTS servers can improve the detection of attacks with respect to false positives. On the other hand, we have to show that the cost of the atomic broadcast service is reasonable enough for it to be used in real applications where dependability is a key requirement.
In addition to this main activity, we have proposed a solution to protect web applications running on top of a diversified architecture against code injection. Our solution consists in creating diversity in the web application scripts by randomizing the language understood by all the redundant servers. The automation of this process, called Instruction-Set Randomization, is presented in [17].
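The general idea of Instruction-Set Randomization can be illustrated with a toy sketch. It is not the method of [17]: the keyword set, the suffix-based tagging scheme and the function names `randomize` and `derandomize` are assumptions made for this example only. Trusted scripts have their keywords tagged with a secret key before deployment; at execution time, any untagged keyword must come from injected input and is rejected.

```python
import re

# Toy keyword set of a hypothetical script language (an assumption).
KEYWORDS = {"echo", "eval", "if", "include"}


def randomize(script, key):
    """Tag every keyword of a trusted script with the secret key."""
    pattern = re.compile(r"\b(" + "|".join(sorted(KEYWORDS)) + r")\b")
    return pattern.sub(lambda m: m.group(1) + key, script)


def derandomize(script, key):
    """Strip the tags before interpretation; reject untagged keywords."""
    out = []
    # The capturing group keeps both the word tokens and the separators.
    for token in re.split(r"(\w+)", script):
        if token in KEYWORDS:
            raise ValueError(f"untagged keyword {token!r}: possible injection")
        if token.endswith(key) and token[: -len(key)] in KEYWORDS:
            token = token[: -len(key)]
        out.append(token)
    return "".join(out)
```

For instance, with the key `"_k9f3"`, the trusted script `"if logged_in echo greeting"` is deployed as `"if_k9f3 logged_in echo_k9f3 greeting"`. Injected text such as `" include evil"` carries no tag, so `derandomize` raises an error instead of passing it to the interpreter. In a diversified architecture, a different key could be used on each server, so that an injection crafted for one server fails on the others.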