Section: New Results
Participants: Jean-Claude Bermond, Olivier Dalle, Afonso Ferreira, Frédéric Giroire, Julian Monteiro, Stéphane Pérennes.
P2P Storage Systems
Traditional means of storing data are dedicated servers or magnetic tapes. These solutions are reliable but expensive. Recently, hard disks and bandwidth have become cheaper and widely available, enabling new forms of data storage on distributed, peer-to-peer (P2P) architectures. To achieve high durability, such P2P systems encode the user data into a set of redundant fragments and distribute them among the peers. These systems are cheap to operate, but their highly distributed nature raises questions about reliability, durability, availability, confidentiality, and routing of the data. An abundant literature exists on the topic of P2P storage systems. Several efforts to build large-scale self-managing distributed systems have been made, among them Intermemory, OceanStore, Freenet, Pastry, CFS, and Total Recall. However, few analytical models have been proposed to estimate the behavior of the system (data durability and resource usage, e.g., bandwidth) and to understand the trade-offs between the system parameters. Furthermore, in almost all these models, the behavior of a single block is modeled and block failures are considered independent. We showed that this assumption can lead to severe estimation errors for the behavior of a system subject to peer failures  . Hence the need for new, more complex analytical models to describe such systems. Part of Mascotte's work on this topic is done inside the ANR SPREADS project.
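The redundancy principle can be illustrated with a toy sketch (not the actual codes used in these systems): a block is split into s fragments plus one XOR parity fragment, so that any single lost fragment can be rebuilt; real deployments use codes such as Reed-Solomon that tolerate several simultaneous losses.

```python
from functools import reduce

def encode(block, s):
    """Split `block` (bytes) into s equal-size fragments plus one XOR
    parity fragment distributed among peers."""
    size = -(-len(block) // s)  # ceiling division
    frags = [block[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(s)]
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*frags))
    return frags + [parity]

def repair(frags):
    """Rebuild at most one missing fragment (marked None) by XOR-ing the rest."""
    missing = [i for i, f in enumerate(frags) if f is None]
    assert len(missing) <= 1, "XOR parity tolerates a single loss"
    if missing:
        present = [f for f in frags if f is not None]
        frags[missing[0]] = bytes(
            reduce(lambda a, b: a ^ b, col) for col in zip(*present))
    return frags

# The peer holding fragment 1 fails; the block nevertheless survives.
frags = encode(b"peer-to-peer storage", s=4)
frags[1] = None
restored = b"".join(repair(frags)[:4]).rstrip(b"\0")  # padding stripped (toy only)
```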
In P2P storage systems, peers fail continuously, hence the need for self-repair mechanisms to achieve high durability. In  ,  , we propose and study analytical models that assess the bandwidth consumption and the probability of data loss of storage systems using erasure-coded redundancy. We show by simulations that the classical stochastic approach found in the literature, which models each block independently, gives a correct approximation of the system's average behavior, but fails to capture its variations over time. These variations are caused by the simultaneous loss of multiple data blocks that results from a peer failing (or leaving the system). We then propose a new stochastic model based on a fluid approximation that better captures the system behavior: in addition to its expectation, it gives a correct estimate of its standard deviation. This new model is validated by simulations.
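The effect of correlated losses on variability can be seen in a few lines of simulation (a simplified sketch with illustrative parameters, not the fluid model itself): the two failure processes below lose the same number of fragments on average, but peer failures, which wipe out many fragments at once, produce a much larger standard deviation.

```python
import random
from statistics import mean, stdev

def losses_independent(n_frags, p, steps, rng):
    """Fragments fail independently with probability p per time step."""
    return [sum(rng.random() < p for _ in range(n_frags)) for _ in range(steps)]

def losses_peer_failures(n_peers, frags_per_peer, p, steps, rng):
    """A peer fails with probability p per step, losing all its fragments at once."""
    return [frags_per_peer * sum(rng.random() < p for _ in range(n_peers))
            for _ in range(steps)]

rng = random.Random(42)
# Same total of 1000 fragments and the same marginal failure rate in both cases.
ind = losses_independent(1000, 0.01, 2000, rng)
cor = losses_peer_failures(100, 10, 0.01, 2000, rng)
# mean(ind) ~ mean(cor), but stdev(cor) is several times stdev(ind).
```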
In  ,  , we study the impact of different data placement strategies on system performance when erasure-coded redundancy schemes are used. We compare three policies: two of them local, in which the data are stored on logically neighboring peers, and the third global, in which the data are spread randomly over the whole system. We focus on the probability of losing a data block and on the bandwidth consumed to maintain enough redundancy. We use simulations to show that, without resource constraints, the average values are the same no matter which placement policy is used. However, the variations in bandwidth usage are much burstier under the local policies. When the bandwidth is limited, these bursty variations induce longer maintenance times and hence a higher risk of data loss. Finally, we propose a new external reconstruction strategy and a suitable degree of locality that could be introduced to combine the efficiency of the global policy with the practical advantages of local placement.
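A small simulation sketch (hypothetical parameters, not the simulator used in the study) shows why local placement is burstier: when a peer fails, the peers that must upload fragments for the repairs are its few logical neighbors, whereas global placement spreads the same repair work over the whole system.

```python
import random

def place(n_peers, n_blocks, k, policy, rng):
    """For each block, choose the k peers storing its fragments."""
    blocks = []
    for _ in range(n_blocks):
        if policy == "local":
            start = rng.randrange(n_peers)  # fragments on consecutive logical neighbors
            blocks.append([(start + i) % n_peers for i in range(k)])
        else:                               # "global": fragments spread at random
            blocks.append(rng.sample(range(n_peers), k))
    return blocks

def repair_upload_load(blocks, failed, n_peers):
    """Fragments each surviving peer uploads to repair blocks hit by `failed`."""
    load = [0] * n_peers
    for peers in blocks:
        if failed in peers:
            for p in peers:
                if p != failed:
                    load[p] += 1
    return load

rng = random.Random(1)
local = repair_upload_load(place(200, 5000, 10, "local", rng), failed=0, n_peers=200)
glob = repair_upload_load(place(200, 5000, 10, "global", rng), failed=0, n_peers=200)
# Total repair work is similar, but the peak per-peer load is far higher locally.
```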
In  , we focus on a class of distributed storage systems whose content may evolve over time. Each component, or node, of the storage system is mobile, and the set of all nodes forms a delay-tolerant (ad hoc) network (DTN). The goal of the paper is to study efficient ways of distributing evolving files within DTNs and of managing their content dynamically. We consider both the case (a) where nodes do not cooperate and the case (b) where they do. The objective is to find file management policies (rules specifying when a node may send a copy of a file) that maximize some system utility function under a constraint on resource consumption. Both myopic (static) and state-dependent (dynamic) policies are considered, where the state of a node is the age of the copy of the file it carries. For scenario (a), we find the optimal static (resp. dynamic) policy maximizing a general utility function under a constraint on the number of transmissions within a slot. In particular, we show the existence of a threshold dynamic policy. In scenario (b), we study the stability of the system (aging of the nodes) and derive an (approximate) optimal static policy. We then revisit scenario (a) when the source knows neither the parameter N (node population) nor q (node meeting probability), and derive a stochastic approximation algorithm which we show converges to the optimal static policy found in the complete-information setting.
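The threshold structure of the dynamic policy can be illustrated with a simple discrete-time sketch (illustrative parameters, not the model analyzed in the paper): the source refreshes a met node's copy only when that copy is at least `threshold` slots old, trading transmissions for freshness.

```python
import random

def simulate(n_nodes, q, threshold, slots, rng):
    """Each slot, every node meets the source with probability q; the source
    transmits a fresh copy only if the node's copy is >= threshold slots old.
    Returns (average copy age, average transmissions per slot)."""
    age = [0] * n_nodes
    tx = 0
    total_age = 0.0
    for _ in range(slots):
        for i in range(n_nodes):
            age[i] += 1
            if age[i] >= threshold and rng.random() < q:
                age[i] = 0          # refreshed by the source
                tx += 1
        total_age += sum(age) / n_nodes
    return total_age / slots, tx / slots

rng = random.Random(7)
age_low, tx_low = simulate(100, 0.2, threshold=1, slots=500, rng=rng)
age_high, tx_high = simulate(100, 0.2, threshold=10, slots=500, rng=rng)
# A higher threshold saves transmissions at the cost of older copies.
```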
Framework for the Analysis of Distributed Systems
Besides the complexity in time or in number of messages, a common approach to analyzing distributed algorithms is to look at their assumptions on the underlying network. In  , we focus on the study of such assumptions in dynamic networks, where the connectivity is expected to change, predictably or not, during the execution. Our main contribution is a theoretical framework dedicated to such analysis. By combining several existing components (local computations, graph relabellings, and evolving graphs), this framework makes it possible to express detailed properties of the network dynamics and to prove that a given property is necessary, or sufficient, for the success of an algorithm. Consequences of this work include (i) the possibility of comparing distributed algorithms on the basis of their topological requirements, (ii) the elaboration of a formal classification of dynamic networks with respect to these properties, and (iii) the possibility of checking automatically whether a network trace belongs to one of the classes, and consequently of knowing which algorithm should run on it.
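Point (iii), checking a network trace against a class, can be sketched with an evolving graph represented as a sequence of edge sets (a minimal illustration, not the framework's actual machinery): flooding one hop per time step tells whether a time-respecting journey exists from a source to every node, a typical necessary condition for broadcast algorithms.

```python
def journey_reachable(snapshots, source):
    """`snapshots` is an evolving graph: one set of undirected edges (u, v)
    per time step. Returns the nodes reachable from `source` via a
    time-respecting journey (information crosses one hop per step)."""
    reached = {source}
    for edges in snapshots:
        step = set()
        for u, v in edges:
            if u in reached:
                step.add(v)
            if v in reached:
                step.add(u)
        reached |= step
    return reached

# The temporal order of edges matters: the same contacts in reverse order
# may no longer allow a broadcast from node 0.
trace = [{(0, 1)}, {(1, 2)}, {(2, 3)}]
```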
In collaboration with Intel Research Berkeley, Mascotte worked on methods for providing security to end hosts in typical enterprise environments. The research focuses on making end-host security customizable and adaptable, by exploring profiles based on the end host's communication traffic and using them for anomaly detection.
In  , we describe a method, running on individual end hosts, to detect the command and control (C&C) traffic of a botnet (i.e., a set of devices controlled by a malicious entity). We introduce the notion of destination traffic atoms, which aggregate the destinations and services that are communicated with. We then track the persistence (a measure of temporal regularity) of new destination atoms not already whitelisted, to identify suspicious C&C destinations. Importantly, our method requires no a priori information about the destinations, ports, or protocols used in the C&C, nor does it require payload inspection. We evaluate our system using extensive user traffic traces collected from an enterprise network, along with collected botnet traces. We demonstrate that our method correctly identifies a botnet's C&C traffic, even when it is very stealthy, with a very low false positive rate.
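The persistence measure can be sketched as follows (a simplified illustration with hypothetical atoms and thresholds; the real system tracks richer state): destination atoms aggregate (destination, port, protocol), and an atom's persistence is the fraction of observation windows in which it appears.

```python
from collections import defaultdict

def persistence(windows):
    """`windows`: one set of destination atoms (dst, port, proto) per
    observation window. Returns each atom's persistence, i.e. the fraction
    of windows in which it was seen."""
    counts = defaultdict(int)
    for atoms in windows:
        for atom in atoms:
            counts[atom] += 1
    n = len(windows)
    return {atom: c / n for atom, c in counts.items()}

def flag_suspicious(windows, whitelist, threshold=0.6):
    """Non-whitelisted atoms that keep reappearing are C&C candidates."""
    return {a for a, p in persistence(windows).items()
            if p >= threshold and a not in whitelist}

# Hypothetical trace: the C&C atom beacons regularly, browsing is sporadic.
cc = ("198.51.100.7", 443, "tcp")
mail = ("mail.example.com", 25, "tcp")
windows = [{cc, mail}] * 8 + [{mail, ("site%d.example" % i, 80, "tcp")}
                              for i in range(2)]
```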
In  , we study the impact of today's IT policies, defined on a monoculture approach, on the performance of end-host anomaly detectors. This approach leads to the uniform configuration of host intrusion detection systems (HIDS) across all hosts in an enterprise network. We assess the performance impact this policy has from the individual's point of view by analyzing network traces collected from 350 enterprise users. We uncover a great deal of diversity in the user population in terms of tail behavior, i.e., the component that matters for anomaly detection systems. We demonstrate that the monoculture approach to HIDS configuration results in users experiencing wildly different false positive and false negative rates. We then introduce new policies based on leveraging this diversity, and show that not only do they dramatically improve performance for the vast majority of users, but they also reduce the number of false positives arriving in centralized IT operation centers.
Estimating the number of distinct flows in a data stream has many applications in network monitoring and network security. For instance, one can count the number of distinct flows in the traffic to detect Denial of Service attacks. In  , a new class of algorithms to estimate the cardinality of very large multisets using constant memory and a single pass over the data is introduced. It is based on order statistics rather than on bit patterns in binary representations of numbers. Three families of estimators are analyzed. They attain a standard error of order 1/√M using M units of storage, which places them in the same class as the best known algorithms so far. The algorithms have a very simple internal loop, which gives them an advantage in terms of processing speed. For instance, a memory of only 12 kB and only a few seconds are sufficient to process a multiset with several million elements and to build an estimate with an accuracy of order 2 percent. The algorithms are validated both by mathematical analysis and by experimentation on real Internet traffic.
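The order-statistics idea can be sketched in a few lines (a simplified single-family version with illustrative parameters, not the analyzed estimators themselves): hash each element into (0, 1), route it to one of M buckets, keep the k smallest distinct hash values per bucket, and sum the per-bucket estimates (k-1)/m_k, where m_k is the k-th minimum.

```python
import bisect
import hashlib

def estimate_cardinality(stream, m_buckets=64, k=8):
    """One-pass distinct-count estimate based on order statistics."""
    sketches = [[] for _ in range(m_buckets)]  # k smallest distinct hashes per bucket
    for x in stream:
        h = int(hashlib.sha1(str(x).encode()).hexdigest(), 16)  # 160-bit hash
        b = h % m_buckets
        u = (h // m_buckets) / 2 ** 154        # remaining bits mapped into (0, 1)
        s = sketches[b]
        if u in s:                             # duplicate element: already counted
            continue
        if len(s) < k:
            bisect.insort(s, u)
        elif u < s[-1]:
            bisect.insort(s, u)
            s.pop()
    # Each bucket holds ~n/M of the distinct elements; (k-1)/m_k estimates that
    # count (buckets that saw fewer than k elements are counted exactly).
    return sum(len(s) if len(s) < k else (k - 1) / s[-1] for s in sketches)
```

With, e.g., M = 64 buckets of k = 8 hashes, the sketch occupies only a few kilobytes, and since duplicates never alter it, the estimate depends solely on the set of distinct elements.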