Section: New Results
Optimized protocol implementation and networking equipment
Evaluation and optimization of network performance in virtual end systems and routers
Keywords : system virtualization, virtual router, traffic control.
Participants : Fabienne Anhalt, Jean-Patrick Gelas, Pascale Vicat-Blanc Primet.
Virtualization techniques are applied to improve features such as isolation, security, mobility and dynamic reconfiguration in distributed systems. To bring these advantages into the network, where they are highly desirable, an interesting approach is to virtualize the Internet routers themselves. This technique could enable several virtual networks of different types, owners and protocols to coexist inside one physical network. We have conducted a systematic analysis of, and experiments on, the cost of network virtualization, and we have proposed scheduler optimizations.
The evaluation of Xen 3.1's network performance with TCP flows on end-hosts and routers shows that the multiplexing level (dom0) was the bottleneck. We have shown that the performance can be improved by tuning the scheduler parameters: giving more weight to dom0 improves throughput and fairness.
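As an illustration of why dom0's weight matters, the sketch below computes the CPU fraction each domain would receive under full contention, assuming a purely proportional-share scheduler in the spirit of Xen's credit scheduler. The domain names and weight values are hypothetical, and caps, I/O boosting and multi-core effects are ignored.

```python
def cpu_shares(weights):
    """Return the CPU fraction of each domain under full contention.

    Assumes a purely proportional-share scheduler: each domain receives
    CPU time in proportion to its weight (an idealisation, not a model
    of the real scheduler's behaviour).
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {domain: weight / total for domain, weight in weights.items()}


# Hypothetical configurations: equal weights, then dom0's weight doubled.
equal = cpu_shares({"dom0": 256, "domU1": 256, "domU2": 256})
boosted = cpu_shares({"dom0": 512, "domU1": 256, "domU2": 256})
```

With three equally weighted domains, dom0 gets one third of the CPU; doubling its weight raises its share to one half, at the expense of the guest domains.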
The evaluation of Xen 3.2's network performance with TCP and UDP flows on end-hosts and routers has shown that performance improved significantly compared to Xen 3.1. Throughput is higher, dom0's CPU overhead is lower, and the unfairness has disappeared. The remaining bottleneck we identified is the forwarding of small packets.
We have proposed a model of a virtual software router, implemented it with Xen and evaluated its properties. We show that its performance is close to that of non-virtualized software routers, but that virtualization causes a significant processing overhead and unfairness in resource sharing. We study the impact of the virtual machine scheduler parameters on network performance and show that the module responsible for forwarding packets between the virtual machines and the physical interfaces is the critical point of network communications. We analysed virtualization from the data plane perspective: we explore the resulting network performance in terms of throughput, packet loss and latency between virtual machines, as well as the corresponding CPU cost. In these experiments with Xen, the virtual machines act as senders or receivers, or as software routers forwarding traffic between two interfaces. Our results show that the impact of virtualization on network performance decreases with successive Xen versions, making this approach a promising solution for data plane virtualization.
Router migration with Xen 3.2 has also been explored. The migration process is slowed down when the virtual router is forwarding flows, especially TCP flows: it can take several minutes, instead of several seconds when the router is inactive.
Building on our results on virtual software routers, we have started to investigate virtual router design. We first worked on a survey that sketches the evolution of modern switch architectures, covering the literature over the period 1987-2008. Starting with the simple crossbar switch, we explore the architectures that evolved during this period, such as output queueing, input queueing, combined input/output queueing and buffered crosspoint. We discuss the pros and cons of these designs so as to shed light on the evolution path of switch architecture, in particular in the context of equipment virtualization. We are currently working on the design and evaluation of a virtual switch.
High availability for clustered network equipment
Keywords : fault tolerance, scalability, high availability.
Participants : Laurent Lefèvre, Pascale Vicat-Blanc Primet.
A key component for improving the scalability and the availability of network services is to deploy them within a cluster of servers. The main objective of this work is to design a network traffic load balancing architecture which supports fine-grained scheduling while efficiently spreading the offered network traffic among the available cluster resources.
A scalable architecture for balancing the offered network traffic
While much research has been conducted in the field of job and network load balancing, less attention has been paid to how the granularity of the balancing mechanism affects the reliable execution of upper-layer services. In fact, the flow-level network load balancing frameworks currently in use fail to achieve session awareness while efficiently spreading the offered network load among the available resources, typically when a network session involves multiple heterogeneous flows. Representative services range from familiar ones such as HTTP and FTP to more recent ones such as multimedia streaming using RTSP/RTP/RTCP and Voice over IP using SIP. Our work aims to provide an architecture that efficiently balances the offered network sessions among the available processing resources within a cluster of servers.
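To make the granularity issue concrete, the following minimal sketch changes only the dispatch key: it hashes on a session identifier rather than on the individual flow, so that all flows of one session reach the same server. It is not the architecture proposed here; the function names, the flow and session identifiers, and the static hashing policy are all assumptions made for illustration.

```python
import hashlib


def pick_server(key, servers):
    """Map a string key to one server with a stable hash (illustrative policy)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def dispatch_by_session(flows, servers):
    """Assign each (flow_id, session_id) pair to a server, keyed on the session.

    Because the hash key is the session identifier, every flow belonging to
    the same session is sent to the same server.
    """
    return {flow_id: pick_server(session_id, servers)
            for flow_id, session_id in flows}


# Hypothetical streaming session made of three flows, plus an unrelated flow.
servers = ["server-a", "server-b", "server-c"]
flows = [
    ("rtsp-control", "session-1"),
    ("rtp-media", "session-1"),
    ("rtcp-reports", "session-1"),
    ("http-get", "session-2"),
]
placement = dispatch_by_session(flows, servers)
```

A static hash like this one ignores the current load of each server; it only shows the effect of keying the decision on the session, which a flow-level key (for example the transport 5-tuple) would not guarantee.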
A highly available architecture for balancing the offered network traffic
High availability allows service architectures to meet growing demands and to ensure uninterrupted service. In our work, we are interested in ensuring the continuous execution of the offered network sessions when the legitimate entry point to the cluster fails, as well as when a processing server inside the cluster fails. We noticed that current fault-tolerant frameworks need to support consistent transport-level and application-level failover mechanisms, and that transport layer protocols do not provide high availability capabilities. Indeed, TCP does not distinguish between packet loss due to congestion, to server overload, or to a server or link failure. It therefore reacts in the same way to packet losses and to delays, by retransmitting the same segment to the same remote end point of the connection. Moreover, TCP tolerates only short periods of disconnection, no longer than a few RTTs, and disconnects the communicating hosts once specific timers expire. In addition, in the wired Internet, transport protocols rely on an explicit association between a service and its physical location. Thus, when a host fails, the end-to-end flow terminates.
In order to address this limitation, we proposed an active-replication-based system which enhances the reliability of already established TCP flows. The proposed scheme is transparent to the client, does not add any overhead to the end-to-end communication during failure-free periods, and performs well during failures. Parts of this work are protected by the Intellectual Property National Institute (INPI) patent disclosure No. FR0653546.
High availability for stateful network equipment
Keywords : fault tolerance, high availability.
Participant : Laurent Lefèvre.
Joint work with Pablo Neira Ayuso from University of Sevilla (Spain).
In operational networks, the availability of critical elements such as gateways, firewalls and proxies must be guaranteed. Important issues such as the replication of these network elements, the reduction of unavailability time and the detection of element failures must be studied. We propose the SNE (Stateful Network Equipment) library, which is an add-on to current High Availability (HA) protocols. This library is based on the replication of the connection tracking table for designing stateful network equipment.
Providing stateful network equipment on open source systems is a challenging task. We propose the basic building blocks (the SNE library) for building a piece of stateful network equipment. This library can be combined with high-availability protocols (CARP, Linux HA...). We focus on Linux in order to provide software solutions for designing highly available NAT, firewall, proxy or gateway equipment. The library is based on components located in the kernel and in the user space of the network equipment. First micro-benchmarks of the communication mechanisms based on Netlink sockets have shown the effectiveness of our approach.
XCP-i: a new interoperable XCP version for high speed heterogeneous networks
Keywords : XCP, XCP-i, TCP, available bandwidth, congestion control, virtual XCP-i router.
Participant : Laurent Lefèvre.
XCP (eXplicit Control Protocol) is a transport protocol that uses the assistance of specialized routers to determine very accurately the available bandwidth along the path from the source to the destination. In this way, XCP efficiently controls the sender's congestion window size, thus avoiding the traditional slow-start and congestion avoidance phases. However, XCP requires the collaboration of all the routers on the data path, which is almost impossible to achieve in an incremental deployment scenario. It has been shown that XCP behaves badly, worse than TCP, in the presence of non-XCP routers, which dramatically limits the benefit of running XCP in some parts of the network. In this work, we address this problem and propose XCP-i, which is operable on an internetwork consisting of XCP routers and traditional IP routers without losing the benefit of the XCP control laws.
XCP-i basically executes the following four steps to discover and compute a new feedback that reflects the state of the network where non-XCP routers are placed:
Discover where the non-XCP routers are in the data path.
Discover the upstream and downstream XCP-i routers of the non-XCP routers.
Estimate the available bandwidth where the non-XCP routers are placed.
Create a virtual XCP-i router that computes a new feedback using the previously estimated available bandwidth.
The simulation results on a number of topologies that reflect the various scenarios of incremental deployment on the Internet show that, although XCP-i's performance depends on the accuracy of the available bandwidth estimation, XCP-i still outperforms TCP on high-speed links.
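The feedback computation of the virtual XCP-i router in the last step above can be sketched as follows. The sketch assumes that the virtual router reuses the aggregate feedback law of the standard XCP efficiency controller, with the estimated available bandwidth standing in for the spare bandwidth; this substitution, the function names and the numeric inputs are assumptions made for illustration, not the exact XCP-i computation.

```python
# Stability constants of the standard XCP efficiency controller.
ALPHA = 0.4
BETA = 0.226


def aggregate_feedback(avg_rtt_s, spare_bw_bytes_per_s, persistent_queue_bytes):
    """Standard XCP aggregate feedback, in bytes per control interval.

    feedback = ALPHA * avg_rtt * spare_bandwidth - BETA * persistent_queue
    """
    return (ALPHA * avg_rtt_s * spare_bw_bytes_per_s
            - BETA * persistent_queue_bytes)


def virtual_router_feedback(avg_rtt_s, estimated_available_bw_bytes_per_s,
                            estimated_queue_bytes=0.0):
    """Hypothetical virtual-router feedback for a non-XCP segment.

    Assumption: the estimated available bandwidth replaces the spare
    bandwidth that a real XCP router would measure on its own link.
    """
    return aggregate_feedback(avg_rtt_s,
                              estimated_available_bw_bytes_per_s,
                              estimated_queue_bytes)


# Illustrative inputs: 100 ms average RTT, 1 MB/s estimated available bandwidth.
feedback = virtual_router_feedback(0.1, 1_000_000.0)
```

A positive value lets senders increase their congestion windows, while a negative value (no available bandwidth and a standing queue) asks them to decrease. Any error in the bandwidth estimate propagates directly into the feedback.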
Autonomic Service Deployment in Next Generation Networks
Keywords : autonomic network, programmability, service deployment.
Participants : Abderhaman Cheniour, Jean-Patrick Gelas, Laurent Lefèvre.
RESO is involved in the FP7 Autonomic Internet project by focusing on autonomic service deployment solutions for large scale overlays.
Programmability in network and services encompasses the study of decentralised enablers for dynamic (de)activation and reconfiguration of new/existing services, including management services and network components. The challenge in the Autonomic Internet FP7 project (AutoI) is to enable trusted parties (users, operators, and service providers) to activate management-specific service and network components into a specific platform. Dynamic programming enablers will be created and applied to executable service code, which can be injected/activated into the system's elements to create new functionality at runtime. Network and service enablers for programmability can therefore realise the capabilities for flexible management support required in AutoI.
RESO has proposed the ANPI (Autonomic Network Programming Interface), which will support the service enablers plane of the AUTOI architecture. This interface is currently under development with the support of other AUTOI partners (Hitachi Europe, University College London, UPC Barcelona, University of Passau).
Energy-efficiency in computing and networking for large-scale distributed systems
Keywords : Energy-awareness, Grid monitoring, Energy-efficiency.
Participants : Marcos Dias de Assuncao, Alejandro Fernandez, Jean-Patrick Gelas, Isabelle Guerin-Lassous, Laurent Lefèvre, Anne-Cécile Orgerie.
High performance computing aims to solve problems that require a lot of resources in terms of power and communication. While an extensive set of research projects deals with saving power in battery-powered electronic devices, few address large-scale distributed systems that are permanently plugged into the wall socket. The common assumption is indeed that grid resources should always be available, even when they are not reserved, and should therefore always remain fully powered on.
Large-scale distributed systems are sized to support reservation bursts, so they are not fully used all the time. Between bursts, some resources remain free, and we can save energy during these gaps. This is the first approach taken in this work: saving energy by shutting down nodes when they are not used. We use the same approach for high performance data transport: high-speed links are not always fully used, so we can turn off Ethernet cards and switch ports to save energy.
Understanding the characteristic usage and workloads of the large-scale distributed systems is a crucial step towards the design of new energy-aware distributed system frameworks. Therefore we have studied the Grid5000 platform usage over long periods of time.
The analysis of these usage traces led us to propose an energy-aware reservation infrastructure (EARI) which is able to shut down nodes when they are idle. For each reservation made by a user, this infrastructure proposes several energy-efficient alternatives. The user can thus choose among these “green” solutions, which leads to an aggregation of the reservations. The infrastructure also includes a prediction algorithm that anticipates the next reservation, in order to avoid shutting down nodes that would need to be restarted shortly afterwards.
So, our infrastructure is based on three mechanisms:
switching on and off the nodes;
reservation aggregation with green policies; and
predictions of the next reservations.
This model has been validated on the Grid5000 traces using a replay mechanism. The results are encouraging and show that our infrastructure could yield substantial energy savings. This on/off model is a first step in our research on energy efficiency in computing and networking for large-scale distributed systems.
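The interplay between the on/off mechanism and the prediction of the next reservation can be sketched as a simple break-even rule: a node is shut down only if the predicted idle gap is long enough for the off period to repay the energy cost of the off/on cycle. This is a minimal sketch under stated assumptions, not the EARI algorithm itself; the function names, the power figures and the cycle energy are hypothetical.

```python
def break_even_seconds(idle_power_w, off_power_w, cycle_energy_j):
    """Minimum idle gap (in seconds) for which switching a node off saves energy.

    Staying idle for t seconds costs idle_power_w * t joules. Switching off
    costs roughly cycle_energy_j + off_power_w * t, where cycle_energy_j is
    the extra energy of one shutdown/boot cycle (an assumed, measured value).
    """
    if idle_power_w <= off_power_w:
        raise ValueError("idle power must exceed off power")
    return cycle_energy_j / (idle_power_w - off_power_w)


def should_shut_down(now_s, predicted_next_start_s, threshold_s):
    """Shut down only if the predicted idle gap exceeds the threshold."""
    return (predicted_next_start_s - now_s) > threshold_s


# Hypothetical node: 200 W idle, 10 W when off, 19 kJ per off/on cycle.
threshold = break_even_seconds(200.0, 10.0, 19_000.0)
short_gap = should_shut_down(0.0, 50.0, threshold)
long_gap = should_shut_down(0.0, 500.0, threshold)
```

With these illustrative figures the break-even gap is 100 seconds, so the node stays on when the next reservation is predicted in 50 seconds and is shut down when it is predicted in 500 seconds. The quality of the decision depends entirely on the accuracy of the predicted start time.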
We are working on improving the prediction models with Alejandro Fernandez from University of Seville, Spain.