Team : rap
Section: New Results
Analysis of Traffic Measurements
For traffic analysis, we adopt in this study a flow-based approach and the popular mice and elephants dichotomy. A TCP flow is a sequence of packets sharing the same four integers: source and destination addresses, source and destination ports. A mouse is a TCP flow with fewer than 20 packets, so that only the slow start phase of the TCP protocol is used. By contrast, because of their length, elephants share the remaining bandwidth through the flow control mechanism of TCP. These two types of flows therefore behave completely differently from a modeling point of view.
Trace captures have been carried out by France Telecom R&D. TCP traffic has been collected on an Internet backbone link connecting different ADSL areas. A significant part of it is generated by p2p applications and hence consists of large elephants.
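The dichotomy above can be illustrated with a minimal sketch. The 20-packet threshold comes from the text; the input format (one 4-tuple per captured packet) and the names `classify_flows` and `MOUSE_MAX_PACKETS` are assumptions made for illustration and do not describe the tools actually used on the traces.

```python
from collections import Counter

# Threshold taken from the text: a mouse has fewer than 20 packets.
MOUSE_MAX_PACKETS = 20

def classify_flows(packets):
    """Group packets into flows by 4-tuple and label each flow.

    `packets` is an iterable of (src_addr, dst_addr, src_port, dst_port)
    tuples, one per captured packet (a hypothetical input format).
    """
    counts = Counter(packets)  # number of packets per 4-tuple
    return {
        flow: "mouse" if n < MOUSE_MAX_PACKETS else "elephant"
        for flow, n in counts.items()
    }

# Toy data: one 3-packet flow and one 25-packet flow.
packets = [("10.0.0.1", "10.0.0.2", 1234, 80)] * 3 \
        + [("10.0.0.3", "10.0.0.4", 4321, 6881)] * 25
labels = classify_flows(packets)
```

A real classifier would also have to decide when a flow ends (for instance on a timeout), which this sketch ignores.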
Modeling ADSL traffic
An extensive statistical study of the traffic trace has been carried out. It has been observed that the inter-arrival times of mice have an exponential distribution. Unfortunately, the mice arrival process is not Poisson, as might be expected at first glance. This is because mice are not actually independent but are sent in clumps.
However, if mice of non-p2p traffic with the same source address are aggregated, the stationary distribution of the number of macro-mice can then be described as Poisson. The distribution of the duration of these aggregated mice is Weibull. A complete description of the non-p2p mice traffic (parameters of the distributions) has therefore been obtained.
For the p2p mice traffic, a second level of aggregation appears to be necessary. The reason is that the requests for a source at different destinations generate response messages. The conclusions are then the same as in the non-p2p case.
For the elephant traffic, it has been observed (though never mentioned in earlier studies) that the transmission of elephants is interrupted by idle periods of several seconds. The parts of an elephant separated by such periods are then considered as distinct elephants. Though the volume of a long flow has a Pareto distribution, the distribution of the duration of these flows is Weibull.
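One possible way to aggregate mice by source address is sketched below. The text does not specify the aggregation rule, so the merging criterion (a hypothetical `max_gap` in seconds between successive mice of the same source) and the input format are assumptions, not the procedure used in the study.

```python
from collections import defaultdict

def aggregate_mice(mice, max_gap=1.0):
    """Merge mice sharing a source address into macro-mice.

    `mice` is a list of (src_addr, start, end) tuples with times in
    seconds. Two mice from the same source are merged when the second
    starts at most `max_gap` seconds after the running end of the first.
    Both the input format and `max_gap` are assumptions.
    """
    by_source = defaultdict(list)
    for src, start, end in mice:
        by_source[src].append((start, end))

    macro = []
    for src, intervals in by_source.items():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start - cur_end <= max_gap:
                cur_end = max(cur_end, end)  # same clump: extend it
            else:
                macro.append((src, cur_start, cur_end))
                cur_start, cur_end = start, end
        macro.append((src, cur_start, cur_end))
    # Order macro-mice by arrival time.
    return sorted(macro, key=lambda m: m[1])

mice = [("a", 0.0, 0.2), ("a", 0.5, 0.9), ("a", 5.0, 5.1), ("b", 0.1, 0.3)]
macro_mice = aggregate_mice(mice)
```

The arrival times and durations of the resulting macro-mice are the quantities whose distributions would then be fitted.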
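Splitting an elephant at its idle periods can be sketched as follows. The text only says the interruptions last "several seconds", so the 2-second `idle_threshold` and the function name `split_on_idle` are illustrative assumptions.

```python
def split_on_idle(timestamps, idle_threshold=2.0):
    """Split sorted packet timestamps (seconds) into sub-flows.

    A new sub-flow starts whenever the gap between two consecutive
    packets exceeds `idle_threshold` (an assumed value).
    """
    if not timestamps:
        return []
    pieces = [[timestamps[0]]]
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > idle_threshold:
            pieces.append([cur])    # idle period: start a new elephant
        else:
            pieces[-1].append(cur)  # same transmission period
    return pieces

pieces = split_on_idle([0.0, 0.1, 0.3, 4.0, 4.2, 9.5])
```

Each resulting piece would then be treated as a distinct elephant when estimating volume and duration distributions.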
A theoretical approximate model has been proposed. Indeed, the mice processes are volumes in a queue where customers are characterized by their arrival times, their service durations and their profiles. Moreover, the arrival rate is large, so the processes can be approximated by a fluid approximation which leads to a Gaussian process. For the elephants, it is worth noticing that the queue is a good model, certainly due to congestion on the access links. For the theoretical model, we have explicit expressions for the Laplace transform of the stationary rate and the stationary autocorrelation (and even the transient ones). For specific distributions of the transmission duration of the flows, asymptotics for the autocorrelation have been obtained. This can be deduced from a general result by Borovkov and Iglehart. In the case of Poisson arrivals, an elementary proof in terms of martingales of the heavy traffic limit in a queue has been derived.
This work has been presented in an INRIA report and submitted to ICC.
Interaction of TCP flows
The integration of two types of flows sharing a channel is an important issue in the design of communication networks. In the context of IP networks, it applies to best effort traffic (TCP) and streaming traffic (UDP). It is also relevant to describe the interaction of short TCP flows and large TCP flows (see the section above).
The basic model analyzed here consists of a bottleneck link with variable capacity receiving TCP connections. The varying capacity is due to the UDP traffic or the mice traffic, depending on the model considered. The general idea is that, contrary to "greedy" traffic like UDP traffic or mice traffic, the large TCP flows adapt their throughputs to the state of the link.
The varying capacity is driven by a stationary Gaussian process, an Ornstein-Uhlenbeck process. This assumption is natural in view of the traffic measurements (see the section above) and also from a mathematical point of view, since it is known that the superposition of many small connections converges to such processes.
Up to now, the problem of expressing the invariant distribution of these systems is largely unsolved and known to be very difficult. Some earlier work by Nunez-Queija solves the problem in the context of a varying capacity driven by a Markov Modulated Point Process (MMPP). The results obtained are expressed in terms of complicated matrix expressions involving the (numerous) parameters of the MMPP and are not easy to use in practice. This is also the reason why an Ornstein-Uhlenbeck process has been chosen: it is quite simple since it has only two parameters, the mean and the variance.
It has been chosen to study the case where the varying capacity oscillates around a fixed value. Perturbation methods have been used to describe the stationary behavior of such a system, following two approaches. The first one, which is analytic, consists in expanding the solution of the PDE associated with the dynamics of the system with respect to the perturbation parameter. As a result, a reduced load approximation has been proved when the capacity varies linearly with respect to the Gaussian process.
The other approach is probabilistic: it consists in analyzing the effect of the perturbation on one cycle of the Markov process. Under quite general assumptions, careful calculations lead to the second-order expansion of the stationary queue length. Our work now focuses on a similar expansion for the stationary sojourn time process.
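To make the driving process concrete, here is a minimal simulation sketch of a stationary Ornstein-Uhlenbeck process using its exact Gaussian transition over a time step. The function name `simulate_ou`, the numerical values and the relaxation rate `theta` (which sets the time scale of the correlations and is needed to simulate a path) are assumptions introduced for illustration, not parameters taken from the study.

```python
import math
import random

def simulate_ou(mean, variance, theta, dt, n_steps, seed=0):
    """Sample a stationary Ornstein-Uhlenbeck path on a regular grid.

    Uses the exact transition of the process over a step `dt`:
    X(t+dt) = mean + (X(t) - mean) * exp(-theta*dt) + Gaussian noise,
    with noise variance `variance * (1 - exp(-2*theta*dt))`.
    `theta` is a hypothetical relaxation rate.
    """
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    noise_sd = math.sqrt(variance * (1.0 - decay * decay))
    x = rng.gauss(mean, math.sqrt(variance))  # start in the stationary law
    path = [x]
    for _ in range(n_steps):
        x = mean + (x - mean) * decay + noise_sd * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

# Illustrative values only: capacity fluctuating around 10 with variance 4.
path = simulate_ou(mean=10.0, variance=4.0, theta=0.5, dt=0.1, n_steps=20000)
sample_mean = sum(path) / len(path)
```

Because the update uses the exact transition rather than an Euler step, the sampled values keep the prescribed mean and variance for any step size `dt`.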