Section: Scientific Foundations
High-speed network traffic metrology and statistical analysis
Tools for measuring the end-to-end performance of a path between two hosts are essential for optimizing transport protocols and distributed applications. Bandwidth evaluation methods aim to provide a realistic view not only of the raw capacity but also of the dynamic behavior of the interconnection, which can be very useful for estimating the duration of a bulk data transfer. Existing methods differ in their measurement strategy and in the metric they evaluate. They can be active or passive, intrusive or non-intrusive. Non-intrusive active approaches, based on packet trains or packet pairs, provide measurements of the available bandwidth and/or the total capacity. None of the proposed tools based on these methods enables the evaluation of both metrics while also giving an overview of the link topology and characteristics.
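As an illustration of the packet-pair principle mentioned above, the following minimal Python sketch estimates a bottleneck capacity from the dispersion (inter-arrival gap at the receiver) of back-to-back packets of known size, using capacity = packet size / dispersion. The function name, its parameters and the use of the median to damp cross-traffic noise are assumptions made for this example; they do not describe any particular tool.

```python
from statistics import median


def packet_pair_capacity(packet_size_bytes, dispersions_s):
    """Estimate bottleneck capacity in bits per second.

    packet_size_bytes: size of each probe packet (hypothetical parameter).
    dispersions_s: receiver-side gaps, in seconds, between the two packets
                   of each probe pair.
    """
    # Discard non-positive gaps, which can only come from timestamping errors.
    valid = [d for d in dispersions_s if d > 0]
    if not valid:
        raise ValueError("need at least one positive dispersion")
    # Assumption: the median gap is taken as representative of the bottleneck;
    # cross traffic can stretch or compress individual gaps.
    return packet_size_bytes * 8 / median(valid)


# Usage: 1500-byte probes whose typical gap is 120 microseconds.
estimate = packet_pair_capacity(1500, [0.00012, 0.00012, 0.00013])
print(f"estimated capacity: {estimate / 1e6:.1f} Mbit/s")
```

This only addresses the total capacity; estimating the available bandwidth typically requires longer packet trains and a different analysis.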
This is why we deemed it important to embed a metrology activity, including data processing, statistical inference, and the analysis of time series and stochastic processes, in the core research of RESO. Our goal is for these analyses to become, in the near future, a standard component not only of the study and development of infrastructures and computing networks, but also of real-time resource identification and management.
The specificities of grids, such as the number and heterogeneity of the cooperating equipment, the number of independent processes, and the processing, bandwidth and storage capacities, make it indispensable to revisit the algorithms, as well as the control and operating mechanisms, in order to reach appropriate and optimal performance.
To validate the a priori hypotheses that underlie the approaches already investigated (e.g. overlays, virtualization of network resources, distribution of network processing, middleware programming), we rely on metrology and on the statistical analysis of the collected data. Indeed, we believe that the automatic identification of the static and dynamic properties of network resources is a prerequisite for developing adequate, adaptive and self-reconfigurable solutions.
We ground our approach on our large-scale, fully controllable and configurable experimental facility (Grid5000 + MetroFlux) to validate, better understand and extend earlier results that were either heuristically observed or theoretically derived. Conversely, we perform realistic experiments, under prescribed and reproducible conditions, to gain new insights into the statistical specificities of Internet traffic and to precisely identify the role of the network parameters.
The difficulty lies in obtaining reliable classifiers and estimators of statistical properties from non-stationary and possibly incomplete traces. We address these issues by proposing signal processing techniques tailored to network traffic measurements. To go beyond the statistical description of Internet traffic, we aim to investigate its effects on networking equipment such as routers and switches. An empirical study, involving the development of a “realistic” traffic generator along with comprehensive sets of experimental measurements, should allow us to achieve this goal. RESO also aims to carry out complementary analytical studies, based on the definition of theoretical models, so as to gain new insights into the performance measures that result from applying “Internet-like” traffic to classical queueing systems. To tackle these issues, RESO gains, with the arrival of T. Begin in September 2009, new technical skills in modeling, queueing theory and performance evaluation.
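To make the idea of feeding traffic into a classical queueing system concrete, here is a minimal Python sketch of a single-server FIFO queue driven by a trace, using the Lindley recursion W(n+1) = max(0, W(n) + S(n) - A(n)), where S(n) is the service time of packet n and A(n) the gap between the arrivals of packets n and n+1. The function name and the toy trace are assumptions made for this example; it is not the model studied by the team, only an instance of the kind of performance measure (waiting time) such studies consider.

```python
def lindley_waiting_times(interarrivals, service_times):
    """Waiting time of each packet in a single-server FIFO queue.

    interarrivals[k]: time between the arrivals of packets k and k+1.
    service_times[k]: time needed to serve packet k
                      (e.g. packet size in bits / link rate in bit/s).
    """
    if len(interarrivals) != len(service_times) - 1:
        raise ValueError("need exactly one gap between consecutive packets")
    waits = [0.0]  # the first packet finds an empty queue
    for gap, service in zip(interarrivals, service_times):
        # Lindley recursion: leftover work seen by the next arrival.
        waits.append(max(0.0, waits[-1] + service - gap))
    return waits


# Usage with a toy trace: packets arrive faster than they are served,
# so the waiting time grows from one packet to the next.
print(lindley_waiting_times([1.0, 1.0], [2.0, 2.0, 2.0]))
```

Replacing the toy trace with measured arrival times and packet sizes gives a trace-driven evaluation, while drawing them from a stochastic model gives a simulation of the corresponding analytical queue.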
Finally, the substantial investment granted to Grid5000 (and to the Grid5000-Osaka interconnections) will be put to good use, providing us with a high-performance, heterogeneous and quite novel experimental setup to confront the proposed theoretical models with real traffic measurements.