Section: New Results
Improving QoS in Large-scale Distributed Data Storage Services
Participants : Bogdan Nicolae, Jesús Montes, Gabriel Antoniu.
The ability to sustain a stable, high throughput for data access is a highly desirable property of large-scale distributed storage systems, as it strongly affects the quality of service offered by the storage system and thereby the overall performance of the applications running on top of it. Handling quality of service in a large-scale distributed system is, however, a very difficult task, because a very large number of factors are involved: the data access patterns, the status of a huge number of physical components, etc. As a result, conventional profiling and analysis are of little use.
We proposed an offline analysis approach to improve the quality of service in distributed storage systems based on global behavior modeling combined with client-side quality of service feedback. It automates the process of identifying dangerous behavior patterns in storage services, which makes reasoning about potential improvements much easier.
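The general idea of combining global behavior states with client-side feedback can be illustrated with a minimal sketch. It is not the method used in this work: the tiny k-means routine is only a stand-in for the behavior modeling step, and the metric names, the sample values, the function names (`kmeans`, `dangerous_states`) and the `threshold` parameter are all hypothetical. The sketch groups monitoring snapshots into global states, then flags the states in which clients observed a throughput well below the overall average.

```python
from statistics import mean


def kmeans(points, k, iterations=20):
    """Tiny deterministic k-means (assumes len(points) >= k).

    Seeds are evenly spaced samples of the input, so no randomness is involved.
    """
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each snapshot to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels


def dangerous_states(snapshots, client_throughput, k=2, threshold=0.8):
    """Flag global states whose mean client-side throughput is below
    `threshold` times the overall mean (hypothetical criterion)."""
    labels = kmeans(snapshots, k)
    overall = mean(client_throughput)
    per_state = {}
    for label, tput in zip(labels, client_throughput):
        per_state.setdefault(label, []).append(tput)
    flagged = {s for s, values in per_state.items() if mean(values) < threshold * overall}
    return labels, flagged


# Hypothetical data: one (normalized provider load, normalized writer count)
# snapshot per time window, with the throughput clients reported in that window.
snapshots = [(0.2, 0.1), (0.25, 0.15), (0.3, 0.1),
             (0.9, 0.8), (0.85, 0.9), (0.95, 0.85)]
throughput = [100, 98, 102, 40, 45, 42]

labels, flagged = dangerous_states(snapshots, throughput)
print(labels, flagged)
```

On this toy input, the three heavily loaded windows fall into one state, and that state is the only one flagged. In practice the states flagged in this way would be the starting point for reasoning about which service behavior to avoid.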
We demonstrated our approach by applying GloBeM, a global behavior modeling technique based on monitoring data analysis and machine learning, to improve the quality of service in BlobSeer. We evaluated the improvement through extensive experiments on the Grid'5000 testbed under demanding conditions: highly concurrent data access patterns sustained over long periods of service uptime, while tolerating failures of the physical storage components.
Our results show a substantial improvement: the service sustains a higher and more stable data access throughput. These results have been submitted for publication.