Team AxIS

Section: New Results

Mining Sequential Patterns in Data Streams and Managing their History

Participants: Alice Marascu, Florent Masseglia, Yves Lechevallier.

Mining data streams is an important challenge nowadays, due to i) the demanding characteristics of such streams and ii) the growing number of systems that produce them. Analyzing and managing the history of these streams requires some degree of approximation. Marascu's thesis (partially funded by the PACA Region) was devoted to such problems, with numerous publications in 2009 [32], [41], [31], [42], [34] (cf. section 6.1).

More precisely, this year Marascu's thesis focused on designing a fast approximation strategy for summarizing a set of streaming time series.

Summarizing a set of streaming time series is an important issue: it allows information to be monitored and stored reliably in domains such as finance, frequent pattern extraction and networks. To date, most existing algorithms have addressed this problem by summarizing each time series separately and by allocating the same amount of memory to each one. Yet memory management is an important subject in the data stream field, and a framework that allocates an equal amount of memory to each sequence is not appropriate.

In fact, most existing solutions to the problems raised by data stream history management consider that old events are less interesting and that recent events should be kept at a fine granularity. The main idea of these methods is that, as with human memory, we are generally more interested in recent events than in older ones. Yet human memory actually keeps a detailed record of “salient” events and only a careless record of “unimportant” ones (e.g. an accident versus a breakfast). Based on this principle, we have proposed a novel approach in which the most important events are kept with high fidelity, while the less important events are kept at a coarser resolution. For this purpose we have presented GEAR (Global-Error Aware Representation), a fast on-line approximation algorithm.

Solving this problem calls for an approximation technique that can quickly merge the lines representing a set of values. Marascu's Ph.D. thesis report [19] describes GEAR and MiTH (Middle THrough), a new model for merging two lines. In French publications, these are called REGLO (Regression attentive à l'Erreur GLObale) and AMi (Approximation par les MIlieux).
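The idea of sharing memory across several series according to a global error, rather than giving each series an equal share, can be illustrated with a small sketch. This is not GEAR or MiTH, whose details are given in [19]; it is a generic bottom-up piecewise-linear approximation, written under the assumption that each segment is a least-squares line and that the pair of adjacent segments whose merge adds the least squared error, over all series, is merged first. The names `fit_error`, `summarize` and `budget` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (start, end) indices, inclusive, into one series


def fit_error(values: List[float], start: int, end: int) -> float:
    """Sum of squared residuals of a least-squares line over values[start..end]."""
    n = end - start + 1
    if n <= 2:
        return 0.0  # one or two points are fitted exactly by a line
    xs = range(start, end + 1)
    ys = values[start:end + 1]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def summarize(series: List[List[float]], budget: int) -> List[List[Segment]]:
    """Merge adjacent segments across ALL series until `budget` segments remain."""
    # Start from the finest representation: segments of (at most) two points.
    segs = [[(i, min(i + 1, len(s) - 1)) for i in range(0, len(s), 2)]
            for s in series]
    while sum(len(group) for group in segs) > budget:
        best = None  # (error increase, series index, segment index)
        for k, group in enumerate(segs):
            for j in range(len(group) - 1):
                a, b = group[j], group[j + 1]
                cost = (fit_error(series[k], a[0], b[1])
                        - fit_error(series[k], *a)
                        - fit_error(series[k], *b))
                if best is None or cost < best[0]:
                    best = (cost, k, j)
        if best is None:
            break  # every series is already a single segment
        _, k, j = best
        # Replace the two adjacent segments by one segment covering both.
        segs[k][j:j + 2] = [(segs[k][j][0], segs[k][j + 1][1])]
    return segs
```

With a flat series and a strongly oscillating one sharing a single budget, this sketch spends almost all of the segments on the oscillating series, which is the behaviour an equal per-series allocation cannot provide. The exhaustive search for the cheapest merge is kept for readability; a fast on-line method would need a more efficient strategy.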

This work has been accepted as a long paper at an international conference (ACM SAC 2010) and at a national conference (EGC 2010).

Ms A. Marascu defended her Ph.D. thesis [19] (University of Nice Sophia Antipolis) in September 2009 at Inria.

