Section: New Results
Keywords : Sequential Patterns, Sequence Database, Trends, Gradual Rule.
Gradual dependencies and temporal tendencies in sequential patterns
Participants : Celine Fiot, Florent Masseglia.
This work was done in collaboration with A. Laurent and M. Teisseire (LIRMM).
Temporal data can be handled in many ways for discovering specific knowledge. Sequential pattern mining is one of these relevant approaches when dealing with temporally annotated data. It allows discovering frequent sequences embedded in the records. In the access data of a commercial Web site, one may, for instance, discover that “5% of the users request the page register.php 3 times and then request the page help.html”. However, symbolic or fuzzy sequential patterns, in their current form, do not allow extracting:
Gradual dependencies among objects in a sequential pattern.
Temporal tendencies that are typical of sequential data.
In  , we have proposed Grasp, an algorithm intended to discover gradual trends in sequences. Considering mail server breakdowns, for instance, a gradual sequential pattern could be: the more the number of received e-mails is “high” and the more the average size of received e-mails is also “high” at time t, the higher the number of time delivery errors becomes later. First, the database is converted into a membership degree database, as in fuzzy sequential pattern mining, using predefined fuzzy sets that are designed either automatically or from expert knowledge. Then, this membership degree database is converted into a time-related variation degree database. This last dataset is the one mined for gradual sequential patterns.
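The two conversion steps described above can be illustrated with a minimal sketch. It is not the Grasp implementation: the linear ramp used for the fuzzy set “high”, the definition of a variation degree as the difference between consecutive membership degrees, and all function names, attribute names and bounds are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]


def membership_high(value: float, low: float, high: float) -> float:
    """Assumed fuzzy set 'high': 0 at or below `low`, 1 at or above `high`,
    linear in between."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def to_membership(sequence: List[Record],
                  bounds: Dict[str, Tuple[float, float]]) -> List[Record]:
    """Step 1: replace each numerical value by its membership degree."""
    return [
        {attr: membership_high(value, *bounds[attr])
         for attr, value in record.items()}
        for record in sequence
    ]


def to_variation(memberships: List[Record]) -> List[Record]:
    """Step 2 (assumed definition): variation degree between two consecutive
    timestamps, taken as the difference of membership degrees."""
    return [
        {attr: later[attr] - earlier[attr] for attr in earlier}
        for earlier, later in zip(memberships, memberships[1:])
    ]


# Hypothetical mail-server sequence: one record per timestamp.
sequence = [
    {"nb_mails": 100, "avg_size": 10},
    {"nb_mails": 300, "avg_size": 20},
    {"nb_mails": 500, "avg_size": 30},
]
bounds = {"nb_mails": (100, 500), "avg_size": (10, 30)}

memberships = to_membership(sequence, bounds)
variations = to_variation(memberships)
```

In this sketch a sequence of n records yields n - 1 variation records; a gradual pattern miner would then work on `variations` rather than on the raw values.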
This work allows discovering a new kind of knowledge, relying on the relationships between levels (or quantities) of objects in the corresponding patterns (the more A is high, the more B is low). However, such patterns do not allow discovering co-evolution (i.e. relationships between fluctuations in these levels or quantities).
Therefore, in  ,  we have proposed Ted and Eva, a pair of methods designed for evolution pattern mining. Such a pattern could be, for instance: an increasing number of requests to registration.php during a short period precedes an increasing number of requests to faq.html, after a very short period. This knowledge would be explicit for the end-user (is the registration form easy to fill in?). However, modeling such temporal knowledge requires handling a very large number of elements during the mining task, both in terms of attributes and in terms of records. Searching for evolution patterns indeed requires comparing each record of a data sequence to the following ones, which leads to a combinatorial space complexity. Ted converts a numerical database into a trend database, which describes the evolution over time of numerical attribute values for several objects. These evolutions are represented as trend sequences. Eva then searches for frequent evolution sequences in this trend sequence dataset.
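The idea of turning numerical sequences into trend sequences and then measuring how often an evolution pattern occurs can be sketched as follows. This is a simplified illustration, not the Ted or Eva algorithms: it only compares consecutive records (whereas the text above mentions comparing each record to all following ones), it encodes trends with the assumed symbols "+", "-" and "=", it ignores period durations, and all function names and data are hypothetical.

```python
from typing import Dict, List, Set, Tuple

TrendItem = Tuple[str, str]  # (attribute, trend symbol)


def trend(previous: float, current: float, eps: float = 0.0) -> str:
    """Assumed encoding: '+' for increase, '-' for decrease, '=' otherwise."""
    delta = current - previous
    if delta > eps:
        return "+"
    if delta < -eps:
        return "-"
    return "="


def to_trend_sequence(series: Dict[str, List[float]],
                      eps: float = 0.0) -> List[Set[TrendItem]]:
    """Convert one object's numerical series into a sequence of trend itemsets,
    one itemset per pair of consecutive records."""
    length = min(len(values) for values in series.values())
    return [
        {(attr, trend(values[t - 1], values[t], eps))
         for attr, values in series.items()}
        for t in range(1, length)
    ]


def supports(steps: List[Set[TrendItem]], pattern: List[TrendItem]) -> bool:
    """True if the pattern items occur in order, each in a strictly later step
    than the previous one (the shared iterator is consumed left to right)."""
    remaining = iter(steps)
    return all(any(item in step for step in remaining) for item in pattern)


def support(database: List[Dict[str, List[float]]],
            pattern: List[TrendItem], eps: float = 0.0) -> float:
    """Fraction of objects whose trend sequence contains the pattern."""
    hits = sum(supports(to_trend_sequence(s, eps), pattern) for s in database)
    return hits / len(database)


# Hypothetical request counts per period, one dictionary per object.
database = [
    {"registration.php": [1, 5, 5, 5], "faq.html": [2, 2, 6, 6]},
    {"registration.php": [3, 3, 3], "faq.html": [1, 4, 4]},
]
pattern = [("registration.php", "+"), ("faq.html", "+")]
ratio = support(database, pattern)
```

Here only the first object shows an increase on registration.php followed later by an increase on faq.html, so the pattern is supported by half of the objects. A frequent evolution pattern miner would enumerate candidate patterns and keep those whose support exceeds a user-defined threshold.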