Section: New Results
Analysis of Evolving Web Usage Data
Organizations increasingly rely on the Internet. The growing number of traces left behind by user transactions (e.g. customer purchases, user sessions) increases the importance of usage data analysis. The way in which a web site is visited can change over time; consequently, usage models must be continuously updated in order to reflect the current behaviour of the visitors. This task remains difficult when the temporal dimension is ignored or simply introduced into the data description as a numeric attribute.
The thesis of Ms A. Da Silva focuses precisely on this challenge. To address the difficulty of acquiring real usage data, we propose a methodology for the automatic generation of artificial usage data in which the occurrence of changes can be controlled, so that the efficiency of a change detection system can be analysed. Guided by leads from exploratory analyses, we propose a tilted window approach for detecting and tracking changes in evolving usage data. To measure the level of change, this approach applies two external evaluation indices based on the clustering extension. The approach also characterizes the changes undergone by the usage groups (e.g. appearance, disappearance, fusion and split) at each timestamp. The proposed approach is independent of the clustering method used and can handle other kinds of data in addition to Web usage data. The effectiveness of the approach is evaluated on artificial data sets of different degrees of complexity and on real data sets from different domains (academic, tourism and marketing).
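As an illustration only, the comparison of two successive partitions can be sketched as follows. This is a minimal sketch, not the method of the thesis: the Rand index stands in for the two external indices, which are not named here, and the overlap rule, the 0.5 threshold and the function names are hypothetical. It assumes that the same items are labelled under both partitions, for example the items of the current window assigned once by the previous model and once by a fresh clustering.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitions agree.

    Assumption: the Rand index is used here as a generic external index;
    the source does not say which indices the approach applies.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    # A pair agrees when it is grouped together in both partitions,
    # or separated in both partitions.
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def characterize_changes(prev, curr, threshold=0.5):
    """Label cluster-level changes between two windows.

    prev, curr: dict mapping cluster id -> set of item ids.
    Hypothetical rule: a previous and a current cluster are matched when
    their overlap covers at least `threshold` of the smaller of the two.
    """
    matches_prev = {p: [] for p in prev}
    matches_curr = {c: [] for c in curr}
    for p, p_items in prev.items():
        for c, c_items in curr.items():
            smaller = min(len(p_items), len(c_items))
            if smaller and len(p_items & c_items) / smaller >= threshold:
                matches_prev[p].append(c)
                matches_curr[c].append(p)
    return {
        "disappearance": sorted(p for p, m in matches_prev.items() if not m),
        "appearance": sorted(c for c, m in matches_curr.items() if not m),
        "split": sorted(p for p, m in matches_prev.items() if len(m) > 1),
        "fusion": sorted(c for c, m in matches_curr.items() if len(m) > 1),
    }


if __name__ == "__main__":
    previous = {"A": {1, 2, 3, 4}, "B": {5, 6}}
    current = {"X": {1, 2}, "Y": {3, 4}, "Z": {7, 8}}
    print(characterize_changes(previous, current))
    print(rand_index([0, 0, 0, 1, 1, 1], [0, 0, 1, 2, 2, 2]))
```

Because both functions take only cluster assignments as input, the sketch does not depend on how the clusters were produced, which mirrors the independence from the clustering method stated above.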
Ms A. Da Silva defended her PhD in September 2009 at the University of Paris IX Dauphine. All the results obtained from this work are described in her PhD thesis.