Team AxIS


Section: Software

Keywords: Web usage mining, preprocessing, HTTP logs.


Participants: Sergiu Chelcea, Doru Tanasa [co-correspondent], Christophe Mangeat, Brigitte Trousse [co-correspondent].

AxISLogMiner Preprocessing is a software application that implements our preprocessing methodology for Web Usage Mining [111]. We implemented the application in Java, which offered benefits both in available functionality and in ease of implementation. The application uses Perl modules for the operations carried out on the log files: joining log files, cleaning the log, filtering robot requests, and identifying sessions, visits and episodes. The preprocessed log is stored in our relational model through JDBC. The result of this preprocessing is then used by a data mining tool to extract, for instance, sequential patterns, that is, sequences of Web pages frequently requested by users. Recently, we extended this software so that it records the keywords that users entered in search engines to find the pages they browsed.
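
As a rough illustration of the cleaning, robot filtering and visit identification steps named above, the sketch below works on requests that have already been parsed and sorted by time. It is not the AxISLogMiner code (whose log operations are written as Perl modules): the class and method names, the list of resource suffixes, the user-agent heuristic and the 30-minute inactivity threshold are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: every name and threshold here is an assumption,
// not part of AxISLogMiner Preprocessing.
class LogPreprocessingSketch {

    /** One already-parsed log request (hypothetical layout). */
    static final class Request {
        final String host;
        final String userAgent;
        final long epochSeconds;
        final String url;

        Request(String host, String userAgent, long epochSeconds, String url) {
            this.host = host;
            this.userAgent = userAgent;
            this.epochSeconds = epochSeconds;
            this.url = url;
        }
    }

    // Assumed threshold: more than 30 minutes of inactivity starts a new visit.
    static final long VISIT_TIMEOUT_SECONDS = 30 * 60;

    // Assumed suffixes of embedded resources dropped during log cleaning.
    static final String[] RESOURCE_SUFFIXES = {".gif", ".jpg", ".png", ".css", ".js"};

    // A user is approximated here by the (host, user agent) pair.
    static String userKey(Request r) {
        return r.host + "|" + r.userAgent;
    }

    static boolean isResource(Request r) {
        String url = r.url.toLowerCase(Locale.ROOT);
        for (String suffix : RESOURCE_SUFFIXES) {
            if (url.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    // Heuristic: a user is a robot if it fetched /robots.txt or has a robot-like agent string.
    static Set<String> findRobots(List<Request> requests) {
        Set<String> robots = new HashSet<>();
        for (Request r : requests) {
            String ua = r.userAgent.toLowerCase(Locale.ROOT);
            if (r.url.equals("/robots.txt") || ua.contains("bot")
                    || ua.contains("crawler") || ua.contains("spider")) {
                robots.add(userKey(r));
            }
        }
        return robots;
    }

    /** Cleans the log, drops robot traffic, and groups the rest into visits per user. */
    static List<List<Request>> identifyVisits(List<Request> sortedRequests) {
        Set<String> robots = findRobots(sortedRequests);
        Map<String, List<List<Request>>> visitsByUser = new LinkedHashMap<>();
        for (Request r : sortedRequests) {
            if (isResource(r) || robots.contains(userKey(r))) {
                continue;
            }
            List<List<Request>> visits =
                    visitsByUser.computeIfAbsent(userKey(r), k -> new ArrayList<>());
            boolean startNew = visits.isEmpty();
            if (!startNew) {
                List<Request> last = visits.get(visits.size() - 1);
                long gap = r.epochSeconds - last.get(last.size() - 1).epochSeconds;
                startNew = gap > VISIT_TIMEOUT_SECONDS;
            }
            if (startNew) {
                visits.add(new ArrayList<>());
            }
            visits.get(visits.size() - 1).add(r);
        }
        List<List<Request>> all = new ArrayList<>();
        for (List<List<Request>> userVisits : visitsByUser.values()) {
            all.addAll(userVisits);
        }
        return all;
    }
}
```

Each returned visit is an ordered list of page requests, which is the kind of input a sequential pattern miner expects. Storing the visits in a relational schema through JDBC is left out of the sketch.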

The keywords extracted from the HTTP referrer field can thus be associated with the Web pages and used to build a dissimilarity matrix for those pages. This in turn makes it possible to extract clusters of pages that are similar in content as seen through the users' views (their keywords). The results of such clustering were used in the experiments conducted in the GWUM work [59] (cf. section 6.4.3).
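
To make the keyword step more concrete, the sketch below extracts search keywords from a referrer URL and derives a page-by-page dissimilarity matrix from the keyword sets. It rests on assumptions that the text above does not state: the search engine is assumed to carry the query in a `q` parameter, and the Jaccard distance is used as a stand-in dissimilarity measure, since the measure actually used is not specified here. All class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only: the "q" parameter and the Jaccard measure are assumptions.
class ReferrerKeywordSketch {

    /** Returns the lower-cased keywords found in the assumed "q" parameter of a referrer URL. */
    static Set<String> extractKeywords(String referrer) {
        Set<String> keywords = new TreeSet<>();
        String query;
        try {
            query = URI.create(referrer).getRawQuery();
        } catch (IllegalArgumentException e) {
            return keywords; // malformed referrer: no keywords
        }
        if (query == null) {
            return keywords;
        }
        for (String pair : query.split("&")) {
            if (!pair.startsWith("q=")) {
                continue;
            }
            String decoded;
            try {
                // URLDecoder turns '+' into a space and resolves %xx escapes.
                decoded = URLDecoder.decode(pair.substring(2), "UTF-8");
            } catch (UnsupportedEncodingException | IllegalArgumentException e) {
                continue;
            }
            for (String word : decoded.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!word.isEmpty()) {
                    keywords.add(word);
                }
            }
        }
        return keywords;
    }

    /** Jaccard distance: 1 - |A ∩ B| / |A ∪ B|; defined as 1 when both sets are empty. */
    static double jaccardDistance(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return 1.0 - (double) intersection.size() / union.size();
    }

    /** Builds a symmetric matrix with a zero diagonal, in the iteration order of the map. */
    static double[][] dissimilarityMatrix(Map<String, Set<String>> keywordsByPage) {
        List<Set<String>> sets = new ArrayList<>(keywordsByPage.values());
        int n = sets.size();
        double[][] matrix = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double d = jaccardDistance(sets.get(i), sets.get(j));
                matrix[i][j] = d;
                matrix[j][i] = d;
            }
        }
        return matrix;
    }
}
```

A matrix of this shape can be passed to any clustering method that accepts pairwise dissimilarities. Two pages reached through largely the same search terms end up close to each other, regardless of how their text compares.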

