
Section: New Results

Semantic and Temporal Analysis of Online Communities

Participants: Catherine Faron Zucker, Fabien Gandon, Zide Meng.

This work is carried out as part of Zide Meng's PhD within the OCKTOPUS ANR project.

Data Formalization: We use the FOAF and SIOC schemas to formalize a dataset from the popular question-answer site StackOverflow in RDF. For terms that these vocabularies do not cover, we introduce a ugc schema, where ugc refers to user-generated content. Moreover, to enrich the dataset, we link each tag entity in our dataset to the corresponding entity in DBpedia. To solve the disambiguation problem, we compare the descriptions of the two entities using cosine distance.
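The linking step above can be illustrated with a minimal sketch, assuming that descriptions are compared as plain bag-of-words vectors and that the candidate with the highest cosine similarity (equivalently, the lowest cosine distance) is selected. The function names, the tokenization and the example descriptions are hypothetical and written for illustration only; they are not taken from the project or from actual DBpedia abstracts.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its tokens (assumed tokenization)."""
    return Counter(re.findall(r"[a-z0-9+#]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_candidate(tag_description, candidates):
    """Return the candidate URI whose description is closest to the tag's.

    `candidates` maps a candidate entity URI to its textual description.
    """
    tag_vector = bag_of_words(tag_description)
    return max(
        candidates,
        key=lambda uri: cosine_similarity(tag_vector, bag_of_words(candidates[uri])),
    )


# Illustrative data: the descriptions are made up for this sketch.
tag_description = "Python is a dynamically typed programming language"
candidates = {
    "http://dbpedia.org/resource/Python_(programming_language)":
        "Python is a high-level general-purpose programming language",
    "http://dbpedia.org/resource/Pythonidae":
        "Pythonidae is a family of nonvenomous snakes found in Africa, Asia and Australia",
}
linked_entity = best_candidate(tag_description, candidates)
```

In this sketch the tag is linked to the programming language entity, because its description shares more terms with the tag description than the snake family's does. A real pipeline would probably weight terms (for instance with TF-IDF) and filter stop words, which is left out here.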

Analysis: After formalizing the dataset, we started applying graph mining algorithms, such as community detection algorithms, to analyse it. We extract different kinds of graphs from the RDF dataset, such as the question-answer graph, the co-answer graph and the tag co-occurrence graph. We aim to find useful information, such as interest groups, experts and tag groups, in this kind of question-answer site. By studying the state of the art in community detection, we analyse the advantages and disadvantages of the different approaches, and then try to introduce an algorithm that outperforms them in this scenario.
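As a rough illustration of the graph extraction described above, the following sketch builds a weighted co-occurrence graph from groups of items: the tags of each question for the tag co-occurrence graph, or the users who answered each question for the co-answer graph. The grouping step is only a crude stand-in, namely connected components over edges above a weight threshold, and not one of the community detection algorithms studied in this work. All names and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_graph(groups):
    """Count how often two items appear in the same group.

    `groups` maps a question id to its items (tags, or answering users).
    Returns a Counter keyed by alphabetically sorted item pairs.
    """
    edges = Counter()
    for items in groups.values():
        for pair in combinations(sorted(set(items)), 2):
            edges[pair] += 1
    return edges


def connected_groups(edges, min_weight=1):
    """Group items linked by edges of weight >= min_weight.

    Connected components only: a simple placeholder for a real
    community detection algorithm.
    """
    adjacency = defaultdict(set)
    for (u, v), weight in edges.items():
        if weight >= min_weight:
            adjacency[u].add(v)
            adjacency[v].add(u)
    seen = set()
    groups = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Illustrative data, not taken from the StackOverflow dataset.
question_tags = {
    "q1": ["python", "pandas"],
    "q2": ["python", "pandas", "numpy"],
    "q3": ["java", "spring"],
    "q4": ["java", "spring"],
    "q5": ["python", "java"],
}
tag_edges = cooccurrence_graph(question_tags)
tag_groups = connected_groups(tag_edges, min_weight=2)
```

With this sample data and a threshold of 2, the tags fall into two groups, one around java and spring and one around pandas and python. The same `cooccurrence_graph` function gives a co-answer graph when each question id maps to the users who answered it.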

Plan: During our analysis, we identified several difficult problems that have not yet been well solved, such as question intent understanding and community evolution. We will combine semantic technologies with social network analysis to address these problems. In the future, we plan to develop an information management system for such datasets, using the analysis algorithms we introduce to improve the performance of information retrieval on user-generated content sites.