Section: New Results
Using Web Graph Structure for Person Name Disambiguation
Participants : Elena Smirnova, Brigitte Trousse.
This work takes place in the context of Elena Smirnova's Ph.D. thesis, supervised by B. Trousse (AxIS) and K. Avrachenkov (Maestro).
In the third edition of the WePS evaluation campaign (WePS-3), we addressed the person name disambiguation problem, posed as a clustering task. Our aim was to exploit the intrinsic link relationships among Web pages to resolve names in Web search results. To date, link structure has not been used for this purpose, although the Web graph can be a rich source of information about latent semantic similarity between pages. We hypothesize that pages referring to the same person should be linked through the Web graph structure, namely through topically related pages. Our clustering algorithm consists of two stages. In the first stage, we find topically related pages for each search result page using a graph-based random walk method, and then cluster the search result pages that share related pages. In the second stage, the Web pages are further clustered using a content-based clustering algorithm. The evaluation showed that this algorithm delivers competitive performance: in the official WePS-3 ranking, it took second place (by the F-0.5 measure) among 8 competitors, out of 27 submitted runs in total. This work was presented at WePS [38] and is also partially described in a research report on Monte-Carlo methods [54].
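As an illustration only, the first stage can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the implementation submitted to WePS-3: the random walk is approximated by a personalized PageRank (random walk with restart) computed by power iteration, whereas the cited report concerns Monte-Carlo methods; the toy graph, the page names, the `restart` and `top_k` parameters, and the rule "merge two result pages if they share at least one related page" are all hypothetical choices made for the example. The content-based second stage is not shown.

```python
from collections import defaultdict

def personalized_pagerank(graph, start, restart=0.15, iterations=50):
    """Random walk with restart from `start`, computed by power iteration.

    `graph` maps each page to the list of pages it links to; every link
    target must also appear as a key. (Assumed stand-in for the random
    walk method mentioned in the text.)
    """
    score = {n: 0.0 for n in graph}
    score[start] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in graph}
        nxt[start] += restart  # restart mass always returns to the start page
        for n, out in graph.items():
            mass = (1.0 - restart) * score[n]
            if not out:
                nxt[start] += mass  # dangling page: walker jumps back to start
                continue
            for m in out:
                nxt[m] += mass / len(out)
        score = nxt
    return score

def related_pages(graph, page, top_k=2):
    """Top-k pages reachable from `page`, ranked by random walk score."""
    score = personalized_pagerank(graph, page)
    reachable = [n for n in graph if n != page and score[n] > 0]
    reachable.sort(key=lambda n: (-score[n], n))
    return set(reachable[:top_k])

def cluster_by_shared_related(graph, results, top_k=2):
    """Group result pages that share at least one related page (union-find)."""
    related = {p: related_pages(graph, p, top_k) for p in results}
    parent = {p: p for p in results}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for i, p in enumerate(results):
        for q in results[i + 1:]:
            if related[p] & related[q]:
                parent[find(p)] = find(q)

    clusters = defaultdict(set)
    for p in results:
        clusters[find(p)].add(p)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical toy Web graph: two namesakes, each with two result pages
# that link to (and are linked from) one topical hub page.
web_graph = {
    "a1": ["hubA"], "a2": ["hubA"], "hubA": ["a1", "a2"],
    "b1": ["hubB"], "b2": ["hubB"], "hubB": ["b1", "b2"],
}
results = ["a1", "a2", "b1", "b2"]
clusters = cluster_by_shared_related(web_graph, results)
print(clusters)  # [['a1', 'a2'], ['b1', 'b2']]
```

In this toy example the result pages of each namesake reach the same hub page and are therefore grouped together, while pages of different namesakes share no related page and stay apart. A Monte-Carlo variant would replace the power iteration with a number of simulated walks from each result page and rank pages by visit counts.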