Keywords: entity ranking, INEX, XML document, named entities, linkrank, categories, topic difficulty.

Entity Ranking

Participants: Anne-Marie Vercoustre, Vladimir Naumovski.

The goal of entity ranking is to retrieve entities as answers to a query. The objective is no longer to tag the names of entities in documents (in batch mode) but rather to return a list of relevant entity names, possibly with a page or a description associated with each entity. We have developed a system for entity ranking in Wikipedia that addresses two specific tasks: one where the category of the expected entity answers is provided, and one where a few (two or three) examples of the expected entity answers are provided. In our approach, candidate pages are ranked by combining three scores: a linkrank score, a category score, and the initial search engine similarity score. The architecture of our system provides a general framework for evaluating entity ranking: individual modules can be replaced by more advanced ones, and alternative score functions or different combinations of them can be evaluated [35], [32].

This year we focused on evaluating how our approach can take topic difficulty into account. We show that knowledge of the predicted class of topic difficulty can be used to further improve entity ranking performance. To predict topic difficulty, we build a classifier that uses features extracted from an INEX topic definition to assign the topic to one of several experimentally pre-determined classes. This knowledge is then utilised to dynamically set the optimal values of the retrieval parameters of our entity ranking system. Our experiments suggest that topic difficulty prediction is a promising approach that could be exploited to improve the effectiveness of entity ranking [70]. The current system has been developed in the context of the INEX (Initiative for the Evaluation of XML Retrieval) track on Entity Ranking [27], [34].
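As an illustration of the ranking scheme described above, the following minimal sketch combines the three scores and selects the combination weights from a predicted topic difficulty class. The text does not specify how the scores are combined, so the linear combination, the weight names `alpha` and `beta`, the class labels `"easy"` and `"hard"`, and all numeric values are assumptions made for the example only; they are not the parameters of the actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Hypothetical retrieval parameters (names are illustrative)."""
    alpha: float  # weight of the linkrank score
    beta: float   # weight of the category score


# Assumed per-class parameter settings; in practice these would be
# tuned experimentally for each predicted difficulty class.
WEIGHTS_BY_DIFFICULTY = {
    "easy": Weights(alpha=0.1, beta=0.8),
    "hard": Weights(alpha=0.8, beta=0.1),
}
DEFAULT_WEIGHTS = Weights(alpha=0.2, beta=0.6)


def combined_score(linkrank, category, similarity, w):
    """Linear combination of the three scores (an assumed form)."""
    return (w.alpha * linkrank
            + w.beta * category
            + (1.0 - w.alpha - w.beta) * similarity)


def rank_candidates(candidates, difficulty):
    """Rank candidate pages by combined score, best first.

    `candidates` maps a page name to a (linkrank, category, similarity)
    tuple; `difficulty` is the class predicted for the topic.
    """
    w = WEIGHTS_BY_DIFFICULTY.get(difficulty, DEFAULT_WEIGHTS)
    scored = [
        (name, combined_score(lr, cat, sim, w))
        for name, (lr, cat, sim) in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    pages = {
        "Page A": (0.9, 0.1, 0.5),
        "Page B": (0.1, 0.9, 0.5),
    }
    for label in ("easy", "hard"):
        print(label, rank_candidates(pages, label))
```

In this sketch the predicted class only switches between fixed weight settings, so the same candidate set can be ordered differently depending on the topic; a classifier producing the `difficulty` label from topic features would sit upstream of `rank_candidates`.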


