## Project-Team : modbio

## Section: New Results

**Keywords:** *locally coherent discourse*.

## Computing locally coherent discourses (ATIPE)

**Participant:** Ernst Althaus.

One central problem in discourse generation and summarisation is to
structure the discourse in a way that maximises
*coherence*. Coherence is the property of a good human-authored
text that makes it easier to read and understand than a
randomly-ordered collection of sentences.
Several papers in the recent literature
have focused on defining *local* coherence, which evaluates the
quality of sentence-to-sentence transitions. Measures of
local coherence specify which *ordering* of the sentences makes
for the most coherent discourse, and can be based e.g. on Centering
Theory or on statistical models.
While formal models of local coherence have made substantial
progress over the past few years, the question of how to efficiently
*compute* an ordering of the sentences in a discourse that
maximises local coherence is still largely unsolved.

In [16], we present the first algorithm that computes optimal locally coherent discourses, and we establish the complexity of the discourse ordering problem. We first prove that the discourse ordering problem for local coherence measures is equivalent to the Travelling Salesman Problem (TSP). This equivalence implies that the problem is NP-hard and not approximable. Despite this negative result, we show that by applying modern algorithms for TSP, the discourse ordering problem can be solved efficiently enough for practical applications. We define a branch-and-cut algorithm based on linear programming, and evaluate it on discourse ordering problems based on the GNOME corpus and the BLLIP corpus. If the local coherence measure depends only on the adjacent pairs of sentences in the discourse, we can order discourses of up to 50 sentences in under a second. If the measure is allowed to depend on the left-hand context of the sentence pair, computation is often still efficient, but can become expensive.
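
To make the connection to TSP concrete, the following is a minimal sketch of the simplest case, in which the coherence measure depends only on adjacent sentence pairs. It is not the branch-and-cut algorithm of [16]: it uses the classical Held-Karp dynamic programme over subsets, which is exact but runs in time O(n² 2ⁿ) and is therefore only practical for small discourses. The function name `best_ordering` and the matrix `cost` are assumptions made for this illustration; `cost[i][j]` is taken to be the penalty for placing sentence `j` directly after sentence `i`, with lower values meaning a more coherent transition.

```python
from itertools import combinations


def best_ordering(cost):
    """Return (total_cost, order) minimising the summed cost of adjacent pairs.

    cost[i][j] is a hypothetical transition cost for placing sentence j
    directly after sentence i (an assumption of this sketch).
    """
    n = len(cost)
    if n == 0:
        return 0, []
    # best[(mask, last)] = (cheapest cost of ordering the sentences in `mask`
    # so that the ordering ends with `last`, predecessor of `last`)
    best = {(1 << i, i): (0, None) for i in range(n)}
    for size in range(2, n + 1):
        for subset in combinations(range(n), size):
            mask = sum(1 << i for i in subset)
            for last in subset:
                prev_mask = mask ^ (1 << last)
                best[(mask, last)] = min(
                    (best[(prev_mask, p)][0] + cost[p][last], p)
                    for p in subset
                    if p != last
                )
    full = (1 << n) - 1
    total, last = min((best[(full, j)][0], j) for j in range(n))
    # Follow the stored predecessors back to recover the ordering.
    order, mask = [], full
    while last is not None:
        order.append(last)
        prev = best[(mask, last)][1]
        mask ^= 1 << last
        last = prev
    order.reverse()
    return total, order


# Toy instance: the transitions 2 -> 0, 0 -> 3 and 3 -> 1 are cheap,
# every other transition is expensive.
cost = [[10] * 4 for _ in range(4)]
cost[2][0] = cost[0][3] = cost[3][1] = 1
print(best_ordering(cost))  # (3, [2, 0, 3, 1])
```

An ordering is a Hamiltonian path rather than a closed tour, so the sketch does not charge for returning to the first sentence. Measures that also depend on the left-hand context of a sentence pair do not fit this pairwise cost matrix and would need a richer state.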