Section: Scientific Foundations
Computational Linguistics, Computational Logic, Knowledge Representation:
The central research theme of LED is computational semantics (where ``semantics'' is broadly construed to cover various pragmatic, discourse-level and dialogue-level phenomena). Research within LED focuses in particular on the interplay between representation and inference. Put another way, the scientific foundations of LED's work boil down to the motto: computational linguistics meets computational logic and knowledge representation.
From computational linguistics we take the large linguistic and lexical semantics resources, the parsing and generation algorithms, and the insight that statistical information should be employed, whenever possible, to cope with ambiguity. From computational logic and knowledge representation we take the various languages and methodologies that have been developed for handling different forms of information (such as temporal information), and the computational tools (such as theorem provers, model builders, and model checkers) that have been devised for working with them. We also take two insights: that it is better, whenever possible, to work with inference tools that have been tuned for particular problems, and that as little computational energy as possible should be devoted to inference.
This picture is somewhat idealised. For example, for many languages (French among them), the large-scale linguistic resources that exist for English (lexicons, grammars, WordNet, FrameNet, PropBank, etc.) are not yet available. In addition, the syntax/semantics interface often cannot be taken for granted, and existing inference tools often need to be adapted to cope with the logics that arise in natural language applications (for example, existing provers for Description Logic, though excellent, do not cope with temporal reasoning). Thus we are not simply talking about bringing together known tools and investigating how they work once they are combined: often a great deal of research, background work and development is needed, and, as we will soon discuss, LED is actively involved in carrying out this background work. Nonetheless, the ideal of bringing together the best tools and ideas from computational linguistics, knowledge representation and computational logic, and putting them to work in coordination, remains the guiding principle.
Another simplification involved in the ``computational linguistics meets computational logic and knowledge representation'' motto is that the goal is often to find out when the use of computational logic can be avoided or minimised. Logical inference can be computationally expensive, and if simpler statistical methods can be used, or if only computationally tractable inference methods (such as model checking) are required, then it is highly desirable to turn to them. Empirically inspired heuristics are needed so that the tools of computational logic are applied only when truly needed, and only to the smallest problems possible.
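To make the notion of model checking concrete, here is a minimal sketch of what evaluating a first-order formula against a finite model can look like. Everything in it is an illustrative assumption rather than a description of LED's tools: the toy model, the tuple encoding of formulas, and the function name `holds` are all hypothetical.

```python
# Hypothetical toy model: a finite domain plus the extension of each predicate.
MODEL = {
    "domain": {"mia", "vincent", "fido"},
    "preds": {
        "woman": {("mia",)},
        "man": {("vincent",)},
        "dog": {("fido",)},
        "loves": {("vincent", "mia"), ("mia", "fido")},
    },
}


def holds(formula, model, env=None):
    """Return True if `formula` is satisfied in `model` under assignment `env`.

    Formulas are nested tuples (an assumed encoding), for example
    ("forall", "x", ("pred", "man", "x")).
    """
    env = env or {}
    op = formula[0]
    if op == "pred":
        _, name, *args = formula
        # Bound variables are looked up in env; anything else is a constant.
        values = tuple(env.get(arg, arg) for arg in args)
        return values in model["preds"][name]
    if op == "not":
        return not holds(formula[1], model, env)
    if op == "and":
        return holds(formula[1], model, env) and holds(formula[2], model, env)
    if op == "or":
        return holds(formula[1], model, env) or holds(formula[2], model, env)
    if op == "implies":
        return (not holds(formula[1], model, env)) or holds(formula[2], model, env)
    if op in ("exists", "forall"):
        _, var, body = formula
        # Quantifiers simply enumerate the finite domain.
        results = (holds(body, model, {**env, var: d}) for d in model["domain"])
        return any(results) if op == "exists" else all(results)
    raise ValueError(f"unknown operator: {op!r}")


# "Every man loves some woman."
EVERY_MAN_LOVES_A_WOMAN = (
    "forall", "x",
    ("implies",
     ("pred", "man", "x"),
     ("exists", "y",
      ("and", ("pred", "woman", "y"), ("pred", "loves", "x", "y")))),
)

# "Some dog loves mia."
SOME_DOG_LOVES_MIA = (
    "exists", "x",
    ("and", ("pred", "dog", "x"), ("pred", "loves", "x", "mia")),
)

print(holds(EVERY_MAN_LOVES_A_WOMAN, MODEL))  # True
print(holds(SOME_DOG_LOVES_MIA, MODEL))       # False
```

For a fixed formula, this naive evaluation visits at most |D|^k assignments, where |D| is the domain size and k the quantifier nesting depth, so it always terminates. Deciding whether a first-order formula is valid in all models is, by contrast, undecidable in general.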
To ensure that theoretically plausible ideas really are applicable, and to gain insight as to when empirically oriented methods can be usefully employed, LED focuses on concrete semantic phenomena (for example, tense and aspect, presupposition and anaphora resolution, and dialogue structure). By carefully examining the empirical data, we aim to determine which phenomena require inference and which do not; which can be dealt with using weak logics and which cannot; which can be handled statistically and which cannot; and what scales up successfully and what does not.
In the next sections we will discuss some examples of these ideas in detail.