Team Orpailleur


Section: Scientific Foundations

From KDD to KDDK

Knowledge discovery in databases (KDD) is a process for extracting knowledge units from large databases, units that can be interpreted and reused within knowledge-based systems.

From an operational point of view, the KDD process is performed within a KDD system including databases, data mining modules, and interfaces for interactions, e.g. editing and visualization. The KDD process is based on three main operations: selection and preparation of the data, data mining, and finally interpretation of the extracted units.
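The three operations above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the team's implementation: the function names, the record format, and the choice of naive frequent-itemset counting as the data mining step are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def select_and_prepare(raw_records, wanted_fields):
    """Operation 1 (hypothetical helper): keep the chosen fields, drop missing values."""
    prepared = []
    for record in raw_records:
        items = frozenset(
            f"{field}={record[field]}"
            for field in wanted_fields
            if record.get(field) is not None
        )
        if items:
            prepared.append(items)
    return prepared


def mine_frequent_itemsets(transactions, min_support, max_size=2):
    """Operation 2: naive counting of itemsets up to max_size (an assumed mining method)."""
    counts = Counter()
    for transaction in transactions:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(transaction), size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def interpret(frequent_itemsets, is_relevant):
    """Operation 3: keep only the units that the analyst's predicate accepts."""
    return {i: n for i, n in frequent_itemsets.items() if is_relevant(i)}


def run_kdd(raw_records, wanted_fields, min_support, is_relevant):
    """Chain the three operations in order."""
    transactions = select_and_prepare(raw_records, wanted_fields)
    candidates = mine_frequent_itemsets(transactions, min_support)
    return interpret(candidates, is_relevant)
```

Here the interpretation step is reduced to a predicate supplied by the caller; in an actual KDD system it would involve an analyst working through the editing and visualization interfaces.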

The KDDK process, as implemented in the research work of the Orpailleur team, is based on data mining methods that are either symbolic or numerical. The methods that are used in the Orpailleur team are the following:

The principle summarizing KDDK can then be read as follows [84]: going “from complex data units to complex knowledge units guided by domain knowledge” (KDDK) or “knowledge with/for knowledge”. Two original aspects can be underlined: (i) the KDD process is guided by domain knowledge, and (ii) the extracted units are embedded within a knowledge representation formalism so that they can be reused in a knowledge-based system for problem-solving purposes.

In the research work of the Orpailleur team, the various instantiations of the KDDK process are all based on the idea of classification. Classification is a polymorphic process involved in various tasks, e.g. modeling, mining, representing, and reasoning. Accordingly, a knowledge-based system may be designed, fed by the KDDK process, and used for problem-solving in application domains, e.g. agronomy, astronomy, biology, chemistry, and medicine, with a special mention for semantic web activities involving text mining, content-based document mining, and intelligent information retrieval [67], [68].

