Section: Scientific Foundations
Semantics and Design of Document-Based Information Systems
Designing and maintaining document-based information systems, such as Web sites, is a real challenge. On the Web, it is more common to find inconsistent pieces of information than a well-structured site. Our goal is to study and build tools that support the design, development and maintenance of complex but coherent sites. Our approach draws on Software Engineering and Artificial Intelligence techniques. There are strong similarities between structured documents (such as Web sites) and programs, which makes the Web a good candidate for experimenting with some of the technologies developed in software engineering.
Most of the effort in the Web domain has gone into languages for document presentation (HTML, CSS, XSL) and structure (XML), and into Web site modelling and Web services (UML). Very little has been done on the formal semantics of Web sites to support their quality and evolution. The Semantic Web initiative led by the W3C (XML, RDF, RDF Schema) and the work on ontologies pursue a different objective, related to resource discovery.
The term “semantics” has at least two meanings: a) the meaning of words and texts, and b) the study of propositions in a deductive theory.
To address the first meaning, we use taggers, thesauri and ontologies to add some semantics to plain text. However, we are especially interested in the second meaning, that is, in giving a formal semantics to Web sites.
We distinguish between the static aspects of a site and its dynamic aspects. The static aspects may involve a set of global constraints to be verified, which are not only syntactic but also semantic and context-dependent. The dynamic aspects formalize navigation through the Web site, which also needs to be specified and validated (cf. the execution of a program).
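The distinction between static and dynamic aspects can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `SITE` dictionary, the two global constraints (every link must resolve, every page must use the home page's language) and the reachability check standing in for navigation validation. It is not the formalism used in this work.

```python
from collections import deque

# Hypothetical toy site: page name -> metadata and outgoing links.
SITE = {
    "home":    {"title": "Home",     "lang": "en", "links": ["team", "reports"]},
    "team":    {"title": "Team",     "lang": "en", "links": ["home"]},
    "reports": {"title": "Reports",  "lang": "en", "links": ["home", "report-2003"]},
    "orphan":  {"title": "Old news", "lang": "fr", "links": ["home"]},
}


def static_violations(site):
    """Static aspect: check two assumed global constraints over the whole site."""
    errors = []
    expected_lang = site["home"]["lang"]  # context-dependent: set by the home page
    for name, page in site.items():
        for target in page["links"]:
            if target not in site:
                errors.append(f"{name}: broken link to {target}")
        if page["lang"] != expected_lang:
            errors.append(f"{name}: language {page['lang']} differs from {expected_lang}")
    return errors


def reachable(site, start="home"):
    """Dynamic aspect: pages a visitor can reach by following links from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in site[current]["links"]:
            if target in site and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def unreachable_pages(site, start="home"):
    """Pages that exist in the site but cannot be reached by navigation."""
    return sorted(set(site) - reachable(site, start))
```

On this toy site, `static_violations(SITE)` reports the broken link on `reports` and the language mismatch on `orphan`, while `unreachable_pages(SITE)` reports `orphan`, which no page links to.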
Our approach is related to the Semantic Web but differs from it. The main goal of the Semantic Web is to ease computer-based information retrieval by formalizing data that is mostly textual so that it can later be discovered. We are mostly concerned with the design and production of Web sites, taking into account their semantics, development and evolution. In this respect we are closer to what is called content management, and we would like to ensure that a particular Web site follows a predefined specification. We use approaches and techniques based on logic programming and on the formal semantics of programming languages, in particular operational semantics.
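To make the analogy with program execution concrete, here is a minimal Python sketch of a small-step operational semantics for navigation. The text does not say how such a semantics is formulated, so the configuration shape (current page plus history), the two transition rules (`follow` and `back`) and the `LINKS` table are all hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical link relation: page -> set of pages it links to.
LINKS = {
    "home": {"team", "reports"},
    "team": {"home"},
    "reports": {"home"},
}

# A configuration is (current page, history of previously visited pages).
Config = Tuple[str, Tuple[str, ...]]


def step(config: Config, action: tuple) -> Optional[Config]:
    """Apply one transition rule, or return None if no rule applies (stuck)."""
    page, history = config
    kind = action[0]
    if kind == "follow":
        target = action[1]
        # Rule FOLLOW: allowed only if the current page links to the target.
        if target in LINKS.get(page, set()):
            return (target, history + (page,))
        return None
    if kind == "back" and history:
        # Rule BACK: return to the most recently visited page.
        return (history[-1], history[:-1])
    return None


def run(config: Config, actions):
    """Run actions in order; return the last configuration and the stuck action, if any."""
    for action in actions:
        nxt = step(config, action)
        if nxt is None:
            return config, action
        config = nxt
    return config, None
```

For example, starting from `("home", ())`, following `team` and then going back returns to `("home", ())` with no stuck action. Trying to follow `reports` from `team` gets stuck, because `team` does not link to `reports`; a navigation specification could be validated by checking that no intended trace gets stuck in this way.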