Section: Scientific Foundations
Structure transformation is a specific domain that can be approached at different levels of abstraction with respect to programming specifications. The lowest level relies on general-purpose languages, such as Python or Java, combined with dedicated libraries and toolkits that implement a standard structure manipulation API, typically the DOM. At the opposite end are dedicated languages, such as XSLT, which abstract over data and control complexity through a tree-based data model and a powerful execution model.
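To make the lowest abstraction level concrete, the following minimal sketch renames elements through the DOM API, using Python's standard xml.dom.minidom module. The function name rename_elements and the sample document are illustrative assumptions, not part of any system described here.

```python
from xml.dom.minidom import parseString


def rename_elements(xml_text, old_name, new_name):
    """Return xml_text with every <old_name> element renamed to <new_name>.

    Hypothetical helper: it shows how much explicit control flow a
    DOM-level transformation needs, even for a trivial task.
    """
    doc = parseString(xml_text)
    # Copy the node list first, because the tree is modified while iterating.
    for old in list(doc.getElementsByTagName(old_name)):
        new = doc.createElement(new_name)
        # Copy attributes one by one.
        for i in range(old.attributes.length):
            attr = old.attributes.item(i)
            new.setAttribute(attr.name, attr.value)
        # Move the children; appendChild detaches each node from `old`.
        while old.firstChild is not None:
            new.appendChild(old.firstChild)
        old.parentNode.replaceChild(new, old)
    return doc.documentElement.toxml()


print(rename_elements('<doc><para id="a">x<b>y</b></para><para>z</para></doc>',
                      "para", "p"))
```

The traversal, attribute copying and node replacement are all written by hand here. A dedicated language such as XSLT would typically express the same renaming as a single template rule and leave the traversal to its execution model.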
Some properties are expected from specialized languages in order to help solve the most common problems: expressiveness, verifiability, efficiency, modularity, reusability, scalability, succinctness, correctness, etc. These properties are studied using the fundamental connections between language theory, mathematical logic, structured languages and query languages. Most of our theoretical work follows this approach.
The research published so far aims only at establishing new theoretical properties and complexity bounds. Our research differs in that we also seek efficient implementation techniques and concrete designs that can be directly applied to XML systems. We also consider some further properties to be of particular importance for XML structure transformations, namely:
Type checking: The types we consider are structural constraints over documents, expressed in formalisms such as DTD or XML Schema. Few techniques are able to exploit typing information about the input or output documents to provide type-safe transformations. In this domain, algorithmic advances have led to the creation of new research languages, such as XDuce, based on efficient containment checking for regular tree types. However, many challenges remain. While type-checking full XSLT or XQuery is undecidable (these languages are Turing-complete), one challenge is to push the "decidability envelope" further for type-checking standard XML transformations. Another challenge is to provide effective algorithms that are usable in practice for realistic scenarios.
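As an illustration of what a type as a structural constraint means, the sketch below checks one document against a DTD-like description in which each element name is associated with a regular expression over the names of its children. The SCHEMA table, its element names and the function conforms are hypothetical. The sketch validates a single document at run time, which is much weaker than statically type-checking a transformation or deciding containment between regular tree types.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical DTD-like type: element name -> regular expression over the
# sequence of child element names. Each name is followed by one space so
# that the expression operates on whole names.
SCHEMA = {
    "book": r"title (chapter )+",
    "chapter": r"title (para )*",
    "title": r"",
    "para": r"",
}


def conforms(elem, schema):
    """Return True if elem and all its descendants satisfy the schema."""
    model = schema.get(elem.tag)
    if model is None:
        return False  # undeclared element
    children = "".join(child.tag + " " for child in elem)
    if re.fullmatch(model, children) is None:
        return False  # child sequence violates the content model
    return all(conforms(child, schema) for child in elem)


good = ET.fromstring(
    "<book><title>T</title><chapter><title>C</title><para>p</para></chapter></book>"
)
bad = ET.fromstring("<book><chapter><title>C</title></chapter></book>")
print(conforms(good, SCHEMA), conforms(bad, SCHEMA))
```

Static type checking asks a harder question: whether every output that a transformation can produce from any valid input conforms to the output type.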
Efficiency: Transformation languages can benefit from static analysis wherever performance is a concern. Static analysis techniques usually rely on robust formal semantics to support the development of optimized compilers and runtimes.
Processing with restricted access policies: Some applications may require particular policies for accessing XML data that are incompatible with the current state of the art. For instance, many transformation languages assume that the whole structure to be transformed is available when the transformation process is run. In streaming applications, however, the input data flow may be very large or even infinite, and the transformation has to be performed on the fly, with bounded memory resources.
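The streaming constraint can be sketched with the incremental parser from Python's standard library. The function stream_texts and the sample data are illustrative assumptions. The sketch further assumes that the records of interest are direct children of the root element, so that clearing the root after each record keeps the in-memory tree small.

```python
import io
import xml.etree.ElementTree as ET


def stream_texts(source, tag):
    """Yield the text of each <tag> element as soon as it is fully parsed.

    Hypothetical helper; assumes <tag> elements are direct children of
    the root element.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem.text or ""
            # Drop the children parsed so far, so that memory use does not
            # grow with the length of the input.
            root.clear()


data = io.BytesIO(
    b"<log><entry>a</entry><entry>b</entry><other>c</other><entry>d</entry></log>"
)
for text in stream_texts(data, "entry"):
    print(text)
```

Each record is emitted before the rest of the input has been read, in contrast with the DOM approach, which builds the whole tree first. Transformations that need to look arbitrarily far ahead or behind in the document cannot be handled this simply.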