Section: New Results
Embedded data management
Participants: Nicolas Anciaux, Luc Bouganim, Philippe Pucheral, Shaoyi Yin.
In 2009, we pursued the work initiated on the definition of storage and indexing models dedicated to electronic stable storage technologies, and more precisely to NAND Flash. Such techniques are very challenging to design in our context because they must satisfy both NAND Flash constraints (the block-erase-before-page-rewrite constraint and the limited number of erase cycles) and embedded-system constraints (tiny RAM and predictable resource consumption). Last year, we proposed a new alternative for indexing Flash-resident data, called PBFilter, which specifically addresses the embedded context. This approach organizes the index structure in a purely sequential way and speeds up key lookups thanks to two principles called Summarization and Partitioning. PBFilter has been patented by Gemalto and INRIA [46] and published in [20]. PBFilter was initially designed for primary-key indexes in an append-oriented database context. Our current work focuses on integrating this Flash-based storage and indexing model into a complete DBMS engine supporting secondary-key indexes, updates and transaction management. An efficient atomic transaction protocol has been defined based on a virtualization of all database updates: data modifications and deletions do not modify the database state; rather, they are applied at load time to dynamically build a consistent version of the pages in RAM.
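The virtualization of updates described above can be illustrated with a minimal sketch. It is not the protocol defined in this work: the class and method names (`VirtualizedStore`, `Delta`, `load_page`) are hypothetical, the in-memory list stands in for an append-only log on Flash, and commit handling is reduced to a set of transaction identifiers. It only shows the principle that base pages are never rewritten and that committed changes are applied when a page is loaded into RAM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delta:
    """One logged change (hypothetical record layout)."""
    txn_id: int
    page_id: int
    slot: int
    new_value: object  # None marks a deletion


class VirtualizedStore:
    """Base pages are immutable; updates and deletions are only logged."""

    def __init__(self, base_pages):
        # Base pages stand in for Flash pages that are never rewritten.
        self._base = {pid: tuple(rows) for pid, rows in base_pages.items()}
        self._log = []           # append-only list of Delta records
        self._committed = set()  # assumption: commit is a single atomic mark

    def update(self, txn_id, page_id, slot, value):
        self._log.append(Delta(txn_id, page_id, slot, value))

    def delete(self, txn_id, page_id, slot):
        self._log.append(Delta(txn_id, page_id, slot, None))

    def commit(self, txn_id):
        self._committed.add(txn_id)

    def load_page(self, page_id):
        """Build the consistent RAM version of a page at load time."""
        rows = dict(enumerate(self._base[page_id]))
        for d in self._log:
            # Changes from uncommitted transactions are simply ignored.
            if d.page_id == page_id and d.txn_id in self._committed:
                if d.new_value is None:
                    rows.pop(d.slot, None)
                else:
                    rows[d.slot] = d.new_value
        return rows
```

In this sketch, aborting a transaction requires no undo work: its log records are never applied because its identifier is never marked as committed.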
Thanks to its excellent properties in terms of read performance, energy consumption and shock resistance, NAND Flash has become a credible competitor even for traditional disks on high-end servers. A natural extension of the aforementioned action is thus to study how database systems adapt to this new form of secondary storage. Before we can answer this question, we need to fully understand the performance characteristics of flash devices. We have designed a benchmark, called uFLIP, to cast light on all relevant usage patterns of current, as well as future, flash devices. uFLIP is a set of nine micro-benchmarks based on IO patterns (i.e., sequences of IOs). Each micro-benchmark is a set of experiments designed around a single varying parameter that affects either time, size, or location. Thanks to uFLIP, we established which kinds of IOs should be favored (or avoided) when designing algorithms and architectures for flash-based systems. We also set up a benchmarking methodology which takes into account the particular characteristics of flash devices. This work, published in [16] and [21], was done in cooperation with the University of Copenhagen and Reykjavík University. More recently, we have also devised a mechanism for measuring the energy consumption of flash devices. While energy consumption cannot be traced to individual IOs, we can associate energy consumption figures with IO patterns, which helps to further understand the behavior of the devices. The source code of uFLIP is available at http://www.uflip.org and was registered with the APP (Agence pour la Protection des Programmes) in 2009 [29].
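The structure of a micro-benchmark, as described above, can be sketched as follows. This is not the uFLIP code: all names (`IO`, `PatternParams`, `generate_pattern`, `micro_benchmark`) and default values are hypothetical, and the sketch only generates the IO sequences rather than submitting them to a device. It shows an IO pattern as a sequence of IOs characterized by time, size and location, and a micro-benchmark as a set of experiments in which a single parameter varies.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class IO:
    offset: int   # location on the device, in bytes
    size: int     # size of the IO, in bytes
    delay: float  # pause before submission, in seconds
    mode: str     # "read" or "write"


@dataclass(frozen=True)
class PatternParams:
    mode: str = "read"
    io_size: int = 4096
    io_count: int = 8
    target_size: int = 1 << 20  # size of the addressed area, in bytes
    pause: float = 0.0
    sequential: bool = True
    seed: int = 0               # fixed seed keeps random patterns repeatable


def generate_pattern(p):
    """Return the sequence of IOs defined by one parameter setting."""
    rng = random.Random(p.seed)
    slots = p.target_size // p.io_size
    ios = []
    for i in range(p.io_count):
        slot = i % slots if p.sequential else rng.randrange(slots)
        ios.append(IO(slot * p.io_size, p.io_size, p.pause, p.mode))
    return ios


def micro_benchmark(base, param, values):
    """One experiment per value of the single varying parameter."""
    return {v: generate_pattern(replace(base, **{param: v})) for v in values}
```

For example, `micro_benchmark(PatternParams(mode="write", sequential=False), "io_size", [512, 4096, 32768])` yields three random-write experiments that differ only in IO size, so that any difference in measured response time can be attributed to that parameter.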