Section: New Results
Because air quality modeling is subject to large uncertainties, relying on a single model may not be sufficient. Ensembles of simulations are therefore now used in a wide range of applications, from uncertainty estimation to operational forecasting.
Ensemble forecasting with machine learning algorithms
Participants: Vivien Mallet, Gilles Stoltz [CNRS], Karim Drifi, Édouard Debry [INERIS].
Based on ensemble simulations, improved forecasts can be generated by linearly combining the individual forecasts. A weight is associated with each model and depends on past observations and simulations (Figure 3). New machine learning algorithms (sequential aggregation) were developed and used for this purpose. Most of them come with theoretical bounds on their performance (compared to the optimal constant model combination) and deliver significantly improved forecasts.
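To make the idea of sequential aggregation concrete, here is a minimal sketch using the exponentially weighted average rule, a standard textbook aggregation scheme. It is an illustrative assumption and not necessarily one of the algorithms developed in this work. The function name `ewa_aggregate`, the learning rate `eta`, and the use of the squared error are all hypothetical choices, and the weights here are convex (nonnegative, summing to one), which is a simplification of general linear combinations.

```python
import math


def ewa_aggregate(forecasts, observations, eta=0.1):
    """Aggregate ensemble forecasts sequentially (illustrative sketch).

    forecasts[t][m] is the forecast of model m at step t.
    observations[t] is the observation at step t, revealed after forecasting.
    Returns the aggregated forecasts and the weights used at the last step.
    """
    n_models = len(forecasts[0])
    cumulative_loss = [0.0] * n_models
    aggregated = []
    weights = [1.0 / n_models] * n_models

    for x_t, y_t in zip(forecasts, observations):
        # Weights depend only on past losses (past observations and simulations).
        # Subtracting the minimum loss keeps the exponentials numerically safe.
        min_loss = min(cumulative_loss)
        raw = [math.exp(-eta * (loss - min_loss)) for loss in cumulative_loss]
        total = sum(raw)
        weights = [r / total for r in raw]

        # Aggregated forecast: weighted combination of the individual forecasts.
        aggregated.append(sum(w * x for w, x in zip(weights, x_t)))

        # Once the observation is available, update each model's squared-error loss.
        for m, x in enumerate(x_t):
            cumulative_loss[m] += (x - y_t) ** 2

    return aggregated, weights
```

With two models, one accurate and one biased, the weight of the accurate model quickly approaches one, so the aggregated forecast tracks the better member without knowing in advance which one it is.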
The methods perform very well in practice. The theoretical bounds are always reached, which shows that the potential of the ensemble is well exploited. This was checked for large ensembles (dozens of models) as well as for small ensembles (a few models). The methods were successfully applied to forecast ozone, nitrogen dioxide and aerosols in operational mode, on the Prév'air platform (http://www.prevair.org) managed by INERIS. After an operational test from summer 2008 to summer 2009, the methods were officially introduced in the daily forecasts of Prév'air.
The aggregation methods proved to be efficient on extreme events, but not enough to forecast threshold exceedances: they cannot fully compensate for the poor threshold detection of the individual models. Classification methods, mainly the perceptron, have been studied to address this issue. These methods can slightly improve the forecasts, but further work is needed.
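As an illustration of how a classifier can be applied to threshold exceedances, the following sketch trains a classical perceptron on the ensemble forecasts, with the observed exceedance as the label. The feature construction (forecasts centered on the threshold and rescaled), the function names, and the hyperparameters are assumptions made for this example and do not describe the configuration actually studied.

```python
def to_features(member_forecasts, threshold, scale=100.0):
    """Center the ensemble forecasts on the threshold and rescale them (assumed preprocessing)."""
    return [(f - threshold) / scale for f in member_forecasts]


def train_perceptron(features, labels, epochs=20, learning_rate=1.0):
    """Classical perceptron. labels are +1 (exceedance) or -1 (no exceedance)."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            # Update only on misclassified (or undecided) samples.
            if y * score <= 0:
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
    return weights, bias


def predict_exceedance(weights, bias, x):
    """Return True when the classifier predicts a threshold exceedance."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0
```

A possible use, with made-up forecasts from two models and a hypothetical threshold of 180: build the features with `to_features`, train on past days, then call `predict_exceedance` on the features of the next day's forecasts.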
Uncertainty estimation based on multimodel ensembles
Participants: Damien Garaud, Vivien Mallet.
Air quality forecasts are limited by strong uncertainties, especially in the input data and in the physical formulation of the models. These uncertainties need to be estimated in order to evaluate the forecasts, to produce probabilistic forecasts, and to estimate more accurately the error covariance matrices required by data assimilation.
Because a large part of the uncertainty in the forecast originates from uncertainties in the model formulation (primarily the physical parameterizations), a multimodel ensemble seems to be an adequate tool for uncertainty estimation. A large ensemble with 100 members was generated over the year 2001 and analyzed with criteria such as the Brier score. Preliminary work on the calibration of the ensemble was carried out (Figure 4): the ensemble members were selected so as to optimize the evaluation criteria. This may be formulated as a combinatorial optimization problem in which one searches for an optimal combination of models within a huge space of acceptable models.
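The calibration step can be sketched as follows, under stated assumptions. The Brier score is the mean squared difference between a forecast probability and the binary outcome; here the probability of exceeding a threshold is taken as the fraction of selected members above it. The greedy forward selection is a simple heuristic chosen for illustration, and it is not claimed to be the optimization method used in this work. All function names and the threshold are hypothetical.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def exceedance_probabilities(ensemble, members, threshold):
    """Fraction of the selected members above the threshold at each time step.

    ensemble[t][m] is the value simulated by member m at step t.
    """
    return [
        sum(1 for m in members if row[m] > threshold) / len(members)
        for row in ensemble
    ]


def greedy_select(ensemble, observations, threshold, size):
    """Greedy forward selection of a sub-ensemble minimizing the Brier score.

    A heuristic for the combinatorial problem: at each step, add the member
    that yields the lowest Brier score together with those already selected.
    """
    outcomes = [1.0 if y > threshold else 0.0 for y in observations]
    remaining = list(range(len(ensemble[0])))
    selected = []
    while len(selected) < size and remaining:
        best = min(
            remaining,
            key=lambda m: brier_score(
                exceedance_probabilities(ensemble, selected + [m], threshold),
                outcomes,
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

A greedy search evaluates far fewer candidates than an exhaustive search over all sub-ensembles, at the price of possibly missing the optimal combination. Other criteria than the Brier score could be plugged into the same loop.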