Section: Scientific Foundations
Markovian modelling, simulation-based inference and decision
The mathematical modelling of systems exposed to randomness is of particular interest whenever we seek an in-depth understanding of complex stochastic phenomena or wish to make inferences from noise-corrupted data. The underlying system can be static or dynamic. The state variables, the parameters and the observations can be finite, continuous, hybrid (continuous/discrete), graphical, time-varying, pathwise, etc.
The first step in modelling is to describe the dependency graph connecting the different variables and parameters. Note that in the Bayesian network framework this graph can itself be inferred from the data. The Markovian hypothesis is made in order to limit the complexity of the model and to allow for tractable algorithms. It consists in supposing that the dependency graph is limited to local connections. It appears in dynamic contexts (Markov random processes), in static contexts (Markov random fields), as well as in spatio-temporal frameworks. From a statistical point of view, Markovian models can also feature hidden variables.
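As a minimal illustration of the Markovian hypothesis in a dynamic context, the sketch below simulates a discrete-time Markov chain in which the next state is drawn using only the current state. The two-state transition matrix `TRANSITION` and the function name `simulate_chain` are hypothetical choices made for this example.

```python
import random

# Hypothetical two-state transition matrix:
# row i gives the distribution of the next state given current state i.
TRANSITION = [[0.9, 0.1],
              [0.4, 0.6]]

def simulate_chain(transition, n_steps, rng, start=0):
    """Simulate a discrete-time Markov chain and return the visited states."""
    path = [start]
    states = range(len(transition))
    for _ in range(n_steps):
        current = path[-1]
        # Markov property: only the current state enters the sampling step.
        nxt = rng.choices(states, weights=transition[current])[0]
        path.append(nxt)
    return path

rng = random.Random(0)
path = simulate_chain(TRANSITION, 50_000, rng)
freq_state_1 = sum(path) / len(path)
print(f"fraction of time spent in state 1: {freq_state_1:.3f}")
```

For this particular matrix the stationary distribution is (0.8, 0.2), so the printed fraction should settle near 0.2 as the path grows longer.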
Monte Carlo (MC) methods have expanded considerably over the past two decades and have been successful in many areas.
In MC approaches, the quantity of interest is formulated in a probabilistic way as a functional of the distribution law of a stochastic process (or simply of a random variable). By sampling independent trajectories of this process, we approximate the targeted distribution law empirically. The convergence of this procedure is guaranteed by the law of large numbers, and its speed of convergence is given by central limit theorems.
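The procedure described above can be sketched as follows: independent trajectories are sampled, a functional is averaged over them, and a central-limit-theorem approximation gives an error bar. The names `mc_estimate` and `max_of_random_walk`, as well as the choice of functional (the maximum of a simple symmetric random walk), are assumptions made for illustration only.

```python
import math
import random

def mc_estimate(sample_functional, n, rng):
    """Plain Monte Carlo: average n independent draws of a functional.

    Returns the empirical mean and the half-width of an approximate
    95% confidence interval based on the central limit theorem.
    """
    values = [sample_functional(rng) for _ in range(n)]
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    half_width = 1.96 * math.sqrt(var / n)
    return mean, half_width

def max_of_random_walk(rng, steps=50):
    """Hypothetical functional: maximum reached by one random walk path."""
    position, highest = 0, 0
    for _ in range(steps):
        position += 1 if rng.random() < 0.5 else -1
        highest = max(highest, position)
    return highest

rng = random.Random(0)
for n in (100, 10_000):
    mean, half_width = mc_estimate(max_of_random_walk, n, rng)
    print(f"n={n}: estimate={mean:.3f} +/- {half_width:.3f}")
```

Multiplying the number of trajectories by 100 should shrink the interval by roughly a factor of 10, which is the usual square-root rate given by the central limit theorem.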
MC approaches can be used for the numerical approximation of the distribution laws of complex systems through empirical approximations [55] [61]. They are used intensively in Bayesian inference: “Markov chain Monte Carlo” (MCMC) in the static context [63] and “sequential Monte Carlo” (SMC, also called “particle filter”) in the dynamic context [56]. In the non-Bayesian approach, Monte Carlo techniques are used to explore likelihood functions [64]. They have also given rise to general algorithmics [57]. Monte Carlo methods are also used to approximate deterministic quantities of interest, usually represented as the expected value of a functional of a process trajectory. This quantity can also be the probability that a given event has occurred. Finally, simulation-based approaches allow for approximating Markov decision problems in random and partially observed situations [54].
MC methods can lead to very poor results because trajectories are generated blindly. Classically, adequacy to the specific problem or to the data is handled afterwards by weighting the different trajectories: the higher the weight, the better the trajectory matches the targeted phenomenon or data. Some of these weights may be negligible, in which case the corresponding trajectories do not contribute to the estimator, i.e. computing power has been wasted. Recent advances, like sequential Monte Carlo or population Monte Carlo, focus on mutation-selection mechanisms that automatically concentrate MC simulations, i.e. the available computing power, into regions of interest of the state space.
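A mutation-selection mechanism of this kind can be sketched with a bootstrap particle filter, one common sequential Monte Carlo scheme. The state-space model below (a scalar autoregressive state observed in Gaussian noise), its parameters `a`, `sigma_x`, `sigma_y`, and the function names are all hypothetical choices for this example; resampling is multinomial and performed at every step for simplicity.

```python
import math
import random

def effective_sample_size(weights):
    """ESS of normalized weights: close to N if balanced, close to 1 if degenerate."""
    return 1.0 / sum(w * w for w in weights)

def bootstrap_filter(observations, n_particles, rng, a=0.9, sigma_x=1.0, sigma_y=0.5):
    """Bootstrap particle filter for the assumed model
    x_t = a * x_{t-1} + sigma_x * noise,  y_t = x_t + sigma_y * noise."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    means, ess_trace = [], []
    for y in observations:
        # Mutation: propagate each particle blindly through the state dynamics.
        particles = [a * x + rng.gauss(0.0, sigma_x) for x in particles]
        # Weighting: score each particle by the likelihood of the observation.
        log_w = [-0.5 * ((y - x) / sigma_y) ** 2 for x in particles]
        top = max(log_w)  # subtract the maximum to avoid numerical underflow
        unnorm = [math.exp(lw - top) for lw in log_w]
        total = sum(unnorm)
        weights = [u / total for u in unnorm]
        means.append(sum(w * x for w, x in zip(weights, particles)))
        ess_trace.append(effective_sample_size(weights))
        # Selection: resample so that effort follows the high-weight particles.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means, ess_trace

# Synthetic data generated from the same assumed model.
rng = random.Random(1)
state, truth, observations = 0.0, [], []
for _ in range(30):
    state = 0.9 * state + rng.gauss(0.0, 1.0)
    truth.append(state)
    observations.append(state + rng.gauss(0.0, 0.5))

means, ess_trace = bootstrap_filter(observations, 500, rng)
rmse = math.sqrt(sum((m - t) ** 2 for m, t in zip(means, truth)) / len(truth))
print(f"filter RMSE={rmse:.3f}, smallest ESS={min(ess_trace):.1f} of 500")
```

The effective sample size printed at the end indicates how many particles carry non-negligible weight before selection; a small value signals the waste of computing power described above.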
Markovian modelling and algorithmics are applied successfully in numerous fields; one reason for this is their strong theoretical background. The limit behaviors of Markov processes are reasonably well identified [62], allowing for precise analyses of the asymptotic behavior of the proposed models, as well as of the convergence properties of the simulation-based inference algorithms. The development of these sophisticated MC methods, together with the associated mathematical analysis, which we can summarize as Markovian engineering, represents one of the major breakthroughs in applied probability.
The Markovian approach presents many features that can be exploited for modelling in ecology. This approach integrates tools from statistical physics and makes it possible to model the dynamics of large numbers of interacting individuals. This type of dynamics is frequently encountered in ecology, e.g., in individual-based models (IBM). The Markov and statistical physics approaches are also suitable for multi-scale modelling and analysis, which is one of the main features of ecological systems. They also bridge spatially explicit and individual-based models with aggregated models. This framework also includes the analysis of so-called neutral community models [59].
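The link between an individual-based model and an aggregated model can be illustrated on a toy case. The sketch below simulates a Markov birth-death population with density-dependent mortality and compares its rescaled size with the solution of the logistic differential equation. The rates `b`, `d`, `c`, the scaling parameter `k` and both function names are assumptions introduced for this example, not a model taken from the text.

```python
import math
import random

def simulate_density(k, t_end, rng, b=2.0, d=1.0, c=1.0, x0=0.1):
    """Individual-based birth-death process, simulated event by event.

    Each individual gives birth at rate b and dies at rate d + c * n / k,
    where n is the current population size. Returns n / k at time t_end.
    """
    n = round(x0 * k)
    t = 0.0
    while n > 0:
        birth_rate = b * n
        death_rate = (d + c * n / k) * n
        total = birth_rate + death_rate
        t += rng.expovariate(total)  # exponential waiting time to the next event
        if t > t_end:
            break
        # Choose a birth or a death in proportion to the two rates.
        n += 1 if rng.random() * total < birth_rate else -1
    return n / k

def logistic_density(t, b=2.0, d=1.0, c=1.0, x0=0.1):
    """Aggregated model: solution of dx/dt = x * (b - d - c * x)."""
    r = b - d
    cap = r / c
    growth = math.exp(r * t)
    return cap * x0 * growth / (cap + x0 * (growth - 1.0))

rng = random.Random(0)
for k in (100, 1000):
    density = simulate_density(k, 10.0, rng)
    print(f"k={k}: individual-based density={density:.3f}, "
          f"logistic density={logistic_density(10.0):.3f}")
```

As `k` grows, the fluctuations of the individual-based density around the deterministic curve are expected to shrink, which is the sense in which the aggregated model summarizes the individual-level dynamics in this toy setting.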