Session of the SMAI Journées MAS
"Computational statistics and simulation methods"
Clermont-Ferrand, 29-31 August 2012
Session organizer: Jean-Michel Marin (Université Montpellier 2)
Simulation methods are playing an increasingly important role in statistical estimation procedures, whether frequentist or Bayesian. This session aims to highlight this, notably by presenting Approximate Bayesian Computation (ABC) techniques.
Speaker 1 (40 minutes): Pierre PUDLO (Université Montpellier 2)
An overview of approximate Bayesian computational methods, known as ABC
Since its introduction by Tavaré et al. (1997) to infer the age of the most recent common ancestor of a sample of individuals, Approximate Bayesian Computation (ABC) has flourished in many fields of application: population genetics, systems biology, image processing, etc. When the likelihood is written as an integral over a very high-dimensional space, classical Markov chain Monte Carlo (MCMC) methods are not very efficient. ABC methods (Beaumont (2010), Csilléry et al. (2010), Marin et al. (2012)) propose to simulate from the joint distribution of the pair (parameters, data set), and then to reconstruct the posterior distribution by estimating the density conditional on the observed data set.
These methods are easy to parallelize on clusters but remain computationally expensive. In this talk, we will look at some improvements to the basic algorithm: ABC-MCMC of Marjoram et al. (2003), ABC-SMC of Del Moral et al. (2012), ABC-PMC of Beaumont et al. (2009), etc. We will mention the comparison through summary statistics (Fearnhead and Prangle (2012)). Finally, we will discuss the model choice problem: Robert et al. (2011), Marin et al. (2011). We will illustrate these questions with examples in population genetics that address phylogeographic problems.
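To make the basic ABC algorithm concrete, here is a minimal rejection-sampler sketch in Python. It is not taken from the talk: the toy model (unit-variance Gaussian data with unknown mean), the Uniform(-5, 5) prior, the sample mean as summary statistic, the tolerance `eps` and the function name `abc_rejection` are all illustrative assumptions.

```python
import random
import statistics


def abc_rejection(observed, n_draws, eps, rng):
    """Basic ABC rejection sampler for the mean of a unit-variance Gaussian.

    Illustrative assumptions (not from the talk): Uniform(-5, 5) prior,
    sample mean as summary statistic, absolute difference as distance.
    """
    s_obs = statistics.fmean(observed)  # summary of the observed data
    n = len(observed)
    accepted = []
    for _ in range(n_draws):
        # 1. Draw a parameter value from the prior.
        theta = rng.uniform(-5.0, 5.0)
        # 2. Simulate a data set of the same size given that parameter.
        simulated = [rng.gauss(theta, 1.0) for _ in range(n)]
        # 3. Keep the parameter if the simulated summary is within eps
        #    of the observed summary.
        if abs(statistics.fmean(simulated) - s_obs) <= eps:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
# Synthetic "observed" data, generated here only for the demonstration.
observed = [rng.gauss(2.0, 1.0) for _ in range(20)]
posterior_sample = abc_rejection(observed, n_draws=5000, eps=0.2, rng=rng)
print(len(posterior_sample), statistics.fmean(posterior_sample))
```

The accepted values form an approximate sample from the posterior distribution. Each iteration is independent of the others, which is why this basic scheme parallelizes easily; most draws are rejected, which is why it remains costly.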
Beaumont, M. (2010). Approximate Bayesian Computation in Evolution and Ecology. Annual Review of Ecology, Evolution, and Systematics 41, 379-406.
Beaumont, M., J.-M. Cornuet, J.-M. Marin, and C. Robert (2009). Adaptive approximate Bayesian computation. Biometrika 96(4), 983-990.
Csilléry, K., M. Blum, O. Gaggiotti, and O. François (2010). Approximate Bayesian computation (ABC) in practice. Trends in Ecology and Evolution 25, 410-418.
Del Moral, P., A. Doucet, and A. Jasra (2012). An adaptive sequential Monte Carlo method for approximate Bayesian computation. Statistics and Computing, to appear.
Fearnhead, P. and D. Prangle (2012). Constructing summary statistics for approximate Bayesian computation: Semi-automatic Approximate Bayesian Computation. J. Royal Statist. Society Series B.
Marin, J.-M., N. Pillai, C. P. Robert, and J. Rousseau (2011). Relevant statistics for Bayesian model choice. http://arxiv.org/abs/1110.4700.
Marin, J.-M., P. Pudlo, C. P. Robert, and R. Ryder (2012). Approximate Bayesian computational methods. Statistics and Computing, to appear.
Marjoram, P., J. Molitor, V. Plagnol, and S. Tavaré (2003, December). Markov chain Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. USA 100(26), 15324-15328.
Robert, C. P., J.-M. Cornuet, J.-M. Marin, and N. Pillai (2011). Lack of confidence in approximate Bayesian computation model choice. Proc. Natl. Acad. Sci. USA 108(37), 15112-15117.
Tavaré, S., D. Balding, R. Griffiths, and P. Donnelly (1997). Inferring coalescence times from DNA sequence data. Genetics 145, 505-518.
Speaker 2 (20 minutes): Meïli BARAGATTI (Montpellier SupAgro)
Likelihood-free parallel tempering
Approximate Bayesian Computation (ABC) methods (or likelihood-free methods) have emerged over the past fifteen years as useful tools for Bayesian analysis when the likelihood is analytically or computationally intractable. Several approaches have been proposed: Markov chain Monte Carlo (MCMC) methods have been developed by Marjoram et al. (2003) and by Bortot et al. (2007), for instance, and sequential methods have been proposed by, among others, Sisson et al. (2007), Beaumont et al. (2009) and Del Moral et al. (2012). ABC-MCMC methods were the reference until recently, but sequential ABC methods now appear to outperform them (see for example McKinley et al. (2009) or Sisson et al. (2007)). We will present a new algorithm that combines population-based MCMC methods with ABC requirements, using an analogy with the Parallel Tempering algorithm (Geyer (1991)). Its performance will be compared with that of existing ABC algorithms.
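For context, the ABC-MCMC baseline of Marjoram et al. (2003) cited above can be sketched as follows. This is only the single-chain reference method, not the likelihood-free parallel tempering algorithm of the talk, whose details are not given in this abstract. The toy model (unit-variance Gaussian data), the N(0, 10^2) prior, the sample-mean summary, the random-walk proposal and all names and tuning values are illustrative assumptions.

```python
import math
import random
import statistics


def abc_mcmc(observed, n_iter, eps, step, theta0, rng):
    """Single-chain ABC-MCMC in the style of Marjoram et al. (2003).

    Illustrative assumptions: N(0, 10^2) prior on theta, unit-variance
    Gaussian data, sample mean as summary statistic, symmetric Gaussian
    random-walk proposal with standard deviation `step`.
    """
    def log_prior(theta):
        # Log density of N(0, 10^2), up to an additive constant.
        return -0.5 * (theta / 10.0) ** 2

    s_obs = statistics.fmean(observed)
    n = len(observed)
    theta = theta0
    chain = []
    for _ in range(n_iter):
        # Propose a move and simulate a data set at the proposed value.
        proposal = theta + rng.gauss(0.0, step)
        simulated = [rng.gauss(proposal, 1.0) for _ in range(n)]
        close = abs(statistics.fmean(simulated) - s_obs) <= eps
        # The proposal is symmetric, so the Metropolis-Hastings ratio
        # reduces to the prior ratio; the likelihood is never evaluated.
        accept_prob = math.exp(min(0.0, log_prior(proposal) - log_prior(theta)))
        if close and rng.random() < accept_prob:
            theta = proposal
        chain.append(theta)
    return chain


rng = random.Random(1)
# Synthetic "observed" data, generated here only for the demonstration.
observed = [rng.gauss(2.0, 1.0) for _ in range(20)]
# Starting the chain at the observed summary is a convenience choice.
chain = abc_mcmc(observed, n_iter=5000, eps=0.2, step=0.5,
                 theta0=statistics.fmean(observed), rng=rng)
print(statistics.fmean(chain))
```

A move is accepted only when the simulated summary falls within `eps` of the observed one, so a small tolerance can leave a single chain stuck for long stretches. Population-based schemes are one way to address this kind of poor mixing.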
Beaumont, M., J.-M. Cornuet, J.-M. Marin, and C. Robert (2009). Adaptive approximate Bayesian computation. Biometrika 96(4), 983-990.
Bortot, P., S. Coles, and S. Sisson (2007). Inference for stereological extremes. Journal of the American Statistical Association, 102, 84-92.
Del Moral, P., A. Doucet, and A. Jasra (2012). An adaptive sequential Monte Carlo method for approximate Bayesian computation. Statistics and Computing, to appear.
Geyer, C. (1991). Markov chain Monte Carlo maximum likelihood. Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface, 156-163.
Marjoram, P., J. Molitor, V. Plagnol, and S. Tavaré (2003, December). Markov chain Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. USA, 100(26), 15324-15328.
McKinley, T., A. Cook, and R. Deardon (2009). Inference in epidemic models without likelihoods. The International Journal of Biostatistics, 5(1).
Sisson, S. A., Y. Fan, and M. Tanaka (2007). Sequential Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. USA, 104(6), 1760-1765.
Speaker 3 (20 minutes): Florence FORBES (INRIA Rhône-Alpes)
Comparing variational and Markov Chain Monte Carlo approaches for the analysis of fMRI data.
We address the issue of jointly detecting brain activity and estimating the hemodynamic response from event-related fMRI data. We adopt the so-called region-based Joint Detection-Estimation (JDE) framework built by making use of a regional bilinear generative model of the BOLD response and constraining the parameter estimation by physiological priors using temporal and spatial information in a Markovian model. In contrast to previous works that use Markov Chain Monte Carlo (MCMC) techniques to sample the resulting intractable posterior distribution, we recast the JDE into a missing data framework and derive a Variational Expectation-Maximization (VEM) algorithm for its inference. A variational approximation is used to approximate the Markovian model in the unsupervised spatially adaptive JDE inference, which allows automatic fine-tuning of spatial regularisation parameters. It provides a new algorithm that exhibits interesting properties compared to the previously used MCMC-based approach. Experiments on artificial and real data allow a comparison of the two approaches and show that VEM-JDE is robust to model mis-specification and provides computational gain while maintaining good performance in terms of activation detection and hemodynamic shape recovery.
Speaker 4 (20 minutes): Mathieu RIBATET (Université Montpellier 2)
Conditional simulation of max-stable processes
Since many environmental processes such as heat waves or precipitation are spatial in extent, it is likely that a single extreme event affects several locations and the areal modelling of extremes is therefore essential if the spatial dependence of extremes has to be appropriately taken into account. This paper proposes a general framework for conditional simulations of max-stable processes with an emphasis on Brown–Resnick and Schlather processes. We test the method on simulated data and give an application to extreme rainfall around Zurich and extreme temperature in Switzerland. Results show that the proposed framework provides accurate conditional simulations and can handle real-sized problems.
Key words: Conditional simulation, Markov chain Monte Carlo, Max-stable process, Precipitation, Regular conditional distribution, Temperature.