Many statistical models of interest in engineering, the sciences, and machine learning define a likelihood function that is computationally prohibitive to evaluate. This may arise because the model is known only through a data-generating process, or because the likelihood function involves a high-dimensional integral (e.g., from a marginalization procedure or the computation of a normalizing constant). In these cases, it is difficult to apply classical inference methods such as maximum likelihood estimation or likelihood-based Bayesian inference algorithms. To enable inference in these settings, the statistics and machine learning communities have developed several approaches that avoid direct evaluation of the likelihood function (e.g., approximate Bayesian computation). Despite these successes, efficiently solving such problems remains challenging, especially in high dimensions or when only limited information or few samples are available. This mini-symposium will explore new algorithms and methodologies for performing likelihood-free inference in these complex models.
Component-wise approximate Bayesian computation via Gibbs-like steps
Christian P. Robert | Universite Paris-Dauphine | France
Approximate Bayesian computation (ABC) methods are useful for generative models with intractable likelihoods. These methods are, however, sensitive to the dimension of the parameter space, requiring exponentially increasing resources as this dimension grows. To tackle this difficulty, we explore a Gibbs version of the ABC approach that runs component-wise approximate Bayesian computation steps aimed at the corresponding conditional posterior distributions and based on summary statistics of reduced dimension. While it lacks the standard justifications for the Gibbs sampler, the resulting Markov chain is shown to converge in distribution under some partial independence conditions. The associated stationary distribution can further be shown to be close to the true posterior distribution, and some hierarchical versions of the proposed mechanism enjoy a closed-form limiting distribution. Experiments also demonstrate the gain in efficiency brought by the Gibbs version over the standard solution.
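The component-wise idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the talk: the toy hierarchical normal model, the helper names `abc_conditional_draw` and `abc_gibbs`, and the choice to keep the closest of a fixed number of conditional-prior draws in each step (one common ABC variant) instead of using an explicit tolerance.

```python
import numpy as np

def abc_conditional_draw(propose, simulate_summary, target_summary, n_draws, rng):
    """One ABC step: keep the candidate whose simulated summary is closest to the target."""
    candidates = [propose(rng) for _ in range(n_draws)]
    distances = [abs(simulate_summary(c, rng) - target_summary) for c in candidates]
    return candidates[int(np.argmin(distances))]

def abc_gibbs(x, n_iter=200, n_draws=30, seed=0):
    """Component-wise ABC on a hypothetical toy model:
    alpha ~ U(-4, 4), mu_j ~ N(alpha, 1), x_ij ~ N(mu_j, 1)."""
    rng = np.random.default_rng(seed)
    n_groups, n_obs = x.shape
    group_means = x.mean(axis=1)      # low-dimensional summary used for each mu_j
    mu = group_means.copy()           # simple initialisation (assumption)
    alpha = mu.mean()
    alpha_chain = np.empty(n_iter)
    mu_chain = np.empty((n_iter, n_groups))
    for t in range(n_iter):
        # Hyperparameter step: the current mu vector plays the role of the data,
        # summarised by its mean.
        alpha = abc_conditional_draw(
            propose=lambda r: r.uniform(-4.0, 4.0),
            simulate_summary=lambda a, r: r.normal(a, 1.0, size=n_groups).mean(),
            target_summary=mu.mean(),
            n_draws=n_draws, rng=rng)
        # Component steps: each mu_j is compared only with its own group's summary.
        for j in range(n_groups):
            mu[j] = abc_conditional_draw(
                propose=lambda r: r.normal(alpha, 1.0),
                simulate_summary=lambda m, r: r.normal(m, 1.0, size=n_obs).mean(),
                target_summary=group_means[j],
                n_draws=n_draws, rng=rng)
        alpha_chain[t] = alpha
        mu_chain[t] = mu
    return alpha_chain, mu_chain

# Usage on synthetic data (true alpha = 1.5 is an arbitrary choice).
data_rng = np.random.default_rng(42)
true_mu = data_rng.normal(1.5, 1.0, size=10)
x = data_rng.normal(true_mu[:, None], 1.0, size=(10, 20))
alpha_chain, mu_chain = abc_gibbs(x)
```

Each step compares a one-dimensional summary only, which is the point of the component-wise construction: no single ABC step has to match a summary whose dimension grows with the number of parameters.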
Approximate Bayesian computation via the energy statistic
Florence Forbes | Universite Grenoble Alpes, Inria, CNRS | France
Approximate Bayesian computation (ABC) has become an essential part of the Bayesian toolbox for addressing problems in which the likelihood is intractable, either because it is prohibitively expensive to evaluate or because it is entirely unknown. ABC defines a quasi-posterior by comparing observed data with simulated data, traditionally based on some summary statistics, the elicitation of which is regarded as a key difficulty. In recent years, a number of data discrepancy measures bypassing the construction of summary statistics have been proposed, including the Kullback-Leibler divergence, the Wasserstein distance and maximum mean discrepancies. Here we propose a novel importance-sampling (IS) ABC algorithm relying on the so-called two-sample energy statistic. We establish a new asymptotic result for the case where both the observed sample size and the simulated data sample size increase to infinity, which highlights to what extent the data discrepancy measure impacts the asymptotic pseudo-posterior. The result holds in the broad setting of IS-ABC methodologies, thus generalizing previous results that have been established only for rejection ABC algorithms. Furthermore, we propose a consistent V-statistic estimator of the energy statistic, under which we show that the large-sample result holds. Our proposed energy-statistic-based ABC algorithm is demonstrated on a variety of models, including a Gaussian mixture, a moving-average model of order two, a bivariate beta and a multivariate g-and-k distribution.
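As a rough illustration, the sketch below computes the standard V-statistic form of the two-sample energy statistic, 2·mean|x_i − y_j| − mean|x_i − x_j| − mean|y_i − y_j|, and uses it as the data discrepancy in a small importance-sampling ABC run. The Gaussian location model, the prior and proposal, the indicator kernel with tolerance `eps`, and the function names are all assumptions chosen for the example, not the authors' settings.

```python
import numpy as np

def _as_samples(a):
    """Treat a 1-D array as n samples of dimension 1."""
    a = np.asarray(a, dtype=float)
    return a[:, None] if a.ndim == 1 else a

def energy_v_statistic(x, y):
    """V-statistic estimate of the energy distance between samples x and y."""
    x, y = _as_samples(x), _as_samples(y)
    dxy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    dxx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    dyy = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    # All pairs are averaged, including the zero diagonal terms (V-statistic).
    return 2.0 * dxy.mean() - dxx.mean() - dyy.mean()

def normal_logpdf(t, mean, sd):
    return -0.5 * ((t - mean) / sd) ** 2 - np.log(sd) - 0.5 * np.log(2.0 * np.pi)

def is_abc(observed, n_particles=2000, eps=0.1, seed=0):
    """IS-ABC for a toy model x ~ N(theta, 1) with prior N(0, 5^2) (assumption)."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, 3.0, size=n_particles)  # proposal q = N(0, 3^2) (assumption)
    log_ratio = normal_logpdf(theta, 0.0, 5.0) - normal_logpdf(theta, 0.0, 3.0)
    weights = np.zeros(n_particles)
    for i, t in enumerate(theta):
        simulated = rng.normal(t, 1.0, size=len(observed))
        # Indicator kernel on the energy statistic, times the prior/proposal ratio.
        if energy_v_statistic(observed, simulated) <= eps:
            weights[i] = np.exp(log_ratio[i])
    return theta, weights

# Usage: observed data drawn with theta = 1 (arbitrary choice).
obs_rng = np.random.default_rng(1)
observed = obs_rng.normal(1.0, 1.0, size=100)
theta, weights = is_abc(observed)
posterior_mean = np.sum(weights * theta) / np.sum(weights)
```

No summary statistic is constructed here: the full observed and simulated samples are compared directly, and the importance weights correct for sampling from a proposal other than the prior.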
On the use of approximate Bayesian computation Markov chain Monte Carlo with inflated tolerance and post-correction
Matti Vihola | University of Jyvaskyla | Finland
Approximate Bayesian computation allows for inference of complicated probabilistic models with intractable likelihoods using model simulations. The Markov chain Monte Carlo implementation of approximate Bayesian computation is often sensitive to the tolerance parameter: a low tolerance leads to poor mixing, and a large tolerance entails excess bias. We consider an approach that uses a relatively large tolerance for the Markov chain Monte Carlo sampler to ensure sufficient mixing, and then post-processes the output to obtain estimators for a range of finer tolerances. We introduce an approximate confidence interval for the related post-corrected estimators, and propose an adaptive approximate Bayesian computation Markov chain Monte Carlo algorithm, which finds a 'balanced' tolerance level automatically, based on acceptance rate optimisation. Our experiments show that post-processing-based estimators can perform better than a direct Markov chain Monte Carlo sampler targeting a fine tolerance, that our confidence intervals are reliable, and that our adaptive algorithm leads to reliable inference with little user specification.
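A minimal sketch of the inflated-tolerance idea follows, assuming a simple cut-off kernel: the sampler runs at a large tolerance `delta` and stores the distance attached to each state, and an estimate for a finer tolerance `eps` is then formed from the states whose stored distance is at most `eps`. The toy Gaussian model, the random-walk proposal, and the function names are assumptions for illustration; the confidence interval and the adaptive tolerance selection described in the abstract are not reproduced.

```python
import numpy as np

def abc_mcmc(s_obs, delta, n_iter=5000, n_obs=20, step=0.5, prior_sd=10.0, seed=0):
    """ABC-MCMC with a cut-off kernel on a toy model (assumption):
    theta ~ N(0, prior_sd^2), data ~ N(theta, 1), summary = sample mean."""
    rng = np.random.default_rng(seed)

    def simulate_distance(theta):
        return abs(rng.normal(theta, 1.0, size=n_obs).mean() - s_obs)

    # Initialise at a state whose simulated distance is within the tolerance.
    theta = s_obs
    dist = simulate_distance(theta)
    while dist > delta:
        dist = simulate_distance(theta)

    thetas = np.empty(n_iter)
    dists = np.empty(n_iter)
    for k in range(n_iter):
        proposal = theta + step * rng.normal()  # symmetric random walk
        log_prior_ratio = 0.5 * (theta ** 2 - proposal ** 2) / prior_sd ** 2
        # Check the prior ratio first, then simulate only if that check passes.
        if np.log(rng.uniform()) < log_prior_ratio:
            prop_dist = simulate_distance(proposal)
            if prop_dist <= delta:
                theta, dist = proposal, prop_dist
        thetas[k], dists[k] = theta, dist
    return thetas, dists

def post_corrected_mean(thetas, dists, eps):
    """Posterior-mean estimate for tolerance eps <= delta from a chain run at delta."""
    keep = dists <= eps
    if not keep.any():
        raise ValueError("no stored state has distance <= eps")
    return thetas[keep].mean()

# Usage: one chain at delta = 1.0, estimates for several finer tolerances.
thetas, dists = abc_mcmc(s_obs=1.0, delta=1.0)
estimates = {eps: post_corrected_mean(thetas, dists, eps) for eps in (1.0, 0.5, 0.2)}
```

A single run at the large tolerance therefore yields estimates for the whole range of finer tolerances, at the price of using fewer chain states as `eps` decreases.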
Confidence Regions and Hypothesis Testing in a Likelihood-Free Inference Setting
Ann Lee | Carnegie Mellon University | United States
Parameter estimation, statistical tests and confidence regions are often cited as the cornerstones of classical statistics. They allow scientists to make inferences about the underlying process that generated the observations while attaching uncertainties to these statements. A key question is whether one can still construct hypothesis tests and confidence sets with proper coverage in a likelihood-free inference (LFI) setting with complex, high-dimensional data. In this talk, I will present a frequentist approach to LFI that first formulates the classical likelihood ratio test (LRT) as a parametrized classification problem, and then uses a Neyman inversion of the LRT to build confidence regions for parameters of interest. I will also present a goodness-of-fit test for checking whether the constructed hypothesis tests and confidence regions are valid. Our methods are based on the key observation that the LRT statistic, the rejection probability of the test, and the coverage of the confidence set are conditional distribution functions which often vary smoothly as a function of the parameter of interest. Hence, instead of relying solely on samples simulated at fixed parameter settings (as is the convention in standard Monte Carlo solutions), one can leverage machine learning tools and data simulated in the neighborhood of a parameter to improve estimates of quantities of interest. I will present some of our preliminary work on the topic with examples from the physical sciences.
This is joint work with Niccolo Dalmasso (Department of Statistics & Data Science, Carnegie Mellon University) and Rafael Izbicki (Department of Statistics, Federal University of Sao Carlos).