Transport maps are deterministic couplings between probability measures with broad applications in uncertainty quantification and machine learning. They have been used for posterior sampling in Bayesian inference, for accelerating Markov chain Monte Carlo and importance sampling algorithms, and as building blocks of generative models and density estimation methods. More broadly, transport---including but not limited to optimal transport---provides an important mathematical foundation for many tools in machine learning and uncertainty quantification. The recent surge of interest in transport maps has been accompanied by efficient numerical methods that make constructing and learning such maps tractable in high dimensions and for large data sets. This minisymposium brings together researchers from uncertainty quantification and machine learning to discuss recent advances in theory, numerics, and applications of transport maps and related techniques.
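To fix notation: a transport map T pushes a reference measure rho forward to a target pi, written T_# rho = pi, so that if Z ~ rho then T(Z) ~ pi. The following minimal numpy sketch (illustrative only, not drawn from any of the talks) shows the simplest case, an affine map pushing a standard Gaussian to another Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Reference samples: Z ~ N(0, 1).
z = rng.standard_normal(100_000)

# A deterministic (here affine) transport map T(z) = mu + sigma * z
# pushes the reference forward to the target N(mu, sigma^2).
mu, sigma = 2.0, 0.5
x = mu + sigma * z

print(x.mean(), x.std())  # approx. 2.0 and 0.5
```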
16:30
A transport-based multifidelity preconditioner for Markov chain Monte Carlo
Benjamin Peherstorfer | Courant Institute, New York University | United States
Authors:
Benjamin Peherstorfer | Courant Institute, New York University | United States
Youssef Marzouk | Massachusetts Institute of Technology | United States
Underlying most sampling-based methods for Bayesian modeling is the task of drawing samples from the posterior distribution. In many situations of interest, the posterior is ill-conditioned in the sense that it is strongly non-Gaussian, multi-modal, and skewed. The efficacy of Markov chain Monte Carlo and other sampling techniques quickly deteriorates for ill-conditioned posterior distributions, which means that large numbers of likelihood evaluations are required to achieve the desired effective sample sizes. We propose a multifidelity preconditioner that seeks efficient sampling via a proposal that is explicitly tailored to the posterior at hand and that is constructed cheaply from low-cost, low-fidelity models/data. First, a transport map is constructed that deterministically couples a reference Gaussian distribution with an approximation of the posterior given by the low-fidelity model/data. Then, the posterior distribution is explored using the non-Gaussian proposal distribution derived from the transport map. Because the low-fidelity model/data are used only to construct the proposal distribution, the approach guarantees that the stationary distribution of the Markov chain is the original posterior. In our numerical examples, the multifidelity approach achieves significant speedups compared to single-fidelity Monte Carlo sampling methods.
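A minimal sketch of the idea (illustrative only; the affine map, the toy log-posterior, and all numbers below are placeholder assumptions, not the authors' implementation): run an independence Metropolis chain in the reference space of a transport map fitted to a low-fidelity posterior approximation, while accepting against the true posterior, so the stationary distribution remains exact.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_posterior(x):
    # Placeholder (unnormalized) high-fidelity log-posterior: a skewed
    # 1D toy density standing in for the expensive model.
    return -0.5 * x**2 + np.log1p(np.exp(2.0 * x)) - x

# Hypothetical transport map fitted to a LOW-FIDELITY posterior
# approximation.  An affine map stands in for the triangular maps of
# the talk; only its role as a proposal-builder matters here.
m, s = 0.8, 1.2
T = lambda z: m + s * z
log_det_T = np.log(s)                      # log|dT/dz| for the affine map

def log_w(z):
    # Log-ratio of the pulled-back TRUE posterior to the N(0,1) proposal;
    # evaluating the true posterior keeps the chain's stationary
    # distribution exact despite the low-fidelity map.
    return log_posterior(T(z)) + log_det_T + 0.5 * z**2

z, samples = 0.0, []
for _ in range(5000):                      # independence Metropolis in reference space
    z_prop = rng.standard_normal()
    if np.log(rng.uniform()) < log_w(z_prop) - log_w(z):
        z = z_prop
    samples.append(T(z))                   # map accepted states back to posterior space
```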
17:00
HINT: Hierarchical Invertible Neural Transport for General and Sequential Bayesian Inference
Robert Scheichl | Ruprecht-Karls University Heidelberg | Germany
Authors:
Gianluca Detommaso | University of Bath | United Kingdom
Jakob Kruse | Visual Learning Lab, Heidelberg University | Germany
Lynton Ardizzone | Visual Learning Lab, Heidelberg University | Germany
Carsten Rother | Visual Learning Lab, Heidelberg University | Germany
Ullrich Koethe | Visual Learning Lab, Heidelberg University | Germany
Robert Scheichl | Ruprecht-Karls University Heidelberg | Germany
In this talk, I will introduce Hierarchical Invertible Neural Transport (HINT), an algorithm that merges Invertible Neural Networks and optimal transport to sample from a posterior distribution in a Bayesian framework. The method exploits a hierarchical architecture to construct a Knothe-Rosenblatt transport map between an arbitrary density and the joint density of hidden variables and observations. Once the map is trained, samples from the posterior can be recovered immediately for any given observation. All underlying model evaluations can be performed fully offline, before training, without the need for model gradients. Furthermore, no analytical evaluation of the prior is necessary, which makes HINT an ideal candidate for sequential Bayesian inference. We demonstrate the efficacy of HINT in several numerical experiments.
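For a Gaussian joint density, the Knothe-Rosenblatt map is given by a lower-triangular Cholesky factor, so the conditional-sampling mechanism described above can be shown in a few lines (an illustrative stand-in for the learned invertible network; the covariance and observation below are arbitrary): order the observation block first, invert the map's observation block at the new observation, then push fresh reference samples through the hidden block.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy joint Gaussian over (y, x): observation y first, hidden x second,
# standing in for the joint density that HINT learns with an INN.
Sigma = np.array([[1.0, 0.6],
                  [0.6, 1.5]])
L = np.linalg.cholesky(Sigma)   # lower triangular: the Knothe-Rosenblatt map of a Gaussian

# Triangular structure: y depends only on z_y, x depends on (z_y, z_x):
#   y = L[0,0] * z_y
#   x = L[1,0] * z_y + L[1,1] * z_x

y_obs = 0.9                            # a new observation
z_y = y_obs / L[0, 0]                  # invert the observation block of the map
z_x = rng.standard_normal(100_000)     # fresh reference samples
x_post = L[1, 0] * z_y + L[1, 1] * z_x # exact samples from p(x | y = y_obs)

# Analytic check: x | y ~ N(0.6 * y, 1.5 - 0.36).
print(x_post.mean(), x_post.var())     # approx. 0.54 and 1.14
```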
17:30
Preconditioning at scale for a fusion plasma inverse problem
Ian Langmore | Google | United States
Author:
Ian Langmore | Google | United States
Using a combination of interferometry, SEE, and magnetic probes, we reconstruct time-varying non-equilibrium electron density profiles in TAE's "Norman" plasma generator. The system is under-constrained, so reconstruction is Bayesian, with samples produced via Hamiltonian Monte Carlo (HMC). Reconstructions are done at 10,000+ time points in each of 1000+ experiments, resulting in a challenging sampling problem. To meet this challenge, we carefully apply linear preconditioning techniques, making use of recent theoretical results in HMC sampling efficiency.
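A sketch of linear preconditioning for HMC under stated assumptions (the toy posterior and the diagonal factor L are placeholders, not the plasma model or the talk's implementation): reparameterize x = Lz, with L approximating a Cholesky factor of the posterior covariance, and run HMC on z, whose target is far better conditioned.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy badly scaled Gaussian posterior, standing in for the reconstruction problem.
scales = np.array([1.0, 100.0])
def log_post(x):      return -0.5 * np.sum((x / scales) ** 2)
def grad_log_post(x): return -x / scales**2

# Linear preconditioning: sample z with x = L @ z, where L approximates a
# Cholesky factor of the posterior covariance (here simply diagonal).
L = np.diag(scales)

def log_post_z(z):      return log_post(L @ z)
def grad_log_post_z(z): return L.T @ grad_log_post(L @ z)

def hmc_step(z, eps=0.5, n_leap=10):
    # One HMC step with a standard leapfrog integrator in the z-coordinates.
    p = rng.standard_normal(z.size)
    z_new, p_new = z.copy(), p.copy()
    p_new = p_new + 0.5 * eps * grad_log_post_z(z_new)
    for _ in range(n_leap - 1):
        z_new = z_new + eps * p_new
        p_new = p_new + eps * grad_log_post_z(z_new)
    z_new = z_new + eps * p_new
    p_new = p_new + 0.5 * eps * grad_log_post_z(z_new)
    log_accept = (log_post_z(z_new) - 0.5 * p_new @ p_new
                  - log_post_z(z) + 0.5 * p @ p)
    return z_new if np.log(rng.uniform()) < log_accept else z

z, draws = np.zeros(2), []
for _ in range(2000):
    z = hmc_step(z)
    draws.append(L @ z)   # map back to the original parameters
```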
18:00
[MOVED from MS192] Deep tensor product Rosenblatt transformation for sampling of high-dimensional distributions
Sergey Dolgov | University of Bath | United Kingdom
Authors:
Tiangang Cui | Monash University | Australia
Sergey Dolgov | University of Bath | United Kingdom
Uncertainty quantification in many variables is a pressing task, but computing the desired quantities of interest involves notoriously difficult high-dimensional integration. Function approximations, in particular low-rank tensor product decompositions, have become popular for reducing this computational cost to linear scaling in the number of variables.
However, tensor approximations rely on weak (in a certain sense) correlations between variables, which might not be the case for posterior density functions arising in realistic Bayesian inverse problems. On the other hand, the (inverse) Rosenblatt transform allows independent Monte Carlo sampling, but it still suffers from the enormous cost of a high-dimensional integration.
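For concreteness, here is a grid-based sketch of the inverse Rosenblatt transform in two dimensions (illustrative only; the toy density is arbitrary, and the dense grid used below is exactly the exponential cost that the talk's tensor decompositions avoid): invert the marginal CDF of the first variable, then the conditional CDF of the second.

```python
import numpy as np

rng = np.random.default_rng(4)

# Map independent uniforms (u1, u2) to samples of an unnormalized density
# p(x1, x2) by inverting the marginal CDF of x1 and then the conditional
# CDF of x2 given x1.
grid = np.linspace(-4.0, 4.0, 401)
X1, X2 = np.meshgrid(grid, grid, indexing="ij")
p = np.exp(-0.5 * (X1**2 + (X2 - 0.5 * X1**2) ** 2))  # banana-shaped toy density

# Marginal of x1 (integrate out x2), then its CDF.
m1 = np.trapz(p, grid, axis=1)
F1 = np.cumsum(m1); F1 /= F1[-1]

def sample(u1, u2):
    x1 = np.interp(u1, F1, grid)               # invert the marginal CDF
    i = min(np.searchsorted(grid, x1), len(grid) - 1)
    cond = p[i]                                # density slice p(x1 = grid[i], .)
    F2 = np.cumsum(cond); F2 /= F2[-1]         # conditional CDF of x2 | x1
    return x1, np.interp(u2, F2, grid)         # invert the conditional CDF

xs = np.array([sample(*rng.uniform(size=2)) for _ in range(5000)])
```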
In this talk, I will present a nested approximation framework, where an approximate tensor decomposition of a simplified (e.g. tempered) density is used to compute an efficient Rosenblatt transformation, which is in turn used as a de-correlating change of coordinates that aids the tensor approximation of a more difficult model. As with deep neural networks, composing many layers of this procedure can significantly expand the class of feasible distributions. We demonstrate that this procedure produces efficient MCMC samples in several challenging cases, such as an elliptic inverse problem with a very concentrated likelihood, and parameter identification in predator-prey and Lorenz models.
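The composition underlying the deep construction can be sketched abstractly (the affine layers below are placeholders for the talk's tensor-train Rosenblatt layers): each layer is invertible, and accumulating the log-determinants gives the pushforward density via the change-of-variables formula log q(x) = log rho(z) - sum_k log|det J_k|.

```python
import numpy as np

class AffineLayer:
    # Placeholder invertible layer; a tensor-train Rosenblatt layer would
    # expose the same forward / inverse / log-determinant interface.
    def __init__(self, shift, scale):
        self.shift, self.scale = shift, scale
    def forward(self, z):
        return self.shift + self.scale * z, np.sum(np.log(np.abs(self.scale)))
    def inverse(self, x):
        return (x - self.shift) / self.scale

def push_forward(layers, z):
    # Apply T = T_L o ... o T_1 to z, accumulating log|det J| along the way.
    log_det = 0.0
    for layer in layers:
        z, ld = layer.forward(z)
        log_det += ld
    return z, log_det

layers = [AffineLayer(np.array([0.0, 1.0]), np.array([2.0, 0.5])),
          AffineLayer(np.array([1.0, 0.0]), np.array([1.0, 3.0]))]

z = np.zeros(2)                        # a reference point, with rho = N(0, I)
x, log_det = push_forward(layers, z)
log_q_x = -0.5 * z @ z - np.log(2 * np.pi) - log_det   # pushforward log-density at x
```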