The goal of optimal experimental design (OED) is to find the optimal design of a data acquisition system (e.g., location of sensors, what quantities are measured and how often, what sources are used in each experiment), so that the uncertainty in the inferred parameters—or some predicted quantity derived from them—is minimized with respect to a statistical criterion. OED for Bayesian inverse problems governed by partial differential equations (PDEs) is an extremely challenging problem. First, the parameter to be inferred is often a spatially correlated field, leading to a high dimensional parameter space upon discretization. Second, the forward PDE model is often complex and computationally expensive to solve. Third, the design space for the data acquisition system may be high dimensional and constrained. And fourth, the Bayesian inverse problem—a difficult problem in itself—is a part of the OED formulation and needs to be repeated many times. This minisymposium brings together leading experts to present recent advances in numerical methods for Bayesian OED that address these difficulties.
10:30
Scalable structure-exploiting approaches to optimal experimental design
Omar Ghattas | The University of Texas at Austin | United States
Authors:
Peng Chen | The University of Texas at Austin | United States
Omar Ghattas | The University of Texas at Austin | United States
Thomas O'Leary Roseberry | The University of Texas at Austin | United States
Umberto Villa | Washington University in St. Louis | United States
Keyi Wu | The University of Texas at Austin | United States
We address optimal experimental design (OED) problems for Bayesian inverse problems governed by PDE forward models with infinite-dimensional uncertain parameter fields. We consider the OED objective of maximizing the expected information gain (EIG), i.e. the expected KL divergence between the posterior and the prior. Straightforward evaluation of the EIG via double loop Monte Carlo sampling is intractable, especially for complex forward problems in high dimensions. We consider several approaches to making OED tractable. The first determines a transport map that pushes forward the prior distribution to approximate the posterior, and once constructed, permits sampling at no additional cost in PDE solves. The second replaces the posterior distribution by its Laplace approximation, leading to fast evaluation of the EIG by exploiting rapid spectral decay of the Hessian of the log posterior and a randomized eigensolver. The third employs a low rank SVD of the sensitivity matrix to determine the data subspace that is most sensitive to the parameters. While these methods represent different trade-offs between accuracy and number of forward PDE solves, they share the common property that the cost, measured in number of PDE solves, is independent of the number of uncertain parameters and design variables. This is a consequence of exploiting the intrinsic low-dimensionality of the parameter-to-observable map. We compare these approaches on several model inverse problems.
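As a rough illustration of the second approach, the sketch below evaluates the EIG for a toy linear-Gaussian problem, where the Laplace approximation is exact and the EIG reduces to half the sum of log(1 + lambda_i) over the eigenvalues lambda_i of the prior-preconditioned data-misfit Hessian. Everything here is an assumption made for illustration and is not taken from the talk: the random matrix `G` stands in for a linearized PDE parameter-to-observable map, the prior is diagonal, and `randomized_eigenvalues` is a basic randomized range finder rather than the authors' eigensolver.

```python
import numpy as np

def randomized_eigenvalues(apply_h, n, rank, oversample=10, seed=0):
    """Approximate the `rank` largest eigenvalues of a symmetric PSD operator.

    `apply_h` maps an (n, k) array to H @ array; only operator actions are
    needed, mimicking a matrix-free Hessian (hypothetical interface).
    """
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((n, rank + oversample))
    q, _ = np.linalg.qr(apply_h(omega))   # orthonormal basis for the sampled range
    t = q.T @ apply_h(q)                  # small projected matrix
    t = 0.5 * (t + t.T)                   # symmetrize against round-off
    evals = np.linalg.eigvalsh(t)         # ascending order
    return evals[::-1][:rank]

def eig_from_eigenvalues(evals):
    """EIG of a linear-Gaussian problem: 0.5 * sum(log(1 + lambda_i))."""
    return 0.5 * np.sum(np.log1p(np.clip(evals, 0.0, None)))

# Toy problem (all sizes and values are illustrative assumptions).
rng = np.random.default_rng(1)
n_param, n_obs, sigma = 200, 15, 0.1
prior_std = 1.0 / np.arange(1, n_param + 1)   # diagonal prior, decaying variances
G = rng.standard_normal((n_obs, n_param))     # stand-in for the forward map

def apply_h(v):
    """Action of C^{1/2} G^T G C^{1/2} / sigma^2 on the columns of v."""
    gv = G @ (prior_std[:, None] * v)
    return prior_std[:, None] * (G.T @ gv) / sigma**2

evals = randomized_eigenvalues(apply_h, n_param, rank=n_obs)
eig_low_rank = eig_from_eigenvalues(evals)

# Dense reference in data space: 0.5 * log det(I + G C G^T / sigma^2).
A = G * prior_std
_, logdet = np.linalg.slogdet(np.eye(n_obs) + A @ A.T / sigma**2)
eig_exact = 0.5 * logdet
```

The number of operator applications depends on the requested rank, not on `n_param`, which is the dimension-independence property the abstract refers to.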
11:00
Bayesian experimental design in high-dimensional settings: A point process approach
Jayanth Jagalur-Mohan | Massachusetts Institute of Technology | United States
Authors:
Jayanth Jagalur-Mohan | Massachusetts Institute of Technology | United States
Vishagan Ratnaswamy | Sandia National Labs | United States
Youssef Marzouk | Massachusetts Institute of Technology | United States
The task of experimental design typically involves optimizing a problem-specific objective. Most objectives, especially information theoretic ones such as entropy or mutual information, are hard to estimate even in moderately high dimensions. This renders the downstream design problem intractable. An alternative to seeking the optimal (or quasi-optimal) design is to probabilistically model the whole design space, and then to query the most likely design instance. Determinantal point processes (DPPs) are one powerful way to model the design space, and early efforts have demonstrated their efficacy. DPPs offer an efficient way to trade off diversity against quality of candidate choices, and have been tremendously useful in different machine learning applications. In the context of information theoretic experimental design, we will discuss how the underlying kernel of the DPP can be learned from the given model and data, thus providing a prescription for experimental design in many high-dimensional and non-Gaussian scenarios.
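To make the quality/diversity trade-off concrete, here is a minimal sketch of an L-ensemble DPP over candidate sensor locations, in which a subset S has probability det(L_S) / det(L + I). The kernel is hand-built from assumed quality scores and a squared-exponential similarity rather than learned from a model and data as the talk proposes, and the "most likely design" is only approximated by a simple greedy search. All function names are hypothetical.

```python
import numpy as np

def build_l_kernel(points, quality, length_scale):
    """L = diag(q) S diag(q), with S a squared-exponential similarity (assumed form)."""
    diff = points[:, None] - points[None, :]
    similarity = np.exp(-0.5 * (diff / length_scale) ** 2)
    return quality[:, None] * similarity * quality[None, :]

def log_prob_subset(L, subset):
    """Log-probability of a non-empty subset: log det(L_S) - log det(L + I)."""
    idx = np.asarray(subset, dtype=int)
    _, logdet_sub = np.linalg.slogdet(L[np.ix_(idx, idx)])
    _, logdet_norm = np.linalg.slogdet(L + np.eye(L.shape[0]))
    return logdet_sub - logdet_norm

def greedy_map(L, k):
    """Greedy approximation to the most likely size-k subset."""
    selected = []
    for _ in range(k):
        best_item, best_val = None, -np.inf
        for i in range(L.shape[0]):
            if i in selected:
                continue
            idx = selected + [i]
            sign, val = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and val > best_val:
                best_item, best_val = i, val
        if best_item is None:   # no candidate keeps the determinant positive
            break
        selected.append(best_item)
    return selected

# Twelve candidate locations on [0, 1] with equal quality (illustrative values).
points = np.linspace(0.0, 1.0, 12)
quality = np.ones(12)
L = build_l_kernel(points, quality, length_scale=0.2)

design = greedy_map(L, k=3)   # spread-out design favored by the DPP
clustered = [0, 1, 2]         # three adjacent candidates, for comparison
```

Comparing `log_prob_subset(L, design)` with `log_prob_subset(L, clustered)` shows how the determinant penalizes near-duplicate candidates; raising an entry of `quality` pulls the selection toward that location.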
11:30
Bayesian design of experiments for the calibration of computational models
Yiolanda Englezou | University of Cyprus | Cyprus
Author:
Yiolanda Englezou | University of Cyprus | Cyprus
The estimation of empirical and physical models is often performed using data collected via experimentation. Hence, the design of the experiment is crucial in determining the quality of the results. For complex models, an optimal design often depends on features, particularly model parameters, which are uncertain prior to experimentation. This dependence leads naturally to a Bayesian approach which can (a) make use of any prior information on these features, and (b) be tailored to the reduction of posterior uncertainty. Optimal Bayesian design for most realistic models including those incorporating expensive computer simulators is complicated by the need to approximate an analytically intractable expected utility; for example, the expected gain in Shannon information from the prior to posterior distribution. For models which are nonlinear in the uncertain parameters or are computationally expensive this expected gain must be approximated numerically. We propose new Monte Carlo approaches for approximate numerical integration of the expected utility that give reduced bias and computational expense compared to several existing methods. Another challenge is that, when the model incorporates an expensive computer simulator, in order to perform design optimization for the physical experiment one must use a computationally cheap surrogate model in place of the simulator.
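For reference, the expected Shannon information gain mentioned above is commonly approximated with a standard nested (double-loop) Monte Carlo estimator, sketched below. This is the textbook baseline, not the new estimators proposed in the talk. The scalar model y = d * theta + noise with a standard normal prior is an assumption chosen because its expected gain has the closed form 0.5 * log(1 + d^2 / sigma^2), and reusing one set of inner samples for every outer sample is a cost-saving choice made here.

```python
import numpy as np

def log_likelihood(y, theta, d, sigma):
    """Gaussian log-density of y given theta for the toy model y = d * theta + noise."""
    r = (y - d * theta) / sigma
    return -0.5 * r**2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)

def nested_mc_eig(d, sigma, n_outer, n_inner, rng):
    """Nested Monte Carlo estimate of the expected Shannon information gain."""
    theta_outer = rng.standard_normal(n_outer)               # prior draws
    y = d * theta_outer + sigma * rng.standard_normal(n_outer)
    theta_inner = rng.standard_normal(n_inner)               # shared inner prior draws

    log_lik = log_likelihood(y, theta_outer, d, sigma)
    # Evidence p(y_i | d) estimated by averaging the likelihood over inner draws,
    # using a log-sum-exp shift for numerical stability.
    ll_inner = log_likelihood(y[:, None], theta_inner[None, :], d, sigma)
    shift = ll_inner.max(axis=1, keepdims=True)
    log_evidence = shift[:, 0] + np.log(np.mean(np.exp(ll_inner - shift), axis=1))
    return np.mean(log_lik - log_evidence)

rng = np.random.default_rng(0)
estimate = nested_mc_eig(d=1.0, sigma=0.5, n_outer=2000, n_inner=1000, rng=rng)
exact = 0.5 * np.log1p((1.0 / 0.5) ** 2)   # closed form for this toy model
```

The inner average makes the estimator biased for finite `n_inner`, and the cost grows as the product of the two sample sizes, which is the bias and expense that improved schemes aim to reduce.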
12:00
Optimal experimental design for symbolic regression
Lior Horesh | IBM and Columbia University | United States
Authors:
R. Zhao | Massachusetts Institute of Technology | United States
Lior Horesh | IBM and Columbia University | United States
K. Clarkson | IBM | United States
Sara Magliacane | IBM | United States
In the quest for scientific discovery, often neither the functional form nor the choice of parameters of the underlying model is known in advance. In such settings, it is instrumental to perform symbolic regression while incorporating whatever forms of knowledge are at our disposal. In realistic settings, experimental data may be restricted or costly, providing limited support for any given hypothesis as to the underlying functional form. In this study, we propose a Bayesian framework for experimental design where joint model selection and parameter estimation are pursued.