The challenge of acquiring the most valuable data from experiments, whether for inference, prediction, design, or control, has received substantial attention in statistics, applied mathematics, science, and engineering. This task can be formalized through the framework of optimal experimental design (OED). Models of the experimental conditions and processes, both physical and statistical, can be particularly useful for arriving at optimal designs. However, model-based OED faces many challenges, including problem formulation, the choice of optimality criteria, the computation of information metrics, the handling of nonlinear responses and non-Gaussian distributions, and expensive, dynamically evolving simulations. This minisymposium brings together researchers working on model-based OED, covering both computational and applications-oriented developments.
14:00
Optimal Bayesian Design of Thermal Desorption Spectroscopy (TDS) and Temperature-Programmed Desorption (TPD) Experiments
Udo von Toussaint | Max Planck Institute for Plasma Physics | Germany
Authors:
Udo von Toussaint | Max Planck Institute for Plasma Physics | Germany
Roland Preuss | Max Planck Institute for Plasma Physics | Germany
TDS and TPD are ubiquitous experimental methods in surface physics and provide information about the binding energies of adsorbed species. However, the analysis of the measured spectra is an ill-posed problem and often results in conflicting energy assignments. Here we show that optimized measurement protocols (i.e., the choice of heating rates and holding times) based on Bayesian experimental design yield a significantly increased information gain. The approach is illustrated with TPD data from amorphous hydrocarbon films, and some pitfalls are outlined. The necessary marginalisations are tackled with a combination of analytic approaches and the Nested Sampling algorithm.
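To make the design question concrete, the following is a minimal sketch (not the authors' code) of comparing heating rates by expected information gain. It assumes a first-order Polanyi-Wigner desorption model, a uniform grid prior over a single binding energy, and Gaussian measurement noise; `tpd_spectrum`, `expected_info_gain`, and all parameter values are illustrative assumptions.

```python
import numpy as np

kB = 8.617e-5  # Boltzmann constant in eV/K

def tpd_spectrum(E, beta, nu=1e13, T0=300.0, theta0=1.0, n_steps=2000):
    """First-order Polanyi-Wigner desorption rate while heating at beta K/s."""
    dt = (700.0 / beta) / n_steps            # heat from T0 to T0 + 700 K
    theta, rates = theta0, np.empty(n_steps)
    for i in range(n_steps):
        T = T0 + beta * i * dt
        r = nu * theta * np.exp(-E / (kB * T))   # desorption rate
        theta = max(theta - r * dt, 0.0)         # deplete surface coverage
        rates[i] = r
    return rates

def expected_info_gain(beta, E_grid, sigma=0.05, n_outer=200, seed=0):
    """Monte Carlo estimate of the expected KL gain about E at heating rate beta."""
    rng = np.random.default_rng(seed)
    spectra = np.array([tpd_spectrum(E, beta) for E in E_grid])
    prior = np.full(len(E_grid), 1.0 / len(E_grid))
    gain = 0.0
    for _ in range(n_outer):
        idx = rng.integers(len(E_grid))          # "true" E drawn from the prior
        y = spectra[idx] + sigma * rng.standard_normal(spectra.shape[1])
        loglik = -0.5 * np.sum((y - spectra) ** 2, axis=1) / sigma**2
        post = prior * np.exp(loglik - loglik.max())
        post /= post.sum()
        gain += np.sum(post * np.log(np.maximum(post, 1e-300) / prior)) / n_outer
    return gain

E_grid = np.linspace(0.9, 1.3, 9)                # candidate binding energies (eV)
gains = {beta: expected_info_gain(beta, E_grid) for beta in (1.0, 10.0)}
```

A real protocol optimisation would search over full heating-rate schedules and holding times rather than a single constant rate, and would marginalise over nuisance parameters as the abstract describes.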
14:30
Bayesian optimal experimental design for chemical rate constant measurement using mass spectrometry
James Oreluk | Sandia National Laboratories | United States
Authors:
James Oreluk | Sandia National Laboratories | United States
Leonid Sheps | Sandia National Laboratories | United States
Habib Najm | Sandia National Laboratories | United States
Experiments are one of the primary means by which scientists gain a deeper understanding of physical phenomena; however, experimentation is laborious and costly. One way to accelerate ongoing research is to identify the experimental conditions that would yield the largest reduction in the uncertainty of a parameter of interest. In this talk, we explore sequential Bayesian optimal experimental design to guide future experiments on a chemically reacting system under both model and parameter uncertainty. Specifically, we study the gas-phase reactions occurring in a high-pressure photolysis reactor coupled to a mass spectrometer. In this experiment, the evolution of numerous highly reactive intermediates and product species is measured simultaneously through real-time changes in the mass spectrum across a range of ionizing beam energies. We employ a hierarchical Bayesian analysis to infer the uncertain physics and instrument-model parameters from these measurements. An optimal experimental design is found by selecting the conditions for the physics and instrument models that maximize the expected information gain over the constrained design space. We discuss our findings, illustrating the key challenges encountered when dealing with a large collection of experimental data.
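The expected-information-gain criterion can be illustrated with the standard nested Monte Carlo estimator. The sketch below uses a hypothetical one-parameter first-order decay model in place of the actual photolysis-reactor physics; the model, prior, and noise level are assumptions for illustration only.

```python
import numpy as np

# Hypothetical toy model standing in for the reactor chemistry: a single
# rate constant k with observable c(t) = exp(-k * t) + Gaussian noise,
# and the measurement time t as the design variable.
def nested_mc_eig(d, sigma=0.05, n_outer=500, n_inner=500, seed=0):
    """Nested Monte Carlo estimate of the expected information gain at design d."""
    rng = np.random.default_rng(seed)
    k_outer = rng.lognormal(0.0, 0.5, n_outer)   # prior draws for the outer loop
    k_inner = rng.lognormal(0.0, 0.5, n_inner)   # fresh prior draws for the evidence
    y = np.exp(-k_outer * d) + sigma * rng.standard_normal(n_outer)
    # The Gaussian normalising constant cancels between likelihood and evidence.
    log_lik = -0.5 * ((y - np.exp(-k_outer * d)) / sigma) ** 2
    # log evidence log[(1/M) sum_j p(y_i | k_j, d)], vectorised over i
    resid = (y[:, None] - np.exp(-k_inner * d)[None, :]) / sigma
    log_evid = np.logaddexp.reduce(-0.5 * resid**2, axis=1) - np.log(n_inner)
    return np.mean(log_lik - log_evid)           # EIG = E[log lik - log evid]

# Pick the most informative measurement time from a grid of candidates.
designs = np.linspace(0.1, 5.0, 10)
eigs = [nested_mc_eig(d) for d in designs]
best = designs[int(np.argmax(eigs))]
```

A sequential design, as in the talk, would repeat this optimisation after each experiment with the posterior taking the place of the prior.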
15:00
High dimensional optimal design using stochastic gradient optimisation and Fisher information gain
Sophie Harbisher | Newcastle University | United Kingdom
Authors:
Sophie Harbisher | Newcastle University | United Kingdom
Colin Gillespie | Newcastle University | United Kingdom
Dennis Prangle | Newcastle University | United Kingdom
Finding high-dimensional designs is increasingly important in applications of experimental design but is computationally demanding under existing methods. We introduce an efficient approach that applies recent advances in stochastic gradient optimisation. To allow rapid gradient calculations, we work with a computationally convenient utility function: the trace of the Fisher information. We provide a decision-theoretic justification for this utility, analogous to the work of Bernardo (1979) on the Shannon information gain; due to this similarity, we refer to our utility as the Fisher information gain. We compare our optimisation scheme, SGO-FIG, to existing state-of-the-art methods and show that our approach is quicker at finding designs that maximise expected utility, producing designs with hundreds of choices in under a minute in one example.
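As a rough illustration of the Fisher information gain idea, the sketch below maximises the expected trace of the Fisher information by plain stochastic gradient ascent for a hypothetical exponential-decay model, where the per-point information and its gradient have closed forms. This is not the authors' SGO-FIG implementation; the model, prior, and tuning constants are all assumptions.

```python
import numpy as np

# Toy model: y_i = exp(-theta * t_i) + noise, design = measurement times t.
# For this model the Fisher information about theta is
#   I(theta; t) = (1/sigma^2) * sum_i (t_i * exp(-theta * t_i))^2,
# so the utility (trace of the FIM) and its gradient are analytic.
def utility_and_grad(t, theta, sigma=0.1):
    g = (t * np.exp(-theta * t)) ** 2 / sigma**2             # per-point information
    grad = 2 * t * np.exp(-2 * theta * t) * (1 - theta * t) / sigma**2
    return g.sum(), grad

def sgo_fig(n_times=100, n_iters=2000, lr=1e-3, seed=0):
    """Maximise E_theta[trace of the FIM] over measurement times by SGA."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.1, 3.0, n_times)                       # initial design
    for _ in range(n_iters):
        theta = rng.lognormal(0.0, 0.25)                     # one prior draw per step
        _, grad = utility_and_grad(t, theta)                 # unbiased gradient estimate
        t = np.clip(t + lr * grad, 1e-3, 10.0)               # ascend; keep times feasible
    return t

t_opt = sgo_fig()
```

Because one prior sample per iteration yields an unbiased gradient of the expected utility, the scheme scales to hundreds of design coordinates; the real method additionally handles general models where the information gradient is not available in closed form.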
15:30
Asymptotically exact optimal experimental design using local surrogate models to design aquifer monitoring strategies
Andrew D. Davis | U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory | United States
Authors:
Andrew D. Davis | U.S. Army Corps of Engineers, Cold Regions Research and Engineering Laboratory | United States
Xun Huan | University of Michigan | United States
Youssef Marzouk | Massachusetts Institute of Technology | United States
Optimal Bayesian experimental design seeks to maximize the expected information gain on a quantity of interest (e.g., a model parameter), which can be formally quantified via the Kullback-Leibler (KL) divergence from the posterior distribution to the prior distribution. However, numerical evaluation of this expected utility, e.g., via a nested Monte Carlo formulation, is often intractable due to the expensive forward models required by many practical applications. We therefore introduce a computationally cheaper surrogate model of the likelihood function to bypass this obstacle. This strategy faces two challenges: (i) the surrogate model introduces bias into the estimate of the KL divergence, and (ii) the surrogate may not be integrable and thus not a proper probability density function. We show that refining the surrogate model ensures asymptotic decay of the surrogate bias, and that using a bias-variance tradeoff to trigger refinements yields a rate-optimal strategy. We also leverage high-performance computing resources to generate Monte Carlo samples across many parallel cores. By exchanging expensive likelihood evaluations across all cores, we build a shared surrogate model and further reduce the computational cost. We apply this framework to find optimal sensor placements in an unconfined aquifer in order to infer its transmissivity field.
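The role of surrogate refinement can be illustrated in one dimension: replace an "expensive" forward model inside a nested Monte Carlo information-gain estimate with a piecewise-linear interpolant built from cached evaluations, and watch the induced bias shrink as the surrogate is refined. Everything below (the forward model, prior, and noise level) is a hypothetical stand-in, not the authors' aquifer application.

```python
import numpy as np

def forward(theta):                        # stand-in for an expensive forward model
    return np.sin(3 * theta) + 0.5 * theta

def make_surrogate(n_train):
    """Piecewise-linear surrogate built from n_train cached model evaluations."""
    xs = np.linspace(-2.0, 2.0, n_train)
    ys = forward(xs)
    return lambda theta: np.interp(theta, xs, ys)

def eig(model, sigma=0.1, n_outer=400, n_inner=400, seed=1):
    """Nested Monte Carlo expected information gain under a given model."""
    rng = np.random.default_rng(seed)
    th_out = rng.uniform(-2.0, 2.0, n_outer)   # prior draws for the outer loop
    th_in = rng.uniform(-2.0, 2.0, n_inner)    # prior draws for the evidence
    y = model(th_out) + sigma * rng.standard_normal(n_outer)
    log_lik = -0.5 * ((y - model(th_out)) / sigma) ** 2
    resid = (y[:, None] - model(th_in)[None, :]) / sigma
    log_evid = np.logaddexp.reduce(-0.5 * resid**2, axis=1) - np.log(n_inner)
    return np.mean(log_lik - log_evid)

# Fixing the Monte Carlo seed isolates the surrogate-induced bias, which
# decays as the surrogate is refined with more cached evaluations.
exact = eig(forward)
biases = [abs(eig(make_surrogate(n)) - exact) for n in (5, 20, 80)]
```

The adaptive method in the talk goes further: it refines only where a bias estimate exceeds the Monte Carlo variance, and it shares the evaluation cache across parallel workers so each expensive model solve benefits every core.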