Emulation and calibration have proven to be a fruitful way to get the most from simulators. The usual approach for combining simulators with field data has three main goals: (i) building a predictive model; (ii) estimating the calibration parameters that govern the system; and (iii) estimating the discrepancy between the mean response surface of the system and the computer model.
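A standard way to write this setup, in the spirit of the Kennedy-O'Hagan formulation (the notation below is a sketch for orientation, not taken from any of the talks), is

y(x) = \eta(x, \theta^*) + \delta(x) + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2),

where y(x) is a field observation at input x, \eta(x, \theta) is the simulator (or its emulator) run at the true but unknown calibration parameter \theta^*, and \delta(x) is the discrepancy between the simulator and the mean response of the physical system.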
Increasingly, there are cases where the model is fast but the code is not readily available; only a large suite of model evaluations is. In other cases, the code is fast, but the object being simulated is so large that it cannot be saved; that is, one cannot store what was computed (an increasingly common problem as exascale computers come online). In either case, experimenters are faced with constructing an emulator of the computer model to stand in for the simulator.
The common approach to emulation uses a Gaussian process (GP). Unfortunately, GPs become computationally intractable when the number of evaluations is large, and they are a poor choice for non-smooth functions.
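For reference, a minimal GP emulator fits in a few lines of NumPy; the Cholesky factorization of the n x n covariance matrix below is what makes the approach intractable for large designs. This is an illustrative sketch (the kernel choice, function names, and toy data are ours, not from the talks).

```python
import numpy as np

def sqexp_kernel(X1, X2, lengthscale=0.2, variance=1.0):
    """Squared-exponential covariance between two sets of 1-d inputs."""
    d = X1[:, None] - X2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_emulator(X, y, Xnew, noise=1e-6):
    """Fit a zero-mean GP to simulator runs (X, y) and predict at Xnew.

    Factorizing the n x n covariance matrix costs O(n^3) time and O(n^2)
    memory, which is the bottleneck when the number of runs is large.
    """
    K = sqexp_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)                                  # O(n^3)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = sqexp_kernel(Xnew, X)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = sqexp_kernel(Xnew, Xnew).diagonal() - np.sum(v**2, axis=0)
    return mean, var

# Toy usage: 50 "simulator" runs of a non-smooth function, which also
# illustrates why a stationary GP struggles with kinks.
X = np.linspace(0, 1, 50)
y = np.abs(X - 0.5)
mean, var = gp_emulator(X, y, np.linspace(0, 1, 200))
```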
This mini-symposium presents new approaches to emulation and calibration for large computer experiments. There are four talks: (i) an overview of the issues and new large-sample approximations for fast calibration; (ii) non-stationary deep GPs, fit with variational inference, for calibration of a model of binary black hole mergers; (iii) calibration in exascale computing with new in situ analyses; and (iv) knot-based methods for fast GPs.
10:30
Fast parameter calibration and prediction with large computer experiments
Matthew Plumlee | Northwestern University | United States
Author:
Matthew Plumlee | Northwestern University | United States
Calibration of inexact physical or scientific models is critical across many areas of science. One example is the building of a functional description of atomic nuclei that holds across the nuclear chart. To this point, methods have largely focused on optimization-based estimation. The reasons for this are two-fold: (i) there is a non-trivial expense associated with getting outputs from the simulator; and (ii) even in the relatively narrow region near the optimum, failures of the model are common (the rate of failure to produce an output can be close to 10%). Many of the approaches proposed thus far to produce identifiable calibration create additional computational overhead (e.g., computing many derivatives). This talk will (i) introduce model calibration and the issues that arise for large data sets; and (ii) propose a fast way to calibrate a computer model, with joint quantification of discrepancy and parameter uncertainty, by creating a quick-to-evaluate “goodness of calibration” measure. Large-sample approximations provide even faster techniques that quickly return important calibration information to the user, even when large numbers of computer simulations are completed.
11:00
Deep Gaussian process calibration for binary black hole mergers
Faezeh Yazdi | Department of Statistics and Actuarial Science, Simon Fraser University | Canada
Authors:
Faezeh Yazdi | Department of Statistics and Actuarial Science, Simon Fraser University | Canada
Derek Bingham | Department of Statistics and Actuarial Science | Canada
Ilya Mandel | School of Physics & Astronomy, Monash University | Australia
Danny Williamson | Mathematics, Exeter University | United Kingdom
Computer models, or simulators, are used to explore physical systems and to reduce the cost of collecting physical observations. The traditional approach to emulation and calibration uses stationary Gaussian processes (GPs). This work was motivated by a simulator of the chirp mass of binary black hole mergers, where no output is observed for large portions of the input space and more than 10^6 simulator evaluations are available. This poses two problems: (i) addressing the severe non-stationarity induced by the regions where no chirp mass is observed; and (ii) performing statistical inference for GPs with a large number of simulator evaluations. In this talk, we propose to use a deep Gaussian process for computer model calibration. In our setting, where the number of simulation runs is quite large, variational inference is used to approximate the posterior distribution of the statistical and computer model parameters.
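A rough sketch of the variational ingredient, using a single-layer sparse variational GP in GPflow as a stand-in for the deep GP above (the package, kernel, inducing-point count, and toy data are assumptions for illustration, not the authors' implementation):

```python
# Minibatch variational inference for a GP with many simulator evaluations.
# Single-layer stand-in for the deep GP described in the abstract.
import numpy as np
import tensorflow as tf
import gpflow

N = 100_000                                        # many simulator runs
X = np.random.rand(N, 2)
Y = np.sin(6 * X[:, :1]) * X[:, 1:] + 0.05 * np.random.randn(N, 1)

Z = X[np.random.choice(N, 200, replace=False)]     # inducing inputs
model = gpflow.models.SVGP(
    kernel=gpflow.kernels.Matern52(lengthscales=[0.3, 0.3]),
    likelihood=gpflow.likelihoods.Gaussian(),
    inducing_variable=Z,
    num_data=N,
)

batches = iter(
    tf.data.Dataset.from_tensor_slices((X, Y)).repeat().shuffle(N).batch(512)
)
loss = model.training_loss_closure(batches)        # negative ELBO on minibatches
opt = tf.optimizers.Adam(0.01)
for _ in range(2_000):
    opt.minimize(loss, model.trainable_variables)

mean, var = model.predict_f(X[:5])                 # posterior mean and variance
```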
11:30
In situ inference for exascale uncertainty quantification
Earl Lawrence | Statistical Sciences, Los Alamos National Laboratory | United States
Authors:
Earl Lawrence | Statistical Sciences, Los Alamos National Laboratory | United States
Michael Grosskopf | Statistical Sciences, Los Alamos National Laboratory | United States
Mary Frances Dorn | Statistical Sciences, Los Alamos National Laboratory | United States
Ayan Biswas | Statistical Sciences, Los Alamos National Laboratory | United States
Nathan Urban | Statistical Sciences, Los Alamos National Laboratory | United States
As simulations generate ever-increasing amounts of data, there are correspondingly richer opportunities for scientific discovery, discoveries that will be missed if most of the data must be discarded before it is analyzed. Because future exascale architectures will be increasingly storage-limited, it will not be possible to save the vast majority of simulation data for later analysis, requiring analysis to occur “in situ” within the simulation. However, existing in situ data analysis frameworks provide little support for statistical modeling or uncertainty quantification.
In this talk, I will present initial results from work to develop an approach to in situ inference that can be applied to UQ problems at exascale. Our goal is fast, scalable, distributed inference for Gaussian spatial process models that can be fit to simulation data directly or used as part of hierarchical Bayesian models for simulation quantities of interest. I will cover preliminary work in the domains of climate and space weather.
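The abstract leaves the inference machinery to the talk itself; the toy sketch below only illustrates the in situ constraint: each batch of simulation output is reduced to running sufficient statistics (here a mean and covariance) and then discarded, so the raw field never has to be stored. The class and the choice of statistics are illustrative assumptions, not the authors' method.

```python
import numpy as np

class InSituMoments:
    """Accumulate the mean and covariance of simulation output without storing it.

    Each call to update() consumes one batch (e.g., one time step's field,
    flattened to a length-d vector per row), after which the batch can be
    discarded by the simulation.
    """
    def __init__(self, d):
        self.n = 0
        self.sum = np.zeros(d)
        self.outer = np.zeros((d, d))

    def update(self, batch):                   # batch: (batch_size, d) array
        self.n += batch.shape[0]
        self.sum += batch.sum(axis=0)
        self.outer += batch.T @ batch

    def mean(self):
        return self.sum / self.n

    def cov(self):
        m = self.mean()
        return self.outer / self.n - np.outer(m, m)

# Toy usage: stream 1,000 "time steps" of a 10-dimensional field.
acc = InSituMoments(d=10)
for _ in range(1000):
    acc.update(np.random.randn(32, 10))        # analyzed, then thrown away
print(acc.mean().shape, acc.cov().shape)
```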
12:00
One-at-a-time knot selection for approximate Gaussian processes
Jarad Niemi | Department of Statistics, Iowa State University | United States
Authors:
Jarad Niemi | Department of Statistics, Iowa State University | United States
Nate Garton | Department of Statistics, Iowa State University | United States
Alicia Carriquiry | Department of Statistics, Iowa State University | United States
For a large number of observations, the matrix inversions and determinants required by full Gaussian process (GP) models become computationally intractable. We use a knot-based approximation to the full GP model and investigate a greedy approach to the sequential addition of knots, called one-at-a-time (OAT) selection. New knot locations are proposed by modeling the marginal likelihood with a meta GP, and the number of knots is determined automatically by an overall convergence criterion. We demonstrate that, compared to simultaneous knot selection, this methodology has computational advantages and improved predictive performance despite using fewer knots. In addition, the methodology can be used in regression, classification, and other data models. We demonstrate the methodology on the Water Erosion Prediction Project computer model.
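A stripped-down sketch of the OAT idea: score candidate knots by the marginal likelihood of a subset-of-regressors (Nyström) GP approximation and greedily add the best one until the gain is small. For clarity it evaluates a fixed candidate grid directly rather than proposing locations through the meta GP described above, and the naive n x n likelihood below ignores the Woodbury identities a real implementation would exploit; the names, toy data, and stopping rule are illustrative, not the authors' code.

```python
import numpy as np

def kern(A, B, ls=0.2):
    """Squared-exponential kernel between 1-d input sets."""
    return np.exp(-0.5 * ((A[:, None] - B[None, :]) / ls) ** 2)

def sor_loglik(X, y, knots, noise=0.05):
    """Marginal log-likelihood under a subset-of-regressors GP approximation
    with covariance K_xu K_uu^{-1} K_ux + noise * I.  (A real implementation
    would use Woodbury identities for O(n k^2) cost instead of this n x n form.)"""
    Kuu = kern(knots, knots) + 1e-8 * np.eye(len(knots))
    Kxu = kern(X, knots)
    C = Kxu @ np.linalg.solve(Kuu, Kxu.T) + noise * np.eye(len(X))
    L = np.linalg.cholesky(C)
    a = np.linalg.solve(L, y)
    return -0.5 * (a @ a) - np.log(np.diag(L)).sum() - 0.5 * len(X) * np.log(2 * np.pi)

def oat_knots(X, y, candidates, tol=1e-2, max_knots=15):
    """Greedily add one knot at a time until the log-likelihood gain is small."""
    knots, best = np.empty(0), -np.inf
    while len(knots) < max_knots:
        scores = [sor_loglik(X, y, np.append(knots, c)) for c in candidates]
        gain = max(scores) - best
        if len(knots) > 0 and gain < tol:       # overall convergence criterion
            break
        knots = np.append(knots, candidates[int(np.argmax(scores))])
        best = max(scores)
    return knots

# Toy usage: 500 noisy observations of a smooth function on [0, 1].
X = np.random.rand(500)
y = np.sin(2 * np.pi * X) + 0.05 * np.random.randn(500)
print(oat_knots(X, y, candidates=np.linspace(0, 1, 20)))
```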