Deep learning techniques are becoming the center of attention across many scientific disciplines. Many predictive tasks are currently tackled using over-parameterized, black-box discriminative models such as deep neural networks, in which interpretability and robustness are often sacrificed in favor of representational flexibility and computational scalability. Such models have yielded remarkable results in data-rich domains, yet their effectiveness in data-scarce and risk-sensitive tasks remains questionable, primarily due to open challenges in statistical inference and uncertainty quantification. This mini-symposium invites contributions on uncertainty quantification methods for deep learning and their application in the physical and engineering sciences. Topics include (but are not limited to) Bayesian neural networks, deep generative models, posterior inference techniques, and applications to forward/inverse problems, active learning, Bayesian optimization, and reinforcement learning.
14:00
- Moved from CT16 - Encoder-decoder architectures for PDE-based inference problems
Thomas O'Leary-Roseberry | The University of Texas at Austin | United States
Authors:
Thomas O'Leary-Roseberry | The University of Texas at Austin | United States
Umberto Villa | Washington University in St. Louis | United States
Peng Chen | The University of Texas at Austin | United States
Omar Ghattas | The University of Texas at Austin | United States
In this work we investigate architecture selection for PDE-based quantity-of-interest surrogate modeling, with applications to UQ. Encoder-decoder networks have been successful as a form of nonlinear compression. For a properly designed encoder-decoder network, the dimension of the innermost layer (the code) is the inherent dimension of the input-output mapping. How to identify architectures that parameterize this compressive mapping is an open question; we use model structure and insights from nonlinear matrix factorizations to suggest architectures that can capture the inherent dimension of the problem.
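To illustrate the abstract's central idea, a minimal numpy sketch (the toy map and all names below are illustrative, not the authors' construction): when an input-output map factors through a low-dimensional code, its Jacobian has rank at most the code dimension, which suggests the bottleneck width an encoder-decoder should use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy input-output map y = B @ tanh(A @ x): it factors through a
# 3-dimensional "code", so its Jacobian has rank at most 3 everywhere.
d_in, d_code, d_out = 20, 3, 15
A = rng.standard_normal((d_code, d_in))
B = rng.standard_normal((d_out, d_code))

def jacobian(x):
    # Chain rule for the toy map: J = B @ diag(1 - tanh(Ax)^2) @ A
    s = 1.0 - np.tanh(A @ x) ** 2
    return B @ (s[:, None] * A)

x = rng.standard_normal(d_in)
sing_vals = np.linalg.svd(jacobian(x), compute_uv=False)
numerical_rank = int(np.sum(sing_vals > 1e-10 * sing_vals[0]))
print(numerical_rank)  # -> 3: a bottleneck (code) of width 3 suffices
```

In practice the map is only available through samples or derivatives of an expensive PDE solve, but the same rank structure is what motivates choosing the code dimension from model structure rather than by trial and error.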
14:30
Benchmarking and Exploiting Predictive Correlations in Deep Learning
Shengyang Sun | University of Toronto | Canada
Authors:
Shengyang Sun | University of Toronto | Canada
Roger Grosse | University of Toronto | Canada
Most applications of uncertainty in deep learning employ only marginal uncertainty estimates, i.e., the predictive means and variances at individual input locations. But estimates of predictive correlations between different input locations can tell us even more. In this talk, we investigate how accurately various Bayesian models and algorithms can estimate predictive correlations, and how effectively these estimates can be used to guide exploration. First, we look at transductive active learning, where the learning algorithm chooses which data points are to be labeled, and where the test locations are known in advance. We then look at cost-sensitive Bayesian optimization, where one tries to minimize a black-box function, but some queries are much more expensive than others. In both cases, we find that it is essential to both (1) use a prior that reflects the structure of the problem, and (2) use a criterion that encourages indirect exploration.
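A concrete (hypothetical) illustration of the marginal-versus-correlation distinction described above: the numpy sketch below computes a Gaussian-process posterior with an RBF kernel and reads off both the marginal standard deviations and the predictive correlation between two nearby test locations. The kernel, data, and lengthscale are illustrative choices, not those from the talk.

```python
import numpy as np

def rbf(X1, X2, ell=0.5):
    # Squared-exponential kernel k(x, x') = exp(-|x - x'|^2 / (2 ell^2))
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / ell**2)

# Tiny 1-D training set and two nearby test locations
X_train = np.array([-2.0, 0.0, 2.0])
y_train = np.sin(X_train)
X_test = np.array([0.9, 1.0])
noise = 1e-4

# Standard GP posterior: mean and full predictive covariance
K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
Ks = rbf(X_train, X_test)
Kss = rbf(X_test, X_test)
mean = Ks.T @ np.linalg.solve(K, y_train)
cov = Kss - Ks.T @ np.linalg.solve(K, Ks)

# Marginal uncertainty keeps only the diagonal; the off-diagonal entry
# is the predictive correlation that marginal-only methods discard.
std = np.sqrt(np.diag(cov))
corr = cov[0, 1] / (std[0] * std[1])
print(std, corr)  # nearby points are strongly positively correlated
```

Acquisition criteria for transductive active learning or cost-sensitive Bayesian optimization can exploit `corr`-type quantities: querying one location is informative about another precisely to the extent that their predictions are correlated.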
15:00
- Moved from CT14 - Deep probabilistic learning of reduced dynamics of multiscale systems in the Small Data regime
Sebastian Kaltenbach | Technical University of Munich | Germany
Authors:
Sebastian Kaltenbach | Technical University of Munich | Germany
Phaedon-Stelios Koutsourelakis | Technical University of Munich | Germany
Dynamical models of physical and engineering systems are high-dimensional, nonlinear, and exhibit multiscale behavior. They are frequently imbued with uncertainty in their parameters and initial conditions, in the measurements that are available, and in the validity of the governing equations themselves. In this work we utilize data obtained by computer simulations in order to construct effective, coarse-grained dynamical models that overcome the aforementioned scale limitations and are predictive of the system's macroscale behavior. We argue that a direct application of state-of-the-art machine learning tools to such problems is ill-advised, as data is limited due to its expense and the task at hand is extrapolative rather than interpolative. The success of such an endeavor in the Small Data regime relies on accounting for domain knowledge, which in our case appears in the form of physical constraints, invariances, etc., in the machine learning objectives. We demonstrate a fully Bayesian framework that enables the incorporation of such physical information and produces probabilistic predictive estimates that account for the unavoidable information loss, in a variety of high-dimensional dynamical systems.
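The generic coarse-graining idea can be sketched in a few lines of numpy (a toy sketch under strong assumptions, not the authors' probabilistic framework): high-dimensional observations are driven by a single slow latent mode, a linear projection recovers that mode, and a one-step linear reduced model captures the macroscale persistence.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy multiscale data: a slow scalar mode z_t drives 50 observed
# dimensions through a fixed loading vector, plus fast small-scale noise.
T, d = 2000, 50
a_true = 0.99                      # slow-mode persistence
w = rng.standard_normal(d)
w /= np.linalg.norm(w)
z = np.zeros(T)
for t in range(1, T):
    z[t] = a_true * z[t - 1] + 0.1 * rng.standard_normal()
X = z[:, None] * w[None, :] + 0.01 * rng.standard_normal((T, d))

# Coarse-graining: project onto the leading principal direction ...
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
z_hat = Xc @ Vt[0]

# ... and fit a one-step linear reduced model z_{t+1} ≈ a * z_t
a_hat = (z_hat[:-1] @ z_hat[1:]) / (z_hat[:-1] @ z_hat[:-1])
print(a_hat)  # close to the true slow-mode persistence 0.99
```

A fully Bayesian treatment as in the abstract would additionally place priors over the projection and dynamics, encode physical constraints, and propagate the information lost in coarse-graining into predictive uncertainty.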
15:30
Deep probabilistic reduced-order models accounting for physics by using virtual observables
Maximilian Rixner | Technical University of Munich | Germany
Authors:
Maximilian Rixner | Technical University of Munich | Germany
Phaedon-Stelios Koutsourelakis | Technical University of Munich | Germany
While today it is in principle within our ability to simulate a wide array of complex systems governed by partial differential equations, the increase in available computing power lags behind our ambitions pertaining to complexity and scale. This issue of numerical cost becomes particularly pronounced when considering multi-scale systems or many-query tasks such as uncertainty propagation or inverse problems. This predicament has given rise to a pronounced interest in emulators as well as reduced-order models, which enable (significantly) faster predictions given a one-time upfront cost.
We argue that simply using pairs of input/output data in conjunction with state-of-the-art machine learning tools (e.g., deep neural networks) is not advisable, as this is by definition a Small Data problem. In order to overcome this limitation, as well as produce reduced-order models that can be used under extrapolative conditions (e.g., different initial/boundary conditions), prior physical knowledge should be reflected in the reduced-order models as well as in the machine learning objectives. We propose a generative probabilistic model in which physical constraints are introduced by employing the concept of virtual observables. The resulting model can thus be trained with labeled or unlabeled data (i.e., inputs of the PDE without corresponding outputs).
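A minimal sketch of the virtual-observables idea in a conjugate Gaussian setting (illustrative only, not the authors' generative model): a physical constraint r(θ) = 0 is injected into the posterior by pretending the residual was "observed" to be zero with very small noise, here for Bayesian linear regression with the hypothetical constraint that the coefficients sum to one.

```python
import numpy as np

rng = np.random.default_rng(2)

# Bayesian linear regression y = Phi @ theta + noise, prior theta ~ N(0, I)
n, p = 5, 3                        # few labeled data: Small Data regime
Phi = rng.standard_normal((n, p))
theta_true = np.array([0.2, 0.3, 0.5])   # note: sums to 1 (the "physics")
sigma = 0.5
y = Phi @ theta_true + sigma * rng.standard_normal(n)

def gaussian_posterior_mean(H, obs):
    # Conjugate update with unit-variance rows: precision = I + H^T H
    Prec = np.eye(p) + H.T @ H
    return np.linalg.solve(Prec, H.T @ obs)

# Without the constraint: ordinary Bayesian update on the data alone
mean_plain = gaussian_posterior_mean(Phi / sigma, y / sigma)

# Virtual observable: append one fictitious row saying the residual
# sum(theta) - 1 was observed to be 0 with tiny noise (std 1e-3).
c = np.ones((1, p))
H_aug = np.vstack([Phi / sigma, c / 1e-3])
obs_aug = np.concatenate([y / sigma, [1.0 / 1e-3]])
mean_virt = gaussian_posterior_mean(H_aug, obs_aug)

print(mean_plain.sum(), mean_virt.sum())  # constrained posterior sums ≈ 1
```

The same mechanism extends to nonlinear PDE residuals evaluated at unlabeled inputs, which is what allows training without any corresponding outputs.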