Probabilistic and deterministic kernel methods have proven very useful and versatile for a number of classification, density estimation, and prediction problems arising in science and society. Yet these methods are often treated as black boxes, and the expressiveness afforded by the choice of the underlying positive definite kernel is classically underestimated. This double minisymposium gathers researchers from various horizons who have been investigating the incorporation of physical and other structural information into kernel methods, in contexts such as Gaussian Process (GP) modelling, adaptive Bayesian integration, space-filling design with minimum-energy measures versus maximum mean discrepancy, and probabilistic prediction of probability density fields. In Part I, the emphasis is on the incorporation of physical laws and boundary information in GP-related models, with applications in fields including electromagnetism, mechanics, geophysics and biology. In Part II, the focus is more specifically on kernels and distances for space-filling design, image-valued GP modelling, high-dimensional integration, and the assessment of predictions of probability density fields by spatial logistic Gaussian and related models.
14:00
Bounded approximations of singular kernels: minimum energy measures, maximum mean discrepancy and space-filling design
Luc Pronzato | CNRS | France
We consider the construction of space-filling designs that minimize the Maximum Mean Discrepancy (MMD) between the empirical measure of the design points and the uniform measure on a given compact set. Singular kernels have great potential in this context, owing to their intrinsic repelling property and, in some cases, the absence of tuning parameters (in particular, the correlation length when the kernel corresponds to a correlation function). The logarithmic kernel k(x,x') = -log ||x-x'|| is a typical example. Although singular kernels can be used for MMD minimization with the Frank-Wolfe algorithm, the energy of a discrete measure is then not defined and there is no naturally associated RKHS. In this talk, we investigate the properties of a family of bounded kernels (each with an associated RKHS) that approximate singular kernels, obtained by approximating a Completely Monotone (CM) function with a singularity at 0 by a sequence of bounded CM functions. The accuracy of the approximation is highlighted by comparing the discrete minimum-energy measure of the original singular kernel with the measure obtained numerically from a discrete approximation constructed with the bounded kernel. Various examples illustrate the construction.
Joint work with Anatoly Zhigljavsky (Cardiff University, UK)
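The repelling mechanism behind such kernel-based designs can be sketched with a greedy Frank-Wolfe (kernel-herding) iteration. The following toy Python example is only an illustration, not the construction of the talk: it bounds the logarithmic kernel by the ad hoc shift -log(||x-x'|| + eps) rather than via CM-function approximation, and it approximates the uniform measure on [0,1]^2 by a Monte Carlo candidate set.

```python
import numpy as np

rng = np.random.default_rng(0)

def k_eps(a, b, eps=1e-2):
    # Bounded surrogate for the singular logarithmic kernel -log ||x - x'||:
    # shifting the distance by eps keeps the kernel finite on the diagonal.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return -np.log(d + eps)

# Candidate set discretising the unit square; the uniform target measure
# is approximated by the empirical measure on these candidates.
cand = rng.random((1000, 2))
K = k_eps(cand, cand)
potential = K.mean(axis=1)  # Monte Carlo estimate of int k(x, x') dmu(x')

# Greedy Frank-Wolfe step: each new design point minimises the current
# directional derivative of the squared MMD over the candidate set.
design = []
running = np.zeros(len(cand))  # sum of k(., chosen points)
for n in range(20):
    crit = running / max(n, 1) - potential if n else -potential
    j = int(np.argmin(crit))
    design.append(j)
    running += K[:, j]

pts = cand[design]  # 20 space-filling points in [0, 1]^2
```

Because a chosen point contributes the large self-interaction -log(eps) to the criterion, re-selecting or clustering near existing design points is strongly penalised, which is the discrete analogue of the repelling property mentioned above.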
14:30
In High-Dimensional Integration, Structure is Everything
Motonobu Kanagawa | Eurecom Sophia Antipolis | France
Integration is an ancient task that plays a profound role in many fields, yet it remains far from solved. Adaptive Bayesian Integration is a recent probabilistic-numerical framework that has produced modest but interesting empirical performance (outperforming MCMC in wall-clock time on problems of modest size), primarily by leveraging structural prior information. Scaling it to the high-dimensional setting remains challenging. I will present recent work that introduces formal performance guarantees for adaptive Bayesian Quadrature (ABQ) and lays out a space of models for future consideration. It seems clear that improving upon MCMC in high-dimensional settings (with ABQ or otherwise) will require strong structural priors.
Joint work with Philipp Hennig and Alexandra Gessner.
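For readers unfamiliar with the non-adaptive starting point, plain Bayesian quadrature places a GP prior on the integrand and integrates the posterior mean in closed form. A minimal one-dimensional sketch, assuming an RBF kernel and a standard Gaussian integration measure (both choices are illustrative, not those of the talk):

```python
import numpy as np

def bq_estimate(f, nodes, ell=0.8, jitter=1e-8):
    """Vanilla Bayesian quadrature for int f(x) N(x; 0, 1) dx.

    Uses the RBF kernel k(x, y) = exp(-(x - y)^2 / (2 ell^2)), whose kernel
    mean against the standard Gaussian is available in closed form.
    """
    X = np.asarray(nodes, dtype=float)
    K = np.exp(-(X[:, None] - X[None, :]) ** 2 / (2 * ell**2))
    # Closed-form kernel mean z_i = int k(x_i, y) N(y; 0, 1) dy.
    z = ell / np.sqrt(ell**2 + 1) * np.exp(-X**2 / (2 * (ell**2 + 1)))
    weights = np.linalg.solve(K + jitter * np.eye(len(X)), z)
    return weights @ f(X)

# E[x^2] under N(0, 1) is exactly 1; the BQ posterior mean recovers it
# closely from 15 fixed evaluations.
nodes = np.linspace(-3, 3, 15)
est = bq_estimate(lambda x: x**2, nodes)
```

The adaptive variants discussed in the talk replace the fixed node set by nodes chosen sequentially from the GP posterior, which is where structural priors become decisive.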
15:00
Learning Invariances using the Marginal Likelihood
Mark van der Wilk | Imperial College London | United Kingdom
We present a practical way of introducing convolutional structure into Gaussian processes, making them more suited to high-dimensional inputs like images. The main contribution of our work is the construction of an inter-domain inducing point approximation that is well-tailored to the convolutional kernel. This allows us to gain the generalisation benefit of a convolutional kernel, together with fast but accurate posterior inference. We investigate several variations of the convolutional kernel, and apply it to MNIST and CIFAR-10, where we obtain significant improvements over existing Gaussian process models. We also show how the marginal likelihood can be used to find an optimal weighting between convolutional and RBF kernels to further improve performance. This illustration of the usefulness of the marginal likelihood may help automate discovering architectures in larger models.
Joint work with Matthias Bauer, ST John, James Hensman.
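The convolutional structure in question can be sketched as an additive kernel that averages a base kernel over all pairs of patches of two images. A minimal NumPy illustration, assuming an RBF base kernel on 3x3 patches (a simplification of the weighted variants investigated in the talk):

```python
import numpy as np

def patches(img, p=3):
    # Extract all p x p patches of a 2-D image, flattened to vectors.
    H, W = img.shape
    return np.array([img[i:i + p, j:j + p].ravel()
                     for i in range(H - p + 1)
                     for j in range(W - p + 1)])

def conv_kernel(x, y, p=3, ell=1.0):
    """Additive convolutional kernel: the average of an RBF base kernel
    over all pairs of patches of the two images."""
    P, Q = patches(x, p), patches(y, p)
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ell**2)).mean()

rng = np.random.default_rng(0)
a, b = rng.random((5, 5)), rng.random((5, 5))
kab = conv_kernel(a, b)
```

Since the kernel is an inner product of averaged patch feature maps, it is positive definite, and sharing the base kernel across patch locations is what buys the translation-related generalisation benefit on image inputs.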
15:30
Probabilistic prediction of probability density fields: how to assess and compare predictive performances?
Athénaïs Gautier | Idiap Research Institute and University of Bern | Switzerland
In a number of contexts ranging from spatial statistics to stochastic optimization, it is of interest to estimate probability density functions varying across physical and parameter spaces. In the presented work, we focus on the specific case where such density fields must be estimated from samples of heterogeneous sizes across the index set of interest, and where not only the parameters but also the shape, multi-modality and other features of the target distributions can vary over space. As adapting geostatistical density-field estimation approaches to this framework raises a number of theoretical and methodological issues, we developed a non-parametric Bayesian approach relying on a spatial extension of the logistic Gaussian model. Here we focus in particular on evaluating the predictive performance of our model against adaptations of distributional kriging, which calls for investigations in the scoring of probabilistic forecasts of (fields of) densities.
We demonstrate the applicability of the proposed class of density-valued random field models under several kernel settings, and compare them to each other and to baseline methods using the investigated scoring approaches, on data from several examples including a contaminant source localization problem under random geology.
Joint work with David Ginsbourger and Guillaume Pirot.
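The logistic Gaussian construction underlying these density fields maps a latent GP path f to a density proportional to exp(f). A minimal sketch at a single spatial location, assuming an RBF covariance on a grid of the response space (the spatial indexing of the full field model is omitted here):

```python
import numpy as np

rng = np.random.default_rng(1)

# Grid over the response space and an RBF covariance for the latent GP.
t = np.linspace(0, 1, 100)
C = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * 0.1**2))
f = rng.multivariate_normal(np.zeros(len(t)), C + 1e-8 * np.eye(len(t)))

# Logistic (exponential-normalisation) transform: every latent path maps
# to a valid probability density on [0, 1], however multi-modal.
dt = t[1] - t[0]
w = np.exp(f - f.max())          # subtract the max for numerical stability
density = w / (w.sum() * dt)     # normalise by a Riemann-sum approximation
```

Because any continuous latent path yields a valid density, shape and multi-modality are free to vary with the GP, which is what makes the spatial extension suitable for the heterogeneous density fields described above.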