Scalable approximation of integrals using non-reversible methods: from Riemann to Lebesgue, and why you should care

Speaker: Alexandre Bouchard Côté

Date: Fri, Nov 15, 2019 to Sat, Nov 16, 2019

Location: PIMS, University of British Columbia

Conference: PIMS - UBC Mathematical Sciences Young Faculty Award

Subject: Mathematics, Statistics

Class: Scientific

Abstract:

How to approximate intractable integrals? This is an old problem which is still a pain point in many disciplines (including mine, Bayesian inference, but also statistical mechanics, computational chemistry, combinatorics, etc).

 

The vast majority of current work on this problem (HMC, SGLD, variational) is based on mimicking the field of optimization, in particular gradient based methods, and as a consequence focusses on Riemann integrals. This severely limits the applicability of these methods, making them inadequate to the wide range of problems requiring the full expressivity of Lebesgue integrals, for example integrals over phylogenetic tree spaces or other mixed combinatorial-continuous problems arising in networks models, record linkage and feature allocation.

 

I will describe novel perspectives on the problem of approximating Lebesgue integrals, coming from the nascent field of non-reversible Monte Carlo methods. In particular, I will present an adaptive, non-reversible Parallel Tempering (PT) allowing MCMC exploration of challenging problems such as single cell phylogenetic trees.

 

By analyzing the behaviour of PT algorithms using a novel asymptotic regime, a sharp divide emerges in the behaviour and performance of reversible versus non-reversible PT schemes: the performance of the former eventually collapses as the number of parallel cores used increases whereas non-reversible benefits from arbitrarily many available parallel cores. These theoretical results are exploited to develop an adaptive scheme approximating the optimal annealing schedule.

 

My group is also interested in making these advanced non-reversible Monte Carlo methods easily available to data scientists. To do so, we have designed a Bayesian modelling language to perform inference over arbitrary data types using non-reversible, highly parallel algorithms.