Abstract
Bayesian inference problems require sampling or approximating high-dimensional probability distributions. The focus of this paper is on the recently introduced Stein variational gradient descent methodology, a class of algorithms that rely on iterated steepest descent steps with respect to a reproducing kernel Hilbert space norm. This construction leads to interacting particle systems, the mean-field limit of which is a gradient flow on the space of probability distributions equipped with a certain geometrical structure. We leverage this viewpoint to shed some light on the convergence properties of the algorithm, in particular addressing the problem of choosing a suitable positive definite kernel function. Our analysis leads us to consider certain nondifferentiable kernels with adjusted tails. We demonstrate significant performance gains of these kernels in various numerical experiments.
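To make the construction described above concrete, the following is a minimal sketch of the standard Stein variational gradient descent update with a Gaussian (RBF) kernel; it illustrates the interacting particle system the abstract refers to, not the adjusted-tail kernels proposed in the paper. The function names, the fixed bandwidth `h`, and the Gaussian target in the usage example are illustrative assumptions, not part of the original work.

```python
import numpy as np

def rbf_kernel(X, h=1.0):
    """RBF kernel matrix and its gradient with respect to the first argument."""
    # Pairwise squared distances between particles (n, n).
    diffs = X[:, None, :] - X[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    K = np.exp(-sq_dists / (2.0 * h ** 2))
    # grad_K[j, i] = d/dx_j k(x_j, x_i); summed over j in the update.
    grad_K = -diffs / h ** 2 * K[:, :, None]
    return K, grad_K

def svgd_step(X, grad_log_p, step_size=1e-2, h=1.0):
    """One SVGD step: kernel-weighted score term plus repulsive kernel-gradient term."""
    n = X.shape[0]
    K, grad_K = rbf_kernel(X, h)
    phi = (K @ grad_log_p(X) + grad_K.sum(axis=0)) / n
    return X + step_size * phi

# Usage (hypothetical): transport particles toward a standard 2-D Gaussian target.
rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=0.5, size=(100, 2))  # initial particles
grad_log_p = lambda X: -X                          # score of N(0, I)
for _ in range(500):
    X = svgd_step(X, grad_log_p, step_size=0.05)
```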
| Original language | English |
| --- | --- |
| Pages (from-to) | 1–39 |
| Number of pages | 39 |
| Journal | Journal of Machine Learning Research |
| Volume | 24 |
| Issue number | 56 |
| Publication status | Published - 1 Jan 2023 |