#### Towards the verification of semantic assumptions required by probabilistic inference algorithms

Participants:
Wonyeol Lee, Hangyeol Wu, Xavier Rival [contact], Hongseok Yang.

Probabilistic programming is the idea of writing models from statistics
and machine learning using program notations and reasoning about these
models using generic inference engines.
Recently, its combination with deep learning has been explored intensely,
which has led to the development of so-called deep probabilistic programming
languages, such as Pyro, Edward and ProbTorch.
At the core of this development lie inference engines based on stochastic
variational inference algorithms.
When asked to find information about the posterior distribution of a model
written in such a language, these algorithms convert this posterior-inference
query into an optimisation problem and solve it approximately by a form of
gradient ascent or descent.
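
For reference, the optimisation objective usually used in this setting is the evidence lower bound (ELBO). The notation below is an assumption introduced here for illustration: $p(x, z)$ is the joint density of the model over observed data $x$ and latent variables $z$, and $q_\theta(z)$ is the density of a parameterised approximating distribution (the guide).

$$
\mathrm{ELBO}(\theta)
= \mathbb{E}_{q_\theta(z)}\bigl[\log p(x, z) - \log q_\theta(z)\bigr]
= \log p(x) - \mathrm{KL}\bigl(q_\theta(z) \,\|\, p(z \mid x)\bigr)
$$

Since $\log p(x)$ does not depend on $\theta$, maximising the ELBO over $\theta$ amounts to minimising the KL divergence from the guide to the posterior, provided both quantities are well defined.
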
We analysed one of the most fundamental and versatile variational inference
algorithms, called the score estimator or REINFORCE, using tools from
denotational semantics and program analysis.
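
The following is a minimal sketch of the score estimator on a toy problem, not the formalisation developed in this work. It relies on the standard identity that, under suitable regularity conditions, the gradient of the objective $\mathbb{E}_{q_\theta(z)}[\log p(x, z) - \log q_\theta(z)]$ equals $\mathbb{E}_{q_\theta(z)}[(\log p(x, z) - \log q_\theta(z)) \, \nabla_\theta \log q_\theta(z)]$. The model, the guide, the observation `X_OBS` and all function names are hypothetical choices made for illustration.

```python
import math
import random

X_OBS = 2.0  # hypothetical observed data point (an assumption for this sketch)


def log_normal(value, mean, std):
    """Log-density of Normal(mean, std) evaluated at value."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((value - mean) / std) ** 2)


def log_model(z):
    """log p(x, z) for the toy model z ~ Normal(0, 1), x ~ Normal(z, 1)."""
    return log_normal(z, 0.0, 1.0) + log_normal(X_OBS, z, 1.0)


def log_guide(z, theta):
    """log q_theta(z) for the toy guide z ~ Normal(theta, 1)."""
    return log_normal(z, theta, 1.0)


def score(z, theta):
    """d/dtheta log q_theta(z); for a unit-variance normal this is z - theta."""
    return z - theta


def score_gradient(theta, num_samples, rng):
    """Monte Carlo estimate of the objective's gradient via the score estimator."""
    total = 0.0
    for _ in range(num_samples):
        z = rng.gauss(theta, 1.0)  # sample z from the guide
        total += (log_model(z) - log_guide(z, theta)) * score(z, theta)
    return total / num_samples


def fit(num_steps=2000, num_samples=50, learning_rate=0.02, seed=0):
    """Plain stochastic gradient ascent on theta, starting from 0."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(num_steps):
        theta += learning_rate * score_gradient(theta, num_samples, rng)
    return theta


theta_hat = fit()
print(f"estimated theta: {theta_hat:.3f}; optimum for this guide family: {X_OBS / 2:.3f}")
```

For this conjugate toy model the objective is maximised at `X_OBS / 2`, so the estimate should settle near that value up to Monte Carlo noise. The sketch silently assumes that the expectation exists and that differentiation and integration can be exchanged, which is the kind of implicit assumption discussed in this section.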
We formally expressed what this algorithm does on models denoted by
programs, and exposed implicit assumptions made by the algorithm on
the models.
The violation of these assumptions may lead to an undefined optimisation
objective or the loss of the convergence guarantee of the optimisation process.
We then described rules for proving these assumptions, which can be
automated by static program analyses.
Some of our rules use nontrivial facts from continuous mathematics, and
let us replace requirements about integrals in the assumptions, such as
integrability of functions defined in terms of programs’ denotations, by
conditions involving differentiation or boundedness, which are much easier
to prove automatically (and manually).
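
As a simple illustration of this kind of replacement (a textbook fact stated here for intuition, not necessarily one of the rules of this work), boundedness implies integrability with respect to a probability density. Here $f$ is a hypothetical measurable function, $q$ a probability density and $M$ a constant:

$$
|f(z)| \le M \ \text{for all } z
\quad\Longrightarrow\quad
\int q(z)\,|f(z)|\,\mathrm{d}z \;\le\; M \int q(z)\,\mathrm{d}z \;=\; M \;<\; \infty .
$$

The premise is a pointwise condition on $f$, which is typically far more amenable to automated checking than the integral on the right-hand side.
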
Following our general methodology, we have developed a static program
analysis for the Pyro programming language that aims at discharging the
assumption about what we call model-guide support match.
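
To give some intuition for why support matters, the sketch below uses a hypothetical toy model whose latent variable lives in $[0, 1]$ and compares two hand-written guides. It is a simplified illustration, not the static analysis described here and not Pyro code; the model, the guides and all names are assumptions.

```python
import math
import random

X_OBS = 0.3  # hypothetical observation (an assumption for this sketch)


def log_normal(value, mean, std):
    """Log-density of Normal(mean, std) evaluated at value."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((value - mean) / std) ** 2)


def log_model(z):
    """log p(x, z) for the toy model z ~ Uniform(0, 1), x ~ Normal(z, 1)."""
    if not 0.0 <= z <= 1.0:
        return -math.inf  # z lies outside the support of the model's prior
    return log_normal(X_OBS, z, 1.0)  # the Uniform(0, 1) log-density is 0


def elbo_estimate(sample_guide, log_guide, num_samples, rng):
    """Monte Carlo estimate of E_q[log p(x, z) - log q(z)]."""
    total = 0.0
    for _ in range(num_samples):
        z = sample_guide(rng)
        total += log_model(z) - log_guide(z)
    return total / num_samples


rng = random.Random(0)

# Guide whose support (the whole real line) is larger than the model's.
elbo_mismatch = elbo_estimate(
    lambda r: r.gauss(0.5, 1.0),
    lambda z: log_normal(z, 0.5, 1.0),
    1000,
    rng,
)

# Guide with the same support as the model's prior: z ~ Uniform(0, 1).
elbo_match = elbo_estimate(lambda r: r.random(), lambda z: 0.0, 1000, rng)

print("mismatched guide:", elbo_mismatch)
print("matching guide:  ", elbo_match)
```

With the normal guide, samples fall outside $[0, 1]$ with positive probability, so the estimate (and the objective it approximates) is negative infinity; with the uniform guide it stays finite.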
We applied our analysis to the eight representative model-guide pairs
from the Pyro webpage, which include sophisticated neural network models
such as AIR (Attend-Infer-Repeat).
It found a bug in one of these cases, revealed a non-standard use of
an inference engine in another, and showed that the assumptions are met
in the remaining six cases.