Advances in Probabilistic Programming with Python 2017 Danish Bioinformatics Conference Christopher Fonnesbeck Department of Biostatistics Vanderbilt University Medical Center

Probabilistic Programming

Stochastic language "primitives"

Distribution over values:

    X ~ Normal(μ, σ)
    x = X.random(n=100)

Distribution over functions:

    Y ~ GaussianProcess(mean_func(x), cov_func(x))
    y = Y.predict(x2)

Conditioning:

    p ~ Beta(1, 1)
    z ~ Bernoulli(p)  # z|p
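The notation on this slide is schematic rather than runnable. A minimal sketch of two of the primitives (drawing values from a distribution, and conditioning one variable on another) using plain NumPy might look like the following. The seed, the values of `mu` and `sigma`, and the sample sizes are illustrative assumptions, not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the sketch is reproducible

# Distribution over values: X ~ Normal(mu, sigma), then draw 100 values.
mu, sigma = 0.0, 1.0  # illustrative parameter values
x = rng.normal(mu, sigma, size=100)

# Conditioning: p ~ Beta(1, 1), then z | p ~ Bernoulli(p).
p = rng.beta(1.0, 1.0)
z = rng.binomial(n=1, p=p, size=100)  # Bernoulli draws given the sampled p

print(x.shape, round(float(p), 3), z.mean())
```

Because `z` is drawn after `p`, its distribution depends on the sampled value of `p`, which is the dependence the `z|p` comment on the slide refers to.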

Bayesian Inference

Inverse Probability

Why Bayes?

❝The Bayesian approach is attractive because it is useful. Its usefulness derives in large measure from its simplicity. Its simplicity allows the investigation of far more complex models than can be handled by the tools in the classical toolbox.❞

—Link and Barker 2010

Probabilistic Programming in three easy steps

1. Encode a Probability Model⚐

⚐ in Python

Stochastic program

Joint distribution of latent variables and data

Prior distribution

Quantifies the uncertainty in latent variables

Likelihood function

Conditions our model on the observed data

Models the distribution of hits observed from at-bats.

Counts per unit time
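The distribution names on these two slides did not survive extraction. Assuming they were the usual choices, a binomial likelihood for hits out of at-bats and a Poisson likelihood for counts per unit time, the two probability mass functions can be sketched with the standard library alone. The function names and the example numbers are hypothetical.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of k successes (hits) in n trials (at-bats) with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, rate):
    """Probability of observing k events in one unit of time at the given rate."""
    return rate**k * exp(-rate) / factorial(k)

# Hypothetical data: 3 hits in 10 at-bats, and 2 events in one time unit.
print(binomial_pmf(3, 10, 0.3))
print(poisson_pmf(2, 4.0))
```

Viewed as a function of `p` or `rate` with the data held fixed, each of these is a likelihood function in the sense of the preceding slides.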

2. Infer Values for latent variables

Posterior distribution
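As a rough illustration of what a posterior is, the sketch below multiplies a prior by a likelihood on a grid and normalizes the result. The data (146 hits in 400 at-bats) are hypothetical and were chosen only so that a flat Beta(1, 1) prior gives the analytic posterior Beta(147, 255). Grid approximation is used here for transparency and is not the method the talk goes on to describe.

```python
import numpy as np

hits, at_bats = 146, 400  # hypothetical data, not from the talk

theta = np.linspace(0.0005, 0.9995, 1000)  # grid over the success probability
width = theta[1] - theta[0]

prior = np.ones_like(theta)  # flat Beta(1, 1) prior
log_lik = hits * np.log(theta) + (at_bats - hits) * np.log(1 - theta)

# Posterior is proportional to prior times likelihood; subtract the max for stability.
unnormalized = prior * np.exp(log_lik - log_lik.max())
posterior = unnormalized / (unnormalized.sum() * width)

grid_mean = (theta * posterior).sum() * width
analytic_mean = (1 + hits) / (2 + at_bats)  # mean of Beta(1 + hits, 1 + misses)
print(grid_mean, analytic_mean)
```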

Probabilistic programming abstracts the inference procedure

3. Check your Model

Model checking

WinBUGS

    model {
      for (j in 1:J){
        y[j] ~ dnorm (theta[j], tau.y[j])
        theta[j] ~ dnorm (mu.theta, tau.theta)
        tau.y[j] <- pow(sigma.y[j], -2)
      }
      mu.theta ~ dnorm (0.0, 1.0E-6)
      tau.theta <- pow(sigma.theta, -2)
      sigma.theta ~ dunif (0, 1000)
    }

PyMC3

☞ started in 2003

☞ PP framework for fitting arbitrary probability models

☞ based on Theano

☞ implements "next generation" Bayesian inference methods

☞ NumFOCUS sponsored project

github.com/pymc-devs/pymc3

Salvatier, Wiecki and Fonnesbeck (2016)

Calculating Gradients in Theano

    >>> from theano import function, tensor as tt
    >>> x = tt.dmatrix('x')
    >>> s = tt.sum(1 / (1 + tt.exp(-x)))
    >>> gs = tt.grad(s, x)
    >>> dlogistic = function([x], gs)
    >>> dlogistic([[3, -1],[0, 2]])
    array([[ 0.04517666,  0.19661193],
           [ 0.25      ,  0.10499359]])

Theano graph
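The gradient values in the Theano session can be cross-checked without Theano. The derivative of the logistic function s(x) = 1 / (1 + exp(-x)) is s(x)(1 - s(x)), so a hand-written NumPy version should reproduce the same array. The function names below are hypothetical, and the sketch only verifies the numbers. It does not show how Theano derives the gradient symbolically.

```python
import numpy as np

def logistic(x):
    """Elementwise logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def dlogistic_numpy(x):
    """Analytic derivative of the logistic function, s * (1 - s), elementwise."""
    s = logistic(np.asarray(x, dtype=float))
    return s * (1.0 - s)

result = dlogistic_numpy([[3, -1], [0, 2]])
print(result)
```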

Example: Radon exposure

Gelman et al. (2013) Bayesian Data Analysis

Unpooled model

Model radon in each county independently:

y_i = α_{j[i]} + β x_i + ε_i

where j indexes the counties.

Priors

    with Model() as unpooled_model:

        α = Normal('α', 0, sd=1e5, shape=counties)
        β = Normal('β', 0, sd=1e5)
        σ = HalfCauchy('σ', 5)

    >>> type(β)
    pymc3.model.FreeRV
    >>> β.distribution.logp(-2.1).eval()
    array(-12.4318639983954)
    >>> β.random(size=4)
    array([ -10292.91760326,   22368.53416626,  124851.2516102 ,   44143.62513182])

Transformed variables

    with unpooled_model:

        θ = α[county] + β*floor
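The expression `α[county]` relies on integer-array indexing: `county` holds one county index per observation, so indexing the per-county intercepts with it yields one intercept per observation. A minimal NumPy sketch of the same pattern follows, with made-up numbers standing in for the random variables and the radon data.

```python
import numpy as np

# Hypothetical values standing in for the model's random variables and data.
alpha = np.array([1.0, 1.5, 0.8])    # one intercept per county (3 counties)
beta = -0.6                          # common floor effect
county = np.array([0, 0, 1, 2, 2])   # county index of each observation
floor = np.array([0, 1, 0, 0, 1])    # floor indicator of each observation

# alpha[county] expands to one intercept per observation before adding the floor term.
theta = alpha[county] + beta * floor
print(theta)
```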

Likelihood

    with unpooled_model:

        y = Normal('y', θ, sd=σ, observed=log_radon)

Model graph

Calculating Posteriors

Bayesian approximation

☞ Maximum a posteriori (MAP) estimate

☞ Laplace (normal) approximation

☞ Rejection sampling

☞ Importance sampling

☞ Sampling importance resampling (SIR)

☞ Approximate Bayesian Computing (ABC)

MCMC

Markov chain Monte Carlo simulates a Markov chain for which some function of interest is the unique, invariant, limiting distribution.

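This is guaranteed when the Markov chain is constructed to satisfy the detailed balance equation:

π(x) Pr(y | x) = π(y) Pr(x | y)

where π is the target distribution and Pr(y | x) is the probability of moving from state x to state y.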

Metropolis sampling

Metropolis sampling**

** 2000 iterations, 1000 tuning
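The sampler behind these slides is not shown in the transcript. A minimal random-walk Metropolis sketch is given below, under the assumptions of a one-dimensional target, a symmetric Gaussian proposal, and a standard normal target chosen purely for illustration. Unlike the tuning phase mentioned in the footnote, this sketch does not adapt the step size. It simply discards the first 1000 draws as warm-up.

```python
import numpy as np

def metropolis(log_target, start, n_iter=2000, step=2.0, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    samples = np.empty(n_iter)
    current = start
    current_logp = log_target(current)
    for i in range(n_iter):
        proposal = current + step * rng.standard_normal()
        proposal_logp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if np.log(rng.uniform()) < proposal_logp - current_logp:
            current, current_logp = proposal, proposal_logp
        samples[i] = current
    return samples

def log_std_normal(x):
    """Log density of a standard normal, up to an additive constant."""
    return -0.5 * x**2

draws = metropolis(log_std_normal, start=3.0, n_iter=20000)
kept = draws[1000:]  # drop warm-up draws
print(kept.mean(), kept.std())
```

Because the proposal is symmetric, the acceptance rule above satisfies detailed balance with respect to the target, which is why only the ratio of target densities appears.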

Hamiltonian Monte Carlo

Uses a physical analogy of a frictionless particle moving on a hyper-surface

Requires an auxiliary variable to be specified

☞ position (unknown variable value)

☞ momentum (auxiliary)

Hamiltonian Dynamics

Hamiltonian MC

➀ Sample a new velocity from univariate Gaussian

➁ Perform n leapfrog steps to obtain new state

➂ Perform accept/reject move of
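The three steps above can be sketched in NumPy as follows. This is an illustrative implementation under simplifying assumptions (unit mass matrix, fixed step size and trajectory length, a two-dimensional standard normal target), not the sampler used in PyMC3. All function names and parameter values are hypothetical.

```python
import numpy as np

def leapfrog(q, p, grad_log_p, step_size, n_steps):
    """Simulate Hamiltonian dynamics for n_steps leapfrog steps."""
    q, p = q.copy(), p.copy()
    p += 0.5 * step_size * grad_log_p(q)        # initial half step for momentum
    for step in range(n_steps):
        q += step_size * p                       # full step for position
        if step < n_steps - 1:
            p += step_size * grad_log_p(q)       # full step for momentum
    p += 0.5 * step_size * grad_log_p(q)        # final half step for momentum
    return q, p

def hmc(log_p, grad_log_p, start, n_iter=2000, step_size=0.2, n_steps=10, seed=0):
    rng = np.random.default_rng(seed)
    q = np.asarray(start, dtype=float)
    samples = np.empty((n_iter, q.size))
    for i in range(n_iter):
        p = rng.standard_normal(q.size)                               # step 1: new momentum
        q_new, p_new = leapfrog(q, p, grad_log_p, step_size, n_steps)  # step 2: leapfrog
        h_old = -log_p(q) + 0.5 * p @ p                               # Hamiltonian before
        h_new = -log_p(q_new) + 0.5 * p_new @ p_new                   # Hamiltonian after
        if np.log(rng.uniform()) < h_old - h_new:                     # step 3: accept/reject
            q = q_new
        samples[i] = q
    return samples

def log_p(q):
    """Log density of a standard normal target, up to a constant."""
    return -0.5 * q @ q

def grad_log_p(q):
    return -q

samples = hmc(log_p, grad_log_p, start=[3.0, -3.0])
kept = samples[1000:]  # discard the first 1000 draws as warm-up
print(kept.mean(axis=0), kept.std(axis=0))
```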

Hamiltonian MC**

** 2000 iterations, 1000 tuning

No U-Turn Sampler (NUTS)

Hoffman and Gelman (2014)

Non-hierarchical models

Hierarchical model

Partial pooling model

    with Model() as partial_pooling:

        # Priors
        mu_a = Normal('mu_a', mu=0., sd=1e5)
        sigma_a = HalfCauchy('sigma_a', 5)

        # Random intercepts
        a = Normal('a', mu=mu_a, sd=sigma_a, shape=counties)

        # Model error
        sigma_y = HalfCauchy('sigma_y',5)

        # Expected value
        y_hat = a[county]

        # Data likelihood
        y_like = Normal('y_like', mu=y_hat, sd=sigma_y, observed=log_radon)
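To see what partial pooling does to the county intercepts, it helps to hold the hyperparameters fixed. Conditional on given values of `mu_a`, `sigma_a` and `sigma_y`, normal-normal conjugacy makes the posterior mean of each intercept a precision-weighted average of the county's sample mean and `mu_a`. The sketch below evaluates that average for two hypothetical counties. The numbers are made up, and in the full model the hyperparameters are themselves estimated rather than fixed.

```python
import numpy as np

def partial_pooling_mean(county_mean, n_obs, mu_a, sigma_a, sigma_y):
    """Conditional posterior mean of each county intercept for fixed hyperparameters."""
    data_precision = n_obs / sigma_y**2    # information from the county's own data
    prior_precision = 1.0 / sigma_a**2     # information from the population of counties
    return (data_precision * county_mean + prior_precision * mu_a) / (
        data_precision + prior_precision
    )

# Two hypothetical counties with the same sample mean but different sample sizes.
county_mean = np.array([2.0, 2.0])
n_obs = np.array([2, 50])
estimates = partial_pooling_mean(county_mean, n_obs, mu_a=1.3, sigma_a=0.3, sigma_y=0.8)
print(estimates)
```

The county with only two observations is pulled further toward `mu_a` than the county with fifty, which is the shrinkage behavior that distinguishes partial pooling from the unpooled model.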

Variational Inference

Variational inference minimizes the Kullback-Leibler divergence KL(q(θ) ‖ p(θ | y)) from an approximating distribution q(θ) to the true posterior p(θ | y). Because we can't calculate the true posterior distribution, this divergence cannot be minimized directly.

Evidence Lower Bound (ELBO)

ADVI*

* Kucukelbir, A., Tran, D., Ranganath, R., Gelman, A., & Blei, D. M. (2016, March 2). Automatic Differentiation Variational Inference. arXiv.org.

Maximizing the ELBO

Estimating Beta(147, 255) posterior
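The figure for this slide is not in the transcript. As a rough sketch of the idea behind ADVI, the code below fits a Gaussian to a Beta(147, 255) target on the logit scale by stochastic gradient ascent on the ELBO. It is not PyMC3's implementation: the gradient of the log density is derived by hand rather than by automatic differentiation, and the learning rate, iteration count and antithetic noise draws are assumptions chosen to keep the example small and stable.

```python
import numpy as np

a, b = 147.0, 255.0  # parameters of the Beta target named on the slide

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_log_target(z):
    # With z = logit(theta), the log density of z is, up to a constant,
    # a * log(sigmoid(z)) + b * log(1 - sigmoid(z)); this is its derivative.
    return a - (a + b) * sigmoid(z)

rng = np.random.default_rng(0)
half = rng.standard_normal(2000)
eps = np.concatenate([half, -half])  # antithetic standard normal draws

m, omega = 0.0, 0.0  # mean and log standard deviation of q(z) = Normal(m, exp(omega)^2)
learning_rate = 0.005
for _ in range(2000):
    s = np.exp(omega)
    g = grad_log_target(m + s * eps)  # reparameterization: z = m + s * eps
    m += learning_rate * g.mean()                            # d ELBO / d m
    omega += learning_rate * ((g * eps * s).mean() + 1.0)    # d ELBO / d omega (+1 from entropy)

print(sigmoid(m), np.exp(omega))
```

Mapping `m` back through the logistic function gives a value close to the Beta mean 147 / 402, which is a quick sanity check that the approximation has settled in the right place.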

    with partial_pooling:

        approx = fit(n=100000)

    Average Loss = 1,115.5: 100%|██████████| 100000/100000 [00:13<00:00, 7690.51it/s]
    Finished [100%]: Average Loss = 1,115.5

>>> approx

<pymc3.variational.approximations.MeanField at 0x119aa7c18>

    with partial_pooling:

        approx_sample = approx.sample(1000)

    traceplot(approx_sample, varnames=['mu_a', 'sigma_a'])

Normalizing flows@

@ Rezende & Mohamed 2016

    with my_model:
        nf = NFVI('planar*16', jitter=1.)

    nf.fit(25000, obj_optimizer=pm.adam(learning_rate=0.01))
    trace = nf.approx.sample(5000)

Minibatch ADVI

    minibatch_x = pm.Minibatch(X_train, batch_size=50)
    minibatch_y = pm.Minibatch(Y_train, batch_size=50)
    model = construct_model(minibatch_x, minibatch_y)

    with model:
        approx = pm.fit(40000)

Gaussian processes

A latent, non-linear function is modeled as being multivariate normally distributed (a Gaussian Process):

☞ mean function,

☞ covariance function,
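Since a Gaussian process evaluated at a finite set of inputs is just a multivariate normal, prior draws can be sketched directly in NumPy. The example assumes a zero mean function and writes out two covariance functions by hand: an exponentiated quadratic kernel and a Matern(3/2) kernel. The function names, lengthscales and jitter value are illustrative assumptions and are not taken from the PyMC3 API.

```python
import numpy as np

def quadratic_cov(x1, x2, lengthscale=1.0):
    """Exponentiated quadratic covariance between two sets of 1-D inputs."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def matern32_cov(x1, x2, lengthscale=1.0):
    """Matern(3/2) covariance between two sets of 1-D inputs."""
    r = np.abs(x1[:, None] - x2[None, :]) / lengthscale
    return (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)

def sample_gp_prior(x, cov_func, n_draws=3, jitter=1e-6, seed=0):
    """Draw functions from a zero-mean GP prior evaluated at the points x."""
    rng = np.random.default_rng(seed)
    K = cov_func(x, x) + jitter * np.eye(len(x))  # jitter keeps K numerically positive definite
    L = np.linalg.cholesky(K)
    return (L @ rng.standard_normal((len(x), n_draws))).T

x = np.linspace(0, 10, 50)
smooth_draws = sample_gp_prior(x, quadratic_cov)
rough_draws = sample_gp_prior(x, matern32_cov)
print(smooth_draws.shape, rough_draws.shape)
```

Plotting the two sets of draws side by side shows how the choice of covariance function controls the smoothness of the functions the prior favors.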

Quadratic

Matern(3/2)

Cosine

Gibbs

Marginal GP

    with pm.Model() as model:
        ℓ = pm.Gamma("ℓ", alpha=2, beta=1)
        η = pm.HalfCauchy("η", beta=5)

        cov = η**2 * pm.gp.cov.Matern52(1, ℓ)
        gp = pm.gp.Marginal(cov_func=cov)

        σ = pm.HalfCauchy("σ", beta=5)
        y_ = gp.marginal_likelihood("y", n_points=n, X=X, y=y, noise=σ)

Posterior predictive samples

    n_new = 600
    X_new = np.linspace(0, 20, n_new)[:,None]

    with model:
        f_pred = gp.conditional("f_pred", n_new, X_new)
        pred_samples = pm.sample_ppc([mp], vars=[f_pred], samples=2000)

Example: Psychosocial Determinants of Weight

☞ Modeling patient BMI trajectories with GP

☞ Clustering using dynamic time warping

☞ Covariate model to predict cluster membership

Bayesian Machine Learning

Convolutional variational autoencoder+

+ Photo: K. Frans

Convolutional variational autoencoder

    import keras

    class Decoder:

        ...

        def decode(self, zs):
            keras.backend.theano_backend._LEARNING_PHASE.set_value(np.uint8(0))
            return self._get_dec_func()(zs)

Taku Yoshioka (c) 2016

    with pm.Model() as model:
        # Hidden variables
        zs = pm.Normal('zs', mu=0, sd=1, shape=(minibatch_size, dim_hidden), dtype='float32')

        # Decoder and its parameters
        dec = Decoder(zs, net=cnn_dec)

        # Observation model
        xs_ = pm.Normal('xs_', mu=dec.out.ravel(), sd=0.1, observed=xs_t.ravel(), dtype='float32')

    x_minibatch = pm.Minibatch(data, minibatch_size)

    with model:
        approx = pm.fit(
            15000,
            local_rv=local_RVs,
            more_obj_params=enc.params + dec.params,
            obj_optimizer=pm.rmsprop(learning_rate=0.001),
            more_replacements={xs_t:x_minibatch},
        )

Bayesian Deep Learning in PyMC3

Thomas Wiecki 2016

The Future

☞ Discontinuous HMC

☞ Riemannian Manifold HMC

☞ Stochastic Gradient Fisher Scoring

☞ ODE solvers

Jupyter Notebook Gallery

bit.ly/pymc3nb

Other Probabilistic Programming Tools⚐

☞ Edward

☞ GPy/GPFlow

☞ PyStan

☞ emcee

☞ BayesPy

⚐ in Python

The PyMC3 Team

☞ Colin Carroll

☞ Peadar Coyle

☞ Bill Engels

☞ Maxim Kochurov

☞ Junpeng Lao

☞ Osvaldo Martin

☞ Kyle Meyer

☞ Austin Rochford

☞ John Salvatier

☞ Adrian Seyboldt

☞ Hannes Vasyura-Bathke

☞ Thomas Wiecki

☞ Taku Yoshioka