Workshop on statistical challenges in astronomy ... · easy to create statistical models ......

23
(Last adjustments: December 6, 2017) Workshop on statistical challenges in astronomy –Hierarchical models in Stan Presenter Dr. John T. Ormerod School of Mathematics & Statistics F07 University of Sydney (w) 02 9351 5883 (e) john.ormerod (at) sydney.edu.au

Transcript of Workshop on statistical challenges in astronomy ... · easy to create statistical models ......

Page 1: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

(Last adjustments: December 6, 2017)

Workshop on statistical challenges in astronomy–Hierarchical models in Stan

Presenter

Dr. John T. Ormerod

School of Mathematics & Statistics F07

University of Sydney

(w) 02 9351 5883

(e) john.ormerod (at) sydney.edu.au

Page 2: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Why MCMC?

2 Do you have data? (x)

2 Do you want to build a rich statistical model? (p(x|θ))

2 Perhaps you want to incorporate prior information? (p(θ))

2 Are the integrals to get posterior densities are intractable?

p(θ|x) = p(x|θ)p(θ)p(x)

=p(x,θ)∫p(x,θ)dθ

2 Are point estimates not adequate? (If not Variational Bayes might be for you).

Workshop on statistical challenges in astronomy 2

Page 3: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Review: MCMC

2 Markov Chain Monte Carlo. The samples form a Markov Chain.

2 Markov property:

p(θt+1|θt, . . . , θ1) = p(θt+1|θt)

2 Invariant distribution:

πP = π

2 Detailed balance: sufficient condition:

P(θt+1,A) =∫Aq(θt+1, θt)dy

π(θt+1)q(θt+1, θt) = π(θt)q(θt, θt+1)

2 A Markov chain satisfying the detailed balance will converge in distribution to

π(θ).

2 We want to design Markov chains to mimic posterior distributions.

Workshop on statistical challenges in astronomy 3

Page 4: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Review: Random Walk Metropolis Hastings

2 Want samples from posterior distribution: p(θ|x) ∝ p(x,θ).

2 Algorithm: (Suppose θ ∈ Rd)

◦ Suppose we have x, p(x,θ) and θ0. Choose B ∈ Rd×d

◦ Loop t = 1, . . . , T .

∗ Sample θprop ∼ N(θt−1,B)

∗ With probability

α = min

[1,p(x,θprop)

p(x,θt−1)

]Set θt = θprop otherwise set θt = θt−1.

2 The above Markov chain can be shown (under mild conditions) to satisfy the

detailed balance condition and converges in distribution to p(θ|x).

2 In practice we can treat samples {θt}Tt=1 as independent samples from p(θ|x)for sufficiently large T .

Workshop on statistical challenges in astronomy 4

Page 5: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan: Hamiltonian Monte Carlo

2 In the previous example a multivariate normal distribution was used to generate

proposal samples.

2 Stan uses Hamiltonian dynamics (Hamiltonian Monte Carlo) to generate good

proposal samples and then uses a accept-reject step to ensure that the Markov

chain mimics the posterior distribution.

2 See HMC notes for details.

2 HMC samples are typically much higher quality and require less tuning than

other samplers.

Workshop on statistical challenges in astronomy 5

Page 6: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan: Hamiltonian Monte Carlo

Workshop on statistical challenges in astronomy 6

Page 7: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan: Where to get help

In anticipation that we will not get through everything today I will begin with a list

of resources on Stan

2 Homepage http://mc-stan.org/

2 User Guide

http://mc-stan.org/users/

2 Documentation, tutorials and case studies

http://mc-stan.org/users/documentation/index.html

2 Nice book

“Bayesian Models for Astrophysical Data Using R, JAGS, Python, and Stan” by

Joseph M. Hilbe, Rafael S. de Souza & Emille E. O. Ishida

https://www.bayesianmodelsforastrophysicaldata.com/

2 Code from above book

https://github.com/astrobayes/

Workshop on statistical challenges in astronomy 7

Page 8: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Motivation for Stan

2 Fit rich Bayesian statistical models.

2 The process

1. Create a statistical model.

2. Perform inference on the model.

3. Evaluate.

2 Difficulty with models of interest in existing tools.

Workshop on statistical challenges in astronomy 8

Page 9: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Motivation (cont.)

2 Usability

◦ general purpose, clear modelling language, integration

2 Scalability

◦ model complexity, number of parameters, data size

2 Efficiency

◦ high effective sample sizes, fast iterations, low memory

2 Robustness

◦ model structure (i.e. posterior geometry), numerical routines

Workshop on statistical challenges in astronomy 9

Page 10: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

What is Stan?

2 Statistical model specification language

◦ high level, probabilistic programming language

◦ user specifies statistical model

◦ easy to create statistical models

2 4 cross-platform users interfaces

◦ CmdStan - command line

◦ RStan - R integration

◦ PyStan - Python integration

◦ MStan - Matlab integration (user contributed)

Workshop on statistical challenges in astronomy 10

Page 11: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Inference

2 Hamiltonian Monte Carlo (HMC)

◦ Sample parameters on unconstrained space (transform & Jacobian adjust-

ment)

◦ Gradients of the model wrt parameters (automatic differentiation).

◦ Sensitive to tuning parameters (No-U-Turn sampler).

2 No-U-Turn Sampler (NUTS)

◦ warmup: estimates mass matrix and step size

◦ sampling: adapts number of steps

◦ maintains detailed balance

2 Optimization

◦ BFGS, Newtons method

2 Variational inference.

Workshop on statistical challenges in astronomy 11

Page 12: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan to Scientists

2 Flexible probabilistic language, language still growing

2 Focus on science: the modelling and assumptions

◦ access to multiple algorithms (default is pretty good)

◦ faster and less error prone than implementing from scratch

◦ efficient implementation

2 Lots of (free) modelling help on users list

2 Responsive developers, continued support for Stan

2 Not just for inference

◦ fast forward sampling; lots of distributions

◦ gradients for arbitrary functions

Workshop on statistical challenges in astronomy 12

Page 13: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

The Stan Language

Data Types

2 basic:

real, int, vector, row_vector, matrix

2 constrained:

simplex, unit_vector, ordered, positive_ordered,

corr_matrix, cov_matrix

2 arrays

Workshop on statistical challenges in astronomy 13

Page 14: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

The Stan Language

Bounded variables

2 applies to int, real, and matrix types

2 lower example:

real<lower=0> sigma;

2 upper example:

real<upper=100> x;

Workshop on statistical challenges in astronomy 14

Page 15: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

The Stan Language

Program Blocks

2 data (optional)

2 transformed data (optional)

2 parameters (optional)

2 transformed parameters (optional)

2 model

2 generated quantities (optional)

Workshop on statistical challenges in astronomy 15

Page 16: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan Example: basic structure – Linear regression

data {

int<lower=0> N;

vector[N] y;

vector[N] x;

}

parameters {

real alpha;

real beta;

real<lower=0> sigma;

}

model {

alpha ~ normal(0,10);

beta ~ normal(0,10);

sigma ~ cauchy(0,5);

for (n in 1:N)

y[n] ~ normal(alpha + beta * x[n], sigma);

}

Workshop on statistical challenges in astronomy 16

Page 17: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan Example: vectorisation – Linear regression

data {

int<lower=0> N;

vector[N] y;

vector[N] x;

}

parameters {

real alpha;

real beta;

real<lower=0> sigma;

}

model {

alpha ~ normal(0,10);

beta ~ normal(0,10);

sigma ~ cauchy(0,5);

y ~ normal(alpha + beta*x, sigma);

}

Workshop on statistical challenges in astronomy 17

Page 18: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan Example: Eight Schools: hierarchical example

2 Educational Testing Service study to analyze effect of coaching

2 SAT-V in eight high schools

2 No prior reason to believe any program was:

◦ more effective than the others

◦ more similar to others

[see Rubin, 1981; Gelman et al., Bayesian Data Analysis, 2003]

Workshop on statistical challenges in astronomy 18

Page 19: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan Example: Eight Schools: hierarchical example

Estimated Standard Error of

School Treatment Effect Treatment Effect

A 28 15

B 8 10

C −3 16

D 7 11

E −1 9

F 1 11

G 18 10

H 12 18

Workshop on statistical challenges in astronomy 19

Page 20: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan Example: Eight Schools: Model 0

Set up block stuctures so that data can be read.

data {

int<lower=0> J; // # of schools

real y[J]; // estimated treatment

real<lower=0> sigma[J]; // std err of effect

}

parameters {

real<lower=0, upper=1> theta;

}

model {

}

Workshop on statistical challenges in astronomy 20

Page 21: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan Example: Eight Schools: Model 1 – No pooling

Each school treated independently.

data {

int<lower=0> J; // # of schools

real y[J]; // estimated treatment

real<lower=0> sigma[J]; // std err of effect

}

parameters {

real theta[J]; // school effect

}

model {

y ~ normal(theta, sigma);

}

Workshop on statistical challenges in astronomy 21

Page 22: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan Example: Eight Schools: Model 2 – Complete pooling

All schools lumped together.

data {

int<lower=0> J; // # of schools

real y[J]; // estimated treatment

real<lower=0> sigma[J]; // std err of effect

}

parameters {

real theta; // pooled school effect

}

model {

y ~ normal(theta, sigma);

}

Workshop on statistical challenges in astronomy 22

Page 23: Workshop on statistical challenges in astronomy ... · easy to create statistical models ... (No-U-Turn sampler). ... estimates mass matrix and step size sampling: adapts number of

Stan Example: Eight Schools: Model 3 – Hierarchical Model

Estimate hyperparameters µ and σ2

data {

int<lower=0> J; // # of schools

real y[J]; // estimated treatment

real<lower=0> sigma[J]; // std err of effect

}

parameters {

real theta[J]; // school effect

real mu; // mean for schools

real<lower=0> tau; // variance between schools

}

model {

theta ~ normal(mu, tau);

y ~ normal(theta, sigma);

}

Workshop on statistical challenges in astronomy 23