Applied Bayesian Data Analysis

Roy Levy, Ph.D.

Upcoming Seminar: February 18-20, 2021, Remote Seminar

Regression

Multiple Regression

Multiple Regression Model: J Predictors

Multiple xs, y for each of n subjects

• y = (y1, y2, y3,…, yn)

• x = (x1, x2, x3,…, xn)

• xi = (xi1, xi2,…, xiJ)

yi = β0 + β1xi1 + … + βJxiJ + εi,   εi independent, εi ~ N(0, σε²)

yi | xi, β0, β1,…, βJ, σε² ~ N(β0 + β1xi1 + … + βJxiJ, σε²)

In vector form, with β = (β1,…, βJ):

yi = β0 + β′xi + εi,   εi independent, εi ~ N(0, σε²)

yi | xi, β0, β, σε² ~ N(β0 + β′xi, σε²)
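The data-generating model above can be sketched in base R. This is only an illustration: the sample size, number of predictors, coefficient values, and error standard deviation are assumed values chosen for the sketch, not quantities from the seminar.

```r
# Simulate outcomes from y_i = beta0 + beta' x_i + eps_i, eps_i ~ N(0, sigma_eps^2).
# All numeric values below are illustrative assumptions.
set.seed(1)
n <- 50                                   # number of subjects
J <- 2                                    # number of predictors
x <- matrix(rnorm(n * J), nrow = n, ncol = J)  # row i holds x_i = (x_i1, ..., x_iJ)
beta0 <- 1                                # intercept
beta <- c(0.5, -0.25)                     # slopes beta_1, ..., beta_J
sigma_eps <- 2                            # error standard deviation

mu <- as.vector(beta0 + x %*% beta)       # conditional mean beta0 + beta' x_i
y <- rnorm(n, mean = mu, sd = sigma_eps)  # independent normal draws around mu
```

Note that `rnorm()` takes a standard deviation, so σε (not σε²) is passed as `sd`.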

Multiple Regression: Bayesian Analysis

Posterior Distribution

p(β0, β, σε | y, x) ∝ p(y | β0, β, σε, x) p(β0, β, σε)

Conditional Probability of the Data

p(β0, β, σε | y, x) ∝ p(y | β0, β, σε, x) p(β0, β, σε)

Assuming exchangeability of subjects

p(y | β0, β, σε, x) = Πi p(yi | β0, β, σε, xi)

Assuming conditional normality

yi | β0, β, σε, xi ~ N(β0 + β1xi1 + … + βJxiJ, σε²)
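Under exchangeability and conditional normality, the log of the conditional probability of the data is a sum of normal log densities. A minimal base R sketch follows; the function name `log_lik` and the toy data are hypothetical, introduced only for illustration.

```r
# Log conditional probability of the data:
# log p(y | beta0, beta, sigma_eps, x) = sum_i log N(y_i | beta0 + beta' x_i, sigma_eps^2)
log_lik <- function(beta0, beta, sigma_eps, y, x) {
  mu <- as.vector(beta0 + x %*% beta)                   # conditional means
  sum(dnorm(y, mean = mu, sd = sigma_eps, log = TRUE))  # sum of log densities
}

# Toy data (assumed values): 3 subjects, 2 predictors
x <- cbind(c(0, 1, 2), c(1, 0, 1))
y <- c(1, 2, 4)
ll <- log_lik(beta0 = 0.5, beta = c(1, 0.5), sigma_eps = 1, y = y, x = x)
ll
```

Working on the log scale avoids numerical underflow when the product runs over many subjects.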

Prior Distribution

p(β0, β, σε) = p(β0) p(β) p(σε)

p(β0) = N(μβ0, σ²β0)

σε ~ Exp(λ)

Multivariate prior for β? Assuming the slopes are independent a priori:

p(β) = p(β1) p(β2) ⋯ p(βJ)

     = N(μβ1, σ²β1) N(μβ2, σ²β2) ⋯ N(μβJ, σ²βJ)

Assuming exchangeability:

p(βj) = N(μβ, σ²β)   j = 1,…, J
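One way to get a feel for a prior specification of this form is to draw parameter values from it. The base R sketch below assumes illustrative hyperparameter values (normal priors with mean 0 and standard deviation 30, an exponential prior with rate 1) and J = 2; the variable names are hypothetical.

```r
# Draw from independent priors: beta0 ~ N, beta_j ~ N (exchangeable), sigma_eps ~ Exp.
# Hyperparameter values are illustrative assumptions.
set.seed(2)
n_draws <- 5000
J <- 2
mu_beta0 <- 0; sd_beta0 <- 30   # prior mean and SD for the intercept
mu_beta  <- 0; sd_beta  <- 30   # common prior mean and SD for every slope
rate <- 1                       # rate of the exponential prior on sigma_eps

beta0 <- rnorm(n_draws, mu_beta0, sd_beta0)
beta <- matrix(rnorm(n_draws * J, mu_beta, sd_beta), nrow = n_draws, ncol = J)
sigma_eps <- rexp(n_draws, rate = rate)

# Summaries of the prior draws
c(mean = mean(beta0), sd = sd(beta0))
colMeans(beta)
c(mean = mean(sigma_eps), sd = sd(sigma_eps))
```

Because the slopes are exchangeable, every column of `beta` is drawn from the same distribution.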

Tying It All Together: Complete Model and Posterior Distribution

p(β0, β, σε | y, x) ∝ p(y | β0, β, σε, x) p(β0) p(β) p(σε)

p(y | β0, β, σε, x) = Πi N(yi | β0 + β1xi1 + … + βJxiJ, σε²)

β0 ~ N(μβ0, σ²β0)

βj ~ N(μβ, σ²β)   j = 1,…, J

σε ~ Exp(λ)
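The complete model can be sketched as an unnormalized log posterior: the log conditional probability of the data plus the log prior densities. This is a base R illustration, not the sampler used in the seminar; the function name, default hyperparameter values, and simulated data are assumptions.

```r
# Unnormalized log posterior: log p(y | beta0, beta, sigma_eps, x) + log prior.
# Default hyperparameters are illustrative assumptions.
log_posterior <- function(beta0, beta, sigma_eps, y, x,
                          mu_beta0 = 0, sd_beta0 = 30,
                          mu_beta = 0, sd_beta = 30, rate = 1) {
  if (sigma_eps <= 0) return(-Inf)   # outside the support of the Exp prior
  mu <- as.vector(beta0 + x %*% beta)
  log_lik <- sum(dnorm(y, mean = mu, sd = sigma_eps, log = TRUE))
  log_prior <- dnorm(beta0, mu_beta0, sd_beta0, log = TRUE) +
    sum(dnorm(beta, mu_beta, sd_beta, log = TRUE)) +
    dexp(sigma_eps, rate = rate, log = TRUE)
  log_lik + log_prior
}

# Simulated data (assumed values): 20 subjects, 2 predictors
set.seed(4)
x <- matrix(rnorm(40), nrow = 20, ncol = 2)
y <- as.vector(1 + x %*% c(0.5, -0.5) + rnorm(20))

near <- log_posterior(1, c(0.5, -0.5), 1, y, x)  # at the generating values
far  <- log_posterior(10, c(5, 5), 1, y, x)      # far from the generating values
c(near = near, far = far)
```

MCMC software such as Stan works with a quantity of this kind; the normalizing constant is not needed to sample from the posterior.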

Example

• End-of-chapter test scores, from summing dichotomously scored item responses from 50 subjects

• Regress Chapter 3 on Chapter 1 and Chapter 2

Test        # Items   Range   Mean    Standard Deviation
Chapter 1   16        4-16    14.10   2.02
Chapter 2   18        3-18    14.34   3.29
Chapter 3   15        1-15    12.22   2.96

Correlations
            Chapter 1   Chapter 2
Chapter 2   0.58
Chapter 3   0.69        0.68
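The summary statistics above are enough to approximate the least-squares regression of Chapter 3 on Chapter 1 and Chapter 2. The base R sketch below uses the standard identities for standardized coefficients; the variable names are hypothetical, and because the inputs are rounded to two decimals the results are approximate.

```r
# Least-squares estimates recovered from means, SDs, and correlations.
mean_x <- c(14.10, 14.34)   # Chapter 1, Chapter 2 means
sd_x   <- c(2.02, 3.29)     # Chapter 1, Chapter 2 standard deviations
mean_y <- 12.22             # Chapter 3 mean
sd_y   <- 2.96              # Chapter 3 standard deviation

R_xx <- matrix(c(1, 0.58,
                 0.58, 1), nrow = 2, byrow = TRUE)  # predictor correlations
r_xy <- c(0.69, 0.68)       # correlations of each predictor with Chapter 3

beta_std  <- solve(R_xx, r_xy)              # standardized slopes
slopes    <- beta_std * sd_y / sd_x         # slopes on the raw score scale
intercept <- mean_y - sum(slopes * mean_x)  # fitted line passes through the means
r_squared <- sum(beta_std * r_xy)           # proportion of variance explained

round(c(intercept = intercept, b1 = slopes[1], b2 = slopes[2], R2 = r_squared), 2)
```

This gives slopes of about 0.65 and 0.38, an intercept of about -2.42, and an R² of about 0.59. Small discrepancies from an analysis of the raw data are expected because the published summaries are rounded.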

p(y | β0, β1, β2, σε, x) = Πi N(yi | β0 + β1xi1 + β2xi2, σε²)

β0 ~ N(0, 900)

βj ~ N(0, 900)   j = 1, 2

σε ~ Exp(1)

Core Code

Ch3Testi | Ch1Testi, Ch2Testi, β0, β1, β2, σε ~ N(β0 + β1Ch1Testi + β2Ch2Testi, σε²)

fitted.model <- stan_glm(
  Ch3Test ~ Ch1Test + Ch2Test,

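β0 ~ N(0, 900) = N(0, 30²)

βj ~ N(0, 900) = N(0, 30²) for j = 1, 2

σε ~ Exp(1)

fitted.model <- stan_glm(
  # normal() takes a standard deviation, so a variance of 900 is entered as 30
  prior_intercept = normal(0, 30, autoscale = FALSE),
  prior = normal(0, 30, autoscale = FALSE),
  # prior_aux is the prior on the error standard deviation (sigma)
  prior_aux = exponential(1, autoscale = FALSE),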

‘Regression model (Ch1Test and Ch2Test predictors) in Stan via rstanarm.R’

Convergence of 4 Chains for 5,000 Iterations After 1,000 Iterations of Warmup

Summary of 4 Chains for 5,000 Iterations After 1,000 Iterations of Warmup

            Mean    SD     95% HPD Lower   95% HPD Upper   Effective Sample Size
Intercept   -2.52   1.97   -6.24           1.46            21585
Ch1Test      0.66   0.17    0.31           0.97            13993
Ch2Test      0.38   0.10    0.18           0.59            16222
sigma        1.92   0.20    1.55           2.32            11706
R.squared    0.59   0.08    0.43           0.73            17241
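A 95% highest posterior density (HPD) interval can be approximated from posterior draws as the shortest interval containing 95% of the sorted draws. The base R sketch below illustrates the idea; `hpd_interval` is a hypothetical helper written for this note, and the draws are simulated from a skewed distribution rather than taken from the fitted model.

```r
# Shortest interval containing a proportion `prob` of the draws.
hpd_interval <- function(draws, prob = 0.95) {
  sorted <- sort(draws)
  n <- length(sorted)
  gap <- max(1, min(n - 1, round(prob * n)))  # steps spanned by each candidate interval
  lower <- sorted[1:(n - gap)]                # candidate lower endpoints
  upper <- sorted[(1 + gap):n]                # matching upper endpoints
  k <- which.min(upper - lower)               # narrowest candidate
  c(lower = lower[k], upper = upper[k])
}

# Illustrative draws from a right-skewed distribution (assumed, not model output)
set.seed(5)
draws <- rexp(10000, rate = 1)
h <- hpd_interval(draws)
h
quantile(draws, c(0.025, 0.975))              # equal-tailed interval, for comparison
```

For a skewed distribution the HPD interval is shorter than the equal-tailed interval; for a symmetric, unimodal posterior the two nearly coincide.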

            Mean    SD     95% HPD Lower   95% HPD Upper
Intercept   -2.52   1.97   -6.24           1.46
Ch1Test      0.66   0.17    0.31           0.97
Ch2Test      0.38   0.10    0.18           0.59
sigma        1.92   0.20    1.55           2.32
R.squared    0.59   0.08    0.43           0.73

Interpretation of the slope, R²?

Write Up

We conducted a Bayesian normal-theory linear regression analysis, specifying the outcome as

yi | β0, β, σε, xi ~ N(β0 + β1xi1 + β2xi2, σε²)   i = 1,…, n

and employing diffuse prior distributions

β0 ~ N(0, 900); βj ~ N(0, 900), j = 1, 2; σε ~ Exponential(1).

Four chains were run for 5,000 iterations following a warmup period of 1,000 iterations. Inspection of the trace plots and the potential scale reduction factor (PSRF, R̂) evidenced convergence. The marginal posterior distributions for the parameters are depicted in Figure xxxx and summarized in Table xxxx….
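The PSRF compares between-chain and within-chain variability; values near 1 are consistent with convergence. The base R sketch below implements the basic Gelman-Rubin form for one parameter. It is an illustration only: `psrf` is a hypothetical helper, the chains are simulated, and Stan-based software reports refined variants of this statistic, so values need not match exactly.

```r
# Basic potential scale reduction factor for one parameter.
# chains: matrix of post-warmup draws, one column per chain.
psrf <- function(chains) {
  n <- nrow(chains)                      # iterations per chain
  B <- n * var(colMeans(chains))         # between-chain variance
  W <- mean(apply(chains, 2, var))       # average within-chain variance
  var_plus <- (n - 1) / n * W + B / n    # pooled estimate of the posterior variance
  sqrt(var_plus / W)
}

# Simulated chains (assumed, not model output)
set.seed(3)
mixed <- matrix(rnorm(4000), nrow = 1000, ncol = 4)  # four chains, same distribution
stuck <- mixed
stuck[, 1] <- stuck[, 1] + 5                         # one chain sits in a different region

rhat_mixed <- psrf(mixed)   # close to 1
rhat_stuck <- psrf(stuck)   # well above 1
c(mixed = rhat_mixed, stuck = rhat_stuck)
```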

Table of Results from Bayesian Analysis

      Bayesian Analysis
      Mean    SD     95% Cred. Interval
β0    -2.52   1.97   (-6.24, 1.46)
β1     0.66   0.17   (0.31, 0.97)
β2     0.38   0.10   (0.18, 0.59)
σε     1.92   0.20   (1.55, 2.32)
R²     0.59   0.08   (0.43, 0.73)

Classical and Bayesian Analyses of the Traditional Model

      Frequentist Analysis of Traditional Model   Bayesian Analysis of Traditional Model
      Est.    SE     95% Conf. Interval           Mean    SD     95% Cred. Interval
β0    -2.54   1.93   (-6.41, 1.34)                -2.52   1.97   (-6.24, 1.46)
β1     0.66   0.17   (0.33, 0.99)                  0.66   0.17   (0.31, 0.97)
β2     0.38   0.10   (0.18, 0.59)                  0.38   0.10   (0.18, 0.59)
σε     1.95   0.28   (1.60, 2.37)                  1.92   0.20   (1.55, 2.32)
R²     0.60                                        0.59   0.08   (0.43, 0.73)

Numerically similar, conceptually different

Results for β0 troublesome
