
Module 4: Introduction to the Normal Gamma Model

Rebecca C. Steorts and Lei Qian

Agenda

- NormalGamma distribution
- posterior derivation
- application to IQ scores

NormalGamma distribution

Let $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \mathcal{N}(\mu, \lambda^{-1})$ and assume both

- the mean $\mu$ and
- the precision $\lambda = 1/\sigma^2$

are unknown.

The NormalGamma$(m, c, a, b)$ distribution, with $m \in \mathbb{R}$ and $c, a, b > 0$, is a joint distribution on $(\mu, \lambda)$ obtained by letting

$$\mu \mid \lambda \sim \mathcal{N}(m, (c\lambda)^{-1}), \qquad \lambda \sim \mathrm{Gamma}(a, b).$$

In other words, the joint p.d.f. is

$$p(\mu, \lambda) = p(\mu \mid \lambda)\, p(\lambda) = \mathcal{N}(\mu \mid m, (c\lambda)^{-1})\, \mathrm{Gamma}(\lambda \mid a, b),$$

which we will denote by NormalGamma$(\mu, \lambda \mid m, c, a, b)$.

NormalGamma distribution (continued)

It turns out that this provides a conjugate prior for $(\mu, \lambda)$.

One can show the posterior is

$$\mu, \lambda \mid x_{1:n} \sim \text{NormalGamma}(M, C, A, B), \tag{1}$$

i.e., $p(\mu, \lambda \mid x_{1:n}) = \text{NormalGamma}(\mu, \lambda \mid M, C, A, B)$, where

$$M = \frac{cm + \sum_{i=1}^n x_i}{c + n}, \qquad C = c + n, \qquad A = a + n/2,$$
$$B = b + \tfrac{1}{2}\Big(cm^2 - CM^2 + \sum_{i=1}^n x_i^2\Big).$$
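As a minimal sketch of this update in R (the function name normalGammaUpdate and its list-based return format are our own choices, not from the slides):

normalGammaUpdate <- function(x, m, c, a, b) {
  # Conjugate update: data x and prior hyperparameters (m, c, a, b)
  # give the posterior hyperparameters (M, C, A, B) of Equation 1.
  n <- length(x)
  C <- c + n
  M <- (c * m + sum(x)) / C
  A <- a + n / 2
  B <- b + (c * m^2 - C * M^2 + sum(x^2)) / 2
  list(M = M, C = C, A = A, B = B)
}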

NormalGamma distribution (continued)

For interpretation, $B$ can also be written (by rearranging terms) as

$$B = b + \tfrac{1}{2} \sum_{i=1}^n (x_i - \bar{x})^2 + \tfrac{1}{2} \frac{cn}{c + n} (\bar{x} - m)^2. \tag{2}$$
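To see why Equation 2 agrees with the earlier expression for $B$, substitute $CM = cm + n\bar{x}$ and expand (a step the slides leave implicit):

$$\begin{aligned}
cm^2 - CM^2 + \sum_{i=1}^n x_i^2
&= cm^2 - \frac{(cm + n\bar{x})^2}{c+n} + \sum_{i=1}^n x_i^2 \\
&= \frac{cn}{c+n}(\bar{x} - m)^2 - n\bar{x}^2 + \sum_{i=1}^n x_i^2 \\
&= \frac{cn}{c+n}(\bar{x} - m)^2 + \sum_{i=1}^n (x_i - \bar{x})^2,
\end{aligned}$$

and multiplying by $\tfrac{1}{2}$ and adding $b$ gives Equation 2.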

NormalGamma distribution (continued)

- $M$: Posterior mean for $\mu$. It is a weighted average (convex combination) of the prior mean and the sample mean:
  $$M = \frac{c}{c+n}\, m + \frac{n}{c+n}\, \bar{x}.$$
- $C$: "Sample size" for estimating $\mu$. (The standard deviation of $\mu \mid \lambda$ is $\lambda^{-1/2}/\sqrt{C}$.)
- $A$: Shape for posterior on $\lambda$. Grows linearly with sample size.
- $B$: Rate (1/scale) for posterior on $\lambda$. Equation 2 decomposes $B$ into the prior variation, observed variation (sample variance), and variation between the prior mean and sample mean:
  $$B = (\text{prior variation}) + \tfrac{1}{2} n\, (\text{observed variation}) + \tfrac{1}{2} \frac{cn}{c+n}\, (\text{variation between means}).$$

Posterior derivation

First, consider the NormalGamma density. Dropping constants of proportionality, multiplying out $(\mu - m)^2 = \mu^2 - 2\mu m + m^2$, and collecting terms, we have

$$\begin{aligned}
\text{NormalGamma}(\mu, \lambda \mid m, c, a, b)
&= \mathcal{N}(\mu \mid m, (c\lambda)^{-1})\, \mathrm{Gamma}(\lambda \mid a, b) \tag{3} \\
&= \sqrt{\frac{c\lambda}{2\pi}} \exp\!\Big(-\tfrac{1}{2} c\lambda (\mu - m)^2\Big)\, \frac{b^a}{\Gamma(a)}\, \lambda^{a-1} \exp(-b\lambda) \\
&\underset{\mu,\lambda}{\propto} \lambda^{a - 1/2} \exp\!\Big(-\tfrac{1}{2}\lambda\big(c\mu^2 - 2cm\mu + cm^2 + 2b\big)\Big). \tag{4}
\end{aligned}$$

Posterior derivation (continued)

Similarly, for any $x$,

$$\mathcal{N}(x \mid \mu, \lambda^{-1}) = \sqrt{\frac{\lambda}{2\pi}} \exp\!\Big(-\tfrac{1}{2}\lambda (x - \mu)^2\Big) \underset{\mu,\lambda}{\propto} \lambda^{1/2} \exp\!\Big(-\tfrac{1}{2}\lambda\big(\mu^2 - 2x\mu + x^2\big)\Big). \tag{5}$$

Posterior derivation (continued)

$$\begin{aligned}
p(\mu, \lambda \mid x_{1:n})
&\underset{\mu,\lambda}{\propto} p(\mu, \lambda)\, p(x_{1:n} \mid \mu, \lambda) \\
&\underset{\mu,\lambda}{\propto} \lambda^{a-1/2} \exp\!\Big(-\tfrac{1}{2}\lambda\big(c\mu^2 - 2cm\mu + cm^2 + 2b\big)\Big) \\
&\qquad \times \lambda^{n/2} \exp\!\Big(-\tfrac{1}{2}\lambda\big(n\mu^2 - 2(\textstyle\sum x_i)\mu + \sum x_i^2\big)\Big) \\
&= \lambda^{a + n/2 - 1/2} \exp\!\Big(-\tfrac{1}{2}\lambda\big((c+n)\mu^2 - 2(cm + \textstyle\sum x_i)\mu + cm^2 + 2b + \sum x_i^2\big)\Big) \\
&\overset{(a)}{=} \lambda^{A - 1/2} \exp\!\Big(-\tfrac{1}{2}\lambda\big(C\mu^2 - 2CM\mu + CM^2 + 2B\big)\Big) \\
&\overset{(b)}{\propto} \text{NormalGamma}(\mu, \lambda \mid M, C, A, B)
\end{aligned}$$

where step (b) is by Equation 4, and step (a) holds if $A = a + n/2$, $C = c + n$, $CM = cm + \sum x_i$, and $CM^2 + 2B = cm^2 + 2b + \sum x_i^2$.

Posterior derivation (continued)

This choice of $A$ and $C$ matches the claimed form of the posterior, and solving for $M$ and $B$, we get

$$M = \frac{cm + \sum x_i}{c + n} \qquad \text{and} \qquad B = b + \tfrac{1}{2}\Big(cm^2 - CM^2 + \sum x_i^2\Big),$$

as claimed.

An alternative derivation is to complete the square all the way (longer and much more tedious).

Do a teacher’s expectations influence student achievement?

Do a teacher’s expectations influence student achievement? In a famous study, Rosenthal and Jacobson (1968) performed an experiment in a California elementary school to try to answer this question. At the beginning of the year, all students were given an IQ test. For each class, the researchers randomly selected around 20% of the students, and told the teacher that these students were “spurters” that could be expected to perform particularly well that year. (This was not based on the test; the spurters were randomly chosen.) At the end of the year, all students were given another IQ test. The change in IQ score for the first-grade students was:¹

¹The original data is not available. This data is from the ex1321 dataset of the R package Sleuth3, which was constructed to match the summary statistics and conclusions of the original study.


Spurters/Control Data

# spurters
x <- c(18, 40, 15, 17, 20, 44, 38)
# controls
y <- c(-4, 0, -19, 24, 19, 10, 5, 10,
       29, 13, -9, -8, 20, -1, 12, 21,
       -7, 14, 13, 20, 11, 16, 15, 27,
       23, 36, -33, 34, 13, 11, -19, 21,
       6, 25, 30, 22, -28, 15, 26, -1, -2,
       43, 23, 22, 25, 16, 10, 29)

iqData <- data.frame(
  Treatment = c(rep("Spurters", length(x)), rep("Controls", length(y))),
  Gain = c(x, y)
)
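As a quick sanity check before plotting (our own addition, not from the slides), the per-group sample sizes, means, and standard deviations can be tabulated:

# Per-group summary statistics: n, mean, sd of the IQ-score changes
aggregate(Gain ~ Treatment, data = iqData,
          FUN = function(g) c(n = length(g), mean = mean(g), sd = sd(g)))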

An initial exploratory analysis

Plot the number of students versus the change in IQ score for the two groups. How strongly does this data support the hypothesis that the teachers’ expectations caused the spurters to perform better than their classmates?

Histogram of Change in IQ Scores

library(ggplot2)  # needed for the plotting code below

xLimits = seq(min(iqData$Gain) - (min(iqData$Gain) %% 5),
              max(iqData$Gain) + (max(iqData$Gain) %% 5),
              by = 5)

ggplot(data = iqData, aes(x = Gain,
                          fill = Treatment,
                          colour = I("black"))) +
  geom_histogram(position = "dodge", alpha = 0.5,
                 breaks = xLimits, closed = "left") +
  scale_x_continuous(breaks = xLimits, expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0), breaks = seq(0, 10, by = 1)) +
  ggtitle("Histogram of Change in IQ Scores") +
  labs(x = "Change in IQ Score", fill = "Group") +
  theme(plot.title = element_text(hjust = 0.5))

[Figure: Histogram of Change in IQ Scores. x axis: Change in IQ Score (−35 to 45 in steps of 5); y axis: count (0 to 10); groups: Controls, Spurters.]


IQ Tests and Modeling

IQ tests are purposefully calibrated to make the scores normally distributed, so it makes sense to use a normal model here:

$$\text{spurters: } X_1, \ldots, X_{n_S} \overset{\text{iid}}{\sim} \mathcal{N}(\mu_S, \lambda_S^{-1})$$
$$\text{controls: } Y_1, \ldots, Y_{n_C} \overset{\text{iid}}{\sim} \mathcal{N}(\mu_C, \lambda_C^{-1}).$$

- We are interested in the difference between the means; in particular, is $\mu_S > \mu_C$?
- We don’t know the standard deviations $\sigma_S = \lambda_S^{-1/2}$ and $\sigma_C = \lambda_C^{-1/2}$, and the sample seems too small to estimate them very well.

IQ Tests and Modeling

On the other hand, it is easy using a Bayesian approach: we just need to compute the posterior probability that $\mu_S > \mu_C$:

$$P(\mu_S > \mu_C \mid x_{1:n_S}, y_{1:n_C}).$$

Let’s use independent NormalGamma priors:

$$\text{spurters: } (\mu_S, \lambda_S) \sim \text{NormalGamma}(m, c, a, b)$$
$$\text{controls: } (\mu_C, \lambda_C) \sim \text{NormalGamma}(m, c, a, b)$$

Hyperparameter settings

- $m = 0$ (Don’t know whether students will improve or not, on average.)
- $c = 1$ (Unsure about how big the mean change will be; prior certainty in our choice of $m$ assessed to be equivalent to one datapoint.)
- $a = 1/2$ (Unsure about how big the standard deviation of the changes will be.)
- $b = 10^2 a$ (Standard deviation of the changes expected to be around $10 = \sqrt{b/a} = \mathbb{E}(\lambda)^{-1/2}$.)
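In R, these settings might be encoded as follows, a sketch using the hypothetical normalGammaUpdate helper from earlier; c0 is our renaming of the hyperparameter $c$ to avoid masking R’s c() function:

# Hyperparameters from the slide
m  <- 0
c0 <- 1
a  <- 1/2
b  <- 10^2 * a   # = 50, so sqrt(b/a) = 10

# Posterior hyperparameters for each group
postS <- normalGammaUpdate(x, m, c0, a, b)  # spurters
postC <- normalGammaUpdate(y, m, c0, a, b)  # controls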

Prior Samples

[Figure: scatterplot of prior samples; x axis: μ (Mean Change in IQ Score), roughly −50 to 50; y axis: λ^(−1/2) (Std. Dev. of Change), 0 to 40.]
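A figure like this can be reproduced with a short sketch; the sample size and base-graphics styling below are our guesses, not the slides’ original code:

# Sample from the prior: lambda ~ Gamma(a, b), then mu | lambda ~ N(m, 1/(c0 * lambda))
nSim   <- 5000
lambda <- rgamma(nSim, shape = a, rate = b)
mu     <- rnorm(nSim, mean = m, sd = 1 / sqrt(c0 * lambda))
plot(mu, 1 / sqrt(lambda),
     xlab = "mu (Mean Change in IQ Score)",
     ylab = "Std. Dev. of Change",
     main = "Prior Samples")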

Original question

“What is the posterior probability that $\mu_S > \mu_C$?”

- The easiest way to do this is to take a bunch of samples from each of the posteriors, and see what fraction of times we have $\mu_S > \mu_C$.
- This is an example of a Monte Carlo approximation (much more to come on this in the future).
- To do this, we draw $N = 10^6$ samples from each posterior:

$$(\mu_S^{(1)}, \lambda_S^{(1)}), \ldots, (\mu_S^{(N)}, \lambda_S^{(N)}) \sim \text{NormalGamma}(A_1, B_1, C_1, D_1)$$
$$(\mu_C^{(1)}, \lambda_C^{(1)}), \ldots, (\mu_C^{(N)}, \lambda_C^{(N)}) \sim \text{NormalGamma}(A_2, B_2, C_2, D_2)$$

Original question (continued)

What are the updated posterior values?

Having drawn these samples, we obtain the approximation

$$P(\mu_S > \mu_C \mid x_{1:n_S}, y_{1:n_C}) \approx \frac{1}{N} \sum_{i=1}^N \mathbb{1}\big(\mu_S^{(i)} > \mu_C^{(i)}\big) = \,??.$$

In lab this week, you will explore this data set further, and you will analyze it in your homework assignment.
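As a hedged sketch of how this Monte Carlo step might look in R (the drawMu helper is our own, and the numerical answer is left for the lab), one can sample $\lambda$ from the posterior Gamma and then $\mu \mid \lambda$ from the posterior normal:

# Draw posterior samples of mu: lambda ~ Gamma(A, B), mu | lambda ~ N(M, 1/(C * lambda))
drawMu <- function(post, N) {
  lambda <- rgamma(N, shape = post$A, rate = post$B)
  rnorm(N, mean = post$M, sd = 1 / sqrt(post$C * lambda))
}
N   <- 1e6
muS <- drawMu(postS, N)
muC <- drawMu(postC, N)
mean(muS > muC)  # Monte Carlo estimate of P(mu_S > mu_C | data)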