Binomial, Multinomial & Poisson · 2012-08-31 · Multinomial Distribution •Series of n...
Transcript of Binomial, Multinomial & Poisson · 2012-08-31 · Multinomial Distribution •Series of n...
Confidence Intervals
• Wald C.I.
• Exact C.I.
α
2= P (Y ≥ y) =
n�
j=y
�n
j
�pj(1− p)n−j
α
2= P (Y ≤ y) =
y�
j=0
�n
j
�pj(1− p)n−j
pL =�
1 +n− y + 1
yF2y,2(n+y−1)(1− α/2)
�−1
pU =�
1 +n− y
(y + 1)F2(y+1),2(n+y)(α/2)
�−1
C(p, n) =n�
j=0
I(j, p) ·�
n
j
�pj(1− p)n−j
M ≤ 1.96 ·�
1n
p(1− p)
n ≥Z2
α/2
Mp(1− p)
6
α
2= P (Y ≥ y) =
n�
j=y
�n
j
�pj(1− p)n−j
α
2= P (Y ≤ y) =
y�
j=0
�n
j
�pj(1− p)n−j
pL =�
1 +n− y + 1
yF2y,2(n+y−1)(1− α/2)
�−1
pU =�
1 +n− y
(y + 1)F2(y+1),2(n+y)(α/2)
�−1
C(p, n) =n�
j=0
I(j, p) ·�
n
j
�pj(1− p)n−j
M ≤ 1.96 ·�
1n
p(1− p)
n ≥Z2
α/2
Mp(1− p)
6
|p− π|�1nπ(1− π)
.∼ N(0, 1)
nπ ≥ 5, n(1− π) ≥ 5
|p− π|�1np(1− p)
< Zα/2
pU = p + Zα/2
�1n
p(1− p)
pL = p− Zα/2
�1n
p(1− p)
α = 0.05α/2 = 0.025
Zα/2 = 1.96n = 115
p = 0.6 =69115
pL = 0.6− 1.96 ·�
0.6 · 0.4115
=
= 0.6− 0.0895 = 0.5105
pU = 0.6 + 1.96 ·�
0.6 · 0.4115
=
= 0.6 + 0.0895 = 0.6895
p ± Zα/2
�1n
p(1− p)
P (Y ≤ y) =α
2
5
Coverage
• Definition: for a fixed value of a parameter the actual coverage probability of an interval estimator is the probability that the interval contains the parameter:
I(j,p) is 1, if the interval contains p for observation j and 0 otherwise
α
2= P (Y ≥ y) =
n�
j=y
�n
j
�pj(1− p)n−j
α
2= P (Y ≤ y) =
y�
j=0
�n
j
�pj(1− p)n−j
pL =�
1 +n− y + 1
yF2y,2(n+y−1)(1− α/2)
�−1
pU =�
1 +n− y
(y + 1)F2(y+1),2(n+y)(α/2)
�−1
C(p, n) =n�
j=0
I(j, p) ·�
n
j
�pj(1− p)n−j
M ≤ 1.96 ·�
1n
p(1− p)
n ≥Z2
α/2
Mp(1− p)
6
Wald & Exact 95% C.I.s
• Wald is excessively liberal
• Clopper-Pearson (Exact) is impractically conservative
Coverage of Confidence Intervals for Binomial
p
coverage
0.70
0.75
0.80
0.85
0.90
0.95
1.00
5
0.0 0.2 0.4 0.6 0.8 1.0
10
0.0 0.2 0.4 0.6 0.8 1.0
50
0.0 0.2 0.4 0.6 0.8 1.0
Method
Wald
Score
Exact
adj_Wald
Score
• Invert normal test using null, rather than estimated standard error:
p− po�1npo(1− po
= ±zα/2
p +z2α/2
2n± zα/2
�����
p(1− p) +z2α/2
4n
�/n
�
1 +z2α/2
n
�−1
p =Y
n
|p− π|�1np(1− p)
.∼ N(0, 1)
nπ ≥ 5, n(1− π) ≥ 5
λ ·��
β2 +�
α2�
{α0m, αm : m = 1, ...,M} M(p + 1){β0k, βk : k = 1, ...,K} K(M + 1)
gk(T ) =eTk
�K�=1 eT�
σ(ν) =1
1 + e−ν
Zm = σ(α0m + α�mX) m = 1, ...,M
Tk = β0k + β�kZ k = 1, ...,K
fk(X) = gk(T ) k = 1, ...,K
Πc = 2I�
i=1
J�
j=1
πij ·�
h>i
�
k>j
πhk =�
i,j
πcij
σ2(γ̂)∞ =16
n(ΠC + ΠD)4
I�
i=1
J�
j=1
πij
�ΠCπd
ij −ΠDπcij
�2
1
p̃ =y + 2n + 4
p̃ ± zα/2
�1n
p̃(1− p̃)
p− po�1npo(1− po)
= ±zα/2
p +z2α/2
2n± zα/2
�����
p(1− p) +z2α/2
4n
�/n
�
1 +z2α/2
n
�−1
p =Y
n
|p− π|�1np(1− p)
.∼ N(0, 1)
nπ ≥ 5, n(1− π) ≥ 5
λ ·��
β2 +�
α2�
{α0m, αm : m = 1, ...,M} M(p + 1){β0k, βk : k = 1, ...,K} K(M + 1)
gk(T ) =eTk
�K�=1 eT�
σ(ν) =1
1 + e−ν
Zm = σ(α0m + α�mX) m = 1, ...,M
Tk = β0k + β�kZ k = 1, ...,K
fk(X) = gk(T ) k = 1, ...,K
Πc = 2I�
i=1
J�
j=1
πij ·�
h>i
�
k>j
πhk =�
i,j
πcij
1
Wilson, 1937
Adjusted Wald
• ‘add x failures and x successes’, good values for x = 1,2:
p̃ =y + 2n + 4
p̃ + zα/2
�1n
p̃(1− p̃)
p− po�1npo(1− po
= ±zα/2
p +z2α/2
2n± zα/2
�����
p(1− p) +z2α/2
4n
�/n
�
1 +z2α/2
n
�−1
p =Y
n
|p− π|�1np(1− p)
.∼ N(0, 1)
nπ ≥ 5, n(1− π) ≥ 5
λ ·��
β2 +�
α2�
{α0m, αm : m = 1, ...,M} M(p + 1){β0k, βk : k = 1, ...,K} K(M + 1)
gk(T ) =eTk
�K�=1 eT�
σ(ν) =1
1 + e−ν
Zm = σ(α0m + α�mX) m = 1, ...,M
Tk = β0k + β�kZ k = 1, ...,K
fk(X) = gk(T ) k = 1, ...,K
Πc = 2I�
i=1
J�
j=1
πij ·�
h>i
�
k>j
πhk =�
i,j
πcij
1
p̃ =y + 2n + 4
p̃ ± zα/2
�1n
p̃(1− p̃)
p− po�1npo(1− po
= ±zα/2
p +z2α/2
2n± zα/2
�����
p(1− p) +z2α/2
4n
�/n
�
1 +z2α/2
n
�−1
p =Y
n
|p− π|�1np(1− p)
.∼ N(0, 1)
nπ ≥ 5, n(1− π) ≥ 5
λ ·��
β2 +�
α2�
{α0m, αm : m = 1, ...,M} M(p + 1){β0k, βk : k = 1, ...,K} K(M + 1)
gk(T ) =eTk
�K�=1 eT�
σ(ν) =1
1 + e−ν
Zm = σ(α0m + α�mX) m = 1, ...,M
Tk = β0k + β�kZ k = 1, ...,K
fk(X) = gk(T ) k = 1, ...,K
Πc = 2I�
i=1
J�
j=1
πij ·�
h>i
�
k>j
πhk =�
i,j
πcij
1
Score, adjusted Wald
• Score C.I has average coverage close to nominal value
• adjusted Wald is `simple fix’, makes liberal Wald slightly more conservative than Score
Coverage of Confidence Intervals for Binomial
p
coverage
0.70
0.75
0.80
0.85
0.90
0.95
1.00
5
0.0 0.2 0.4 0.6 0.8 1.0
10
0.0 0.2 0.4 0.6 0.8 1.0
50
0.0 0.2 0.4 0.6 0.8 1.0
Method
Wald
Score
Exact
adj_Wald
see also: Agresti & Coull, American Statistician, 1998
Multinomial Distribution
• Series of n independent and identical trials, where the outcome for each trial falls into one of K mutually exclusive categories with
• then
is called a two dimensional contingency table or cross-classification table. If πij denotesthe probability of pair (xi, yj) then the table
X\Y y1 y2 ... yJ
x1 π11 π12 ... π1J π1.
x2 π21 π22 ... π2J π2....
...... . . . ...
...xI πI1 πI2 ... πIJ πI.
π.1 π.2 ... π.J 1
contains the joint distribution and {π1., π2., ...,πI.} and {π.1, π.2, ...,π.J} are marginal dis-tributions.
G2 = 2
�
i
yi log�
mi
m0,i
�= 0.74
X2 =
�
i
(mi −m0,i)2
m0,i= 0.69
Ha : 0 ≤ π ≤ 1 with π1 + π2 + π3 = 1
Ho : π = πo
πo =
0.750.18750.0625
√n(p− π) −→ N(0,diag(π)− ππt)
Let Y be a (nominal) categorical variable, i.e. Y ∈ {y1, y2, ..., yK} with
P (Y = y1) = p(y1) = π1
P (Y = y2) = p(y2) = π2
K�
i=1
πi = 1
...
P (Y = yK) = p(yK) = πK
πi = P (Y = yi), 1 ≤ i ≤ K
The probability mass function then is
p(n1, n2, ..., nk) =n!
n1!n2! · ... · nk!πn1
1 πn22 · ... · πnk
k ,
3
is called a two dimensional contingency table or cross-classification table. If πij denotesthe probability of pair (xi, yj) then the table
X\Y y1 y2 ... yJ
x1 π11 π12 ... π1J π1.
x2 π21 π22 ... π2J π2....
...... . . . ...
...xI πI1 πI2 ... πIJ πI.
π.1 π.2 ... π.J 1
contains the joint distribution and {π1., π2., ...,πI.} and {π.1, π.2, ...,π.J} are marginal dis-tributions.
G2 = 2
�
i
yi log�
mi
m0,i
�= 0.74
X2 =
�
i
(mi −m0,i)2
m0,i= 0.69
Ha : 0 ≤ π ≤ 1 with π1 + π2 + π3 = 1
Ho : π = πo
πo =
0.750.18750.0625
√n(p− π) −→ N(0,diag(π)− ππt)
Let Y be a (nominal) categorical variable, i.e. Y ∈ {y1, y2, ..., yK} with
P (Y = y1) = p(y1) = π1
P (Y = y2) = p(y2) = π2
K�
i=1
πi = 1
...
P (Y = yK) = p(yK) = πK
πj = P (Yi = j), 1 ≤ j ≤ K, 1 ≤ i ≤ n
The probability mass function then is
p(n1, n2, ..., nk) =n!
n1!n2! · ... · nk!πn1
1 πn22 · ... · πnk
k ,
3
Multinomial
• Let yj be the number of times that we observe outcome j, then
• 0 ≤ yj ≤ n (for all j) and
• sum of yj is n
•
with n =�k
i=1 ni.
X = # of insects trapped overnight,or X = # of mosquito bites in an hour,⇒ no upper limit for n
X ∈ {0, 1, 2, 3, ...}Poisson probability mass function
P (X = x) = p(x) = e−µ µx
x!,
where λ is the rate parameter (remember, that the rate depends on the unit used).
E[X] = λ
V ar[X] = λ
skewness =√
λ
kurtosis = 1/λ
P (Y1 = y1, Y2 = y2, ..., YK = yK) =
=n!
y1!y2! · ... · yK !πy1
1 πy22 · ... · πyK
K
Yi ∼ Bn,πi E[Yi] = nπi
V ar[Yi] = nπi(1− πi)Cov(Yi, Yj) = −nπiπj
ML estimator for πi: π̂i = nin . Let L(π1, π2, ...,πk) =
�ki=1 πni
i the multinomial likelihoodfunction. With the additional constraint
�ki=1 πi = 1 the likelihood function is actually a
function in k − 1 variables:
L(π1, π2, ...,πk−1) =k−1�
i=1
πnii ·
�1−
k−1�
i=1
πi
�nk
Then
logL(π1, π2, ...,πk−1) = ni log πi + nk log
�1−
k−1�
i=1
πi
�
∂
∂πilogL(π1, π2, ...,πk−1) =
ni
πi+ nk
11−
�k−1i=1 πi
· (−1) =ni
πi− nk
πk
!= 0.
⇒ π̂i
πk=
ni
nk.
4
Multinomial
• Marginals Yi have Binomial distributions, and
• limiting distribution:
with p = (n1/n, n2/n, ..., nK/n)
with n =�k
i=1 ni.
X = # of insects trapped overnight,or X = # of mosquito bites in an hour,⇒ no upper limit for n
X ∈ {0, 1, 2, 3, ...}Poisson probability mass function
P (X = x) = p(x) = e−µ µx
x!,
where λ is the rate parameter (remember, that the rate depends on the unit used).
E[X] = λ
V ar[X] = λ
skewness =√
λ
kurtosis = 1/λ
P (Y1 = y1, Y2 = y2, ..., YK = yK) =
=n!
y1!y2! · ... · yK !πy1
1 πy22 · ... · πyK
K
Yi ∼ Bn,πi E[Yi] = nπi
V ar[Yi] = nπi(1− πi)Cov(Yi, Yj) = −nπiπj
ML estimator for πi: π̂i = nin . Let L(π1, π2, ...,πk) =
�ki=1 πni
i the multinomial likelihoodfunction. With the additional constraint
�ki=1 πi = 1 the likelihood function is actually a
function in k − 1 variables:
L(π1, π2, ...,πk−1) =k−1�
i=1
πnii ·
�1−
k−1�
i=1
πi
�nk
Then
logL(π1, π2, ...,πk−1) = ni log πi + nk log
�1−
k−1�
i=1
πi
�
∂
∂πilogL(π1, π2, ...,πk−1) =
ni
πi+ nk
11−
�k−1i=1 πi
· (−1) =ni
πi− nk
πk
!= 0.
⇒ π̂i
πk=
ni
nk.
4
is called a two dimensional contingency table or cross-classification table. If πij denotesthe probability of pair (xi, yj) then the table
X\Y y1 y2 ... yJ
x1 π11 π12 ... π1J π1.
x2 π21 π22 ... π2J π2....
...... . . . ...
...xI πI1 πI2 ... πIJ πI.
π.1 π.2 ... π.J 1
contains the joint distribution and {π1., π2., ...,πI.} and {π.1, π.2, ...,π.J} are marginal dis-tributions.
G2 = 2
�
i
yi log�
mi
m0,i
�= 0.74
X2 =
�
i
(mi −m0,i)2
m0,i= 0.69
Ha : 0 ≤ π ≤ 1 with π1 + π2 + π3 = 1
Ho : π = πo
πo =
0.750.18750.0625
√n(p− π) −→ N(0,diag(π)− ππt)
Let Y be a (nominal) categorical variable, i.e. Y ∈ {y1, y2, ..., yK} with
P (Y = y1) = p(y1) = π1
P (Y = y2) = p(y2) = π2
K�
i=1
πi = 1
...
P (Y = yK) = p(yK) = πK
πj = P (Yi = j), 1 ≤ j ≤ K, 1 ≤ i ≤ n
The probability mass function then is
p(n1, n2, ..., nk) =n!
n1!n2! · ... · nk!πn1
1 πn22 · ... · πnk
k ,
3
Poisson Distribution
• #of insects trapped overnight in tent, #mosquito bites in an hour, #stranded vehicles in 10 km of I-35
Simeon Poisson, 1781-1840with n =�k
i=1 ni.
X = # of insects trapped overnight,or X = # of mosquito bites in an hour,⇒ no upper limit for n
X ∈ {0, 1, 2, 3, ...}Poisson probability mass function
P (X = x) = p(x) = e−µ µx
x!,
where λ is the rate parameter (remember, that the rate depends on the unit used).
E[X] = λ
V ar[X] = λ
skewness =√
λ
kurtosis = 1/λ
P (Y1 = y1, Y2 = y2, ..., YK = yK) =
=n!
y1!y2! · ... · yK !πy1
1 πy22 · ... · πyK
K
Yi ∼ Bn,πi E[Yi] = nπi
V ar[Yi] = nπi(1− πi)Cov(Yi, Yj) = −nπiπj
ML estimator for πi: π̂i = nin . Let L(π1, π2, ...,πk) =
�ki=1 πni
i the multinomial likelihoodfunction. With the additional constraint
�ki=1 πi = 1 the likelihood function is actually a
function in k − 1 variables:
L(π1, π2, ...,πk−1) =k−1�
i=1
πnii ·
�1−
k−1�
i=1
πi
�nk
Then
logL(π1, π2, ...,πk−1) = ni log πi + nk log
�1−
k−1�
i=1
πi
�
∂
∂πilogL(π1, π2, ...,πk−1) =
ni
πi+ nk
11−
�k−1i=1 πi
· (−1) =ni
πi− nk
πk
!= 0.
⇒ π̂i
πk=
ni
nk.
4
with n =�k
i=1 ni.
X = # of insects trapped overnight,or X = # of mosquito bites in an hour,⇒ no upper limit for n
X ∈ {0, 1, 2, 3, ...}Poisson probability mass function
P (X = x) = p(x) = e−µ µx
x!,
where λ is the rate parameter (remember, that the rate depends on the unit used).
E[X] = λ
V ar[X] = λ
skewness =√
λ
kurtosis = 1/λ
P (Y1 = y1, Y2 = y2, ..., YK = yK) =
=n!
y1!y2! · ... · yK !πy1
1 πy22 · ... · πyK
K
Yi ∼ Bn,πi E[Yi] = nπi
V ar[Yi] = nπi(1− πi)Cov(Yi, Yj) = −nπiπj
ML estimator for πi: π̂i = nin . Let L(π1, π2, ...,πk) =
�ki=1 πni
i the multinomial likelihoodfunction. With the additional constraint
�ki=1 πi = 1 the likelihood function is actually a
function in k − 1 variables:
L(π1, π2, ...,πk−1) =k−1�
i=1
πnii ·
�1−
k−1�
i=1
πi
�nk
Then
logL(π1, π2, ...,πk−1) = ni log πi + nk log
�1−
k−1�
i=1
πi
�
∂
∂πilogL(π1, π2, ...,πk−1) =
ni
πi+ nk
11−
�k−1i=1 πi
· (−1) =ni
πi− nk
πk
!= 0.
⇒ π̂i
πk=
ni
nk.
4
X in {0,1,2,3,4,...} (no upper limit)
Poisson distribution
1 2 3 4 5 6
lambda= 1
0.000.150.30
1 3 5 7 9 11 13 15
lambda= 3
0.00
0.10
0.20
1 3 5 7 9 11 14 17 20 23
lambda= 5
0.00
0.10
1 5 9 13 18 23 28 33 38 43 48
lambda= 10
0.00
0.06
0.12
approximates Binomial for large n, small p
Normal approximation holds for large values of lambda
Properties of PoissonRelationship to Multinomial
• Let Y1, ..., YK be ind. Poisson with parameters µ1, ..., µK
• Y1 + ... + YK is Poisson with parameter ∑µi
Contingency Tables
• X and Y are categorical variables with I and J categories:
σ2(γ̂)∞ =16
n(ΠC + ΠD)4
I�
i=1
J�
j=1
πij
�ΠCπd
ij −ΠDπcij
�2
γ =ΠC −ΠD
ΠC + ΠD
θ :=π00 π11
π10 π01
π : (1− π)
πj|i=0 − πj|i=1 =: π1 − π2,
r :=πj=1|i=0
πj=1|i=1
χ21,0.05 = 3.84, χ2
1,0.01 = 6.634897
πij = πi+ · π+j
Ho : P ( heart disease | Cholesterol ≤ 220) =P ( heart disease | Cholesterol > 220)
π11
π11 + π12=
π21
π21 + π22
Let X, Y be two categorical variables, with I, J categories respectively and X ∈ {x1, x2, ..., xI}, Y ∈{y1, y2, ..., yJ}.
Then the pair (X, Y ) is categorical variable with IJ outcomes.The table
X\Y y1 y2 ... yJ
x1 n11 n12 ... n1J n1.
x2 n21 n22 ... n2J n2....
...... . . . ...
...xI nI1 nI2 ... nIJ nI.
n.1 n.2 ... n.J n
2
is called a two dimensional contingency table or cross-classification table. If πij denotesthe probability of pair (xi, yj) then the table
X\Y y1 y2 ... yJ
x1 π11 π12 ... π1J π1.
x2 π21 π22 ... π2J π2....
...... . . . ...
...xI πI1 πI2 ... πIJ πI.
π.1 π.2 ... π.J 1
contains the joint distribution and {π1., π2., ...,πI.} and {π.1, π.2, ...,π.J} are marginal dis-tributions.
G2 = 2
�
i
yi log�
mi
m0,i
�= 0.74
X2 =
�
i
(mi −m0,i)2
m0,i= 0.69
Ha : 0 ≤ π ≤ 1 with π1 + π2 + π3 = 1
Ho : π = πo
πo =
0.750.18750.0625
√n(p− π) −→ N(0,diag(π)− ππt)
Let Y be a (nominal) categorical variable, i.e. Y ∈ {y1, y2, ..., yK} with
P (Y = y1) = p(y1) = π1
P (Y = y2) = p(y2) = π2
K�
i=1
πi = 1
...
P (Y = yK) = p(yK) = πK
πj = P (Yi = j), 1 ≤ j ≤ K, 1 ≤ i ≤ n
The probability mass function then is
p(n1, n2, ..., nk) =n!
n1!n2! · ... · nk!πn1
1 πn22 · ... · πnk
k ,
3
Contingency Tables
• X and Y are categorical variables with I and J categories:
Poisson vs Multinomial Sampling
• Poisson Sampling:
• each cell nij is assumed to be Poisson distributed
• Overall sum is random
• Multinomial
• If overall sum is fixed, conditional probabilities become multinomial
Product Multinomial Sampling
• one of the margins in the contingency table is fixed
• e.g. rare disease, `pairing’ of combinations
• given the fixed margins, the other direction still has a multinomial distribution, resulting in a product of multinomials.
• set-up of traditional case-control study
Margins are fixed
• Both margins are fixed
• less frequent in studies, more frequent in inferential methods
• Hypergeometric distribution
Example: Cholesterol/Heart Disease
• 1329 patients of same age/sex
present absent
Cholesterol ≤ 220
> 220
y11= 20 y12= 553
y21= 72 y22= 684
mg/l
Coronary Disease
Cholesterol/Heart Disease
• Y = (y11, y12, y21, y22) ~ Mult (1329, π)
• Is incidence of heart disease independent of cholesterol levels?
i.e. is incidence rate of heart disease the same for both levels of cholesterol?
σ2(γ̂)∞ =16
n(ΠC + ΠD)4
I�
i=1
J�
j=1
πij
�ΠCπd
ij −ΠDπcij
�2
γ =ΠC −ΠD
ΠC + ΠD
θ :=π00 π11
π10 π01
π : (1− π)
πj|i=0 − πj|i=1 =: π1 − π2,
r :=πj=1|i=0
πj=1|i=1
χ21,0.05 = 3.84, χ2
1,0.01 = 6.634897
πij = πi+ · π+j
Ho : P ( heart disease | Cholesterol ≤ 220) =P ( heart disease | Cholesterol > 220)
π11
π11 + π12=
π21
π21 + π22
Let X, Y be two categorical variables, with I, J categories respectively and X ∈ {x1, x2, ..., xI}, Y ∈{y1, y2, ..., yJ}.
Then the pair (X, Y ) is categorical variable with IJ outcomes.The table
X\Y y1 y2 ... yJ
x1 n11 n12 ... n1J n1.
x2 n21 n22 ... n2J n2....
...... . . . ...
...xI nI1 nI2 ... nIJ nI.
n.1 n.2 ... n.J n
2
Cholesterol/Heart Disease
• equivalent to
• equivalent to
σ2(γ̂)∞ =16
n(ΠC + ΠD)4
I�
i=1
J�
j=1
πij
�ΠCπd
ij −ΠDπcij
�2
γ =ΠC −ΠD
ΠC + ΠD
θ :=π00 π11
π10 π01
π : (1− π)
πj|i=0 − πj|i=1 =: π1 − π2,
r :=πj=1|i=0
πj=1|i=1
χ21,0.05 = 3.84, χ2
1,0.01 = 6.634897
πij = πi+ · π+j
Ho : P ( heart disease | Cholesterol ≤ 220) =P ( heart disease | Cholesterol > 220)
π11
π11 + π12=
π21
π21 + π22
Let X, Y be two categorical variables, with I, J categories respectively and X ∈ {x1, x2, ..., xI}, Y ∈{y1, y2, ..., yJ}.
Then the pair (X, Y ) is categorical variable with IJ outcomes.The table
X\Y y1 y2 ... yJ
x1 n11 n12 ... n1J n1.
x2 n21 n22 ... n2J n2....
...... . . . ...
...xI nI1 nI2 ... nIJ nI.
n.1 n.2 ... n.J n
2
σ2(γ̂)∞ =16
n(ΠC + ΠD)4
I�
i=1
J�
j=1
πij
�ΠCπd
ij −ΠDπcij
�2
γ =ΠC −ΠD
ΠC + ΠD
θ :=π00 π11
π10 π01
π : (1− π)
πj|i=0 − πj|i=1 =: π1 − π2,
r :=πj=1|i=0
πj=1|i=1
χ21,0.05 = 3.84, χ2
1,0.01 = 6.634897
πij = πi+ · π+j
Ho : P ( heart disease | Cholesterol ≤ 220) =P ( heart disease | Cholesterol > 220)
π11
π11 + π12=
π21
π21 + π22
Let X, Y be two categorical variables, with I, J categories respectively and X ∈ {x1, x2, ..., xI}, Y ∈{y1, y2, ..., yJ}.
Then the pair (X, Y ) is categorical variable with IJ outcomes.The table
X\Y y1 y2 ... yJ
x1 n11 n12 ... n1J n1.
x2 n21 n22 ... n2J n2....
...... . . . ...
...xI nI1 nI2 ... nIJ nI.
n.1 n.2 ... n.J n
2
Cholesterol/Heart Disease• Expected Values under independence
present absent Total
Cholesterol ≤ 220
> 220
Total
13.66 533.33 573
52.33 703.67 756
92 1237 1329
mg/l
Coronary Disease
Cholesterol/Heart Disease
• loglikelihood ratio test G2 = 19.8
• Pearson test X2 = 18.4
• both G2 and X2 are chisquare distributed with df = 3 - 2 = 1independence seems to be violated
• Area plots (i.e. area represents #combinations)
• Built hierarchically, i.e. order of variables matters
• in R:mosaicplot() (base package)imosaic() (iplots package)productplots package
http://cran.r-project.org/doc/contrib/Short-refcard.pdf
Mosaicplots
• Measure of association between X and Y
• odds ratio = (ad)/(bc)
• odds ratio = 1 is independence
• log odds ratio is symmetric around 0
• log odds is approx Normal with variance 1/a + 1/b + 1/c + 1/d
X=0 X=1
Y=0
Y=1
a b
c d
Odds ratio
Reading the Odds
0.94
1-bbln
1-ddln
-0.85
+inf
0
-inf
2
-2-1
1
log odds scale
0.5
probability scale
0
1.0
b
a
d
c
X and Y are binary variables, Z is categorical with K categories
Death Penalties in Florida:X death penalty (yes/no)Y defendant’s race (black/white)Z victim’s race (black/white)
yes nowhiteblack
53 430 483
15 176 191
68 606 674
Marginal Table of X/Y
Marginal odds ratio: 53*176/(430*15) = 1.45 (±0.59)
slight indication in favor of black defendants
Odds ratios in 2 x 2 x K tables
yes nowhiteblack
53 414 46711 37 4864 451 515
Conditional Tables of X/Y
Conditional odds ratios:0.43 0
Z = white
yes nowhiteblack
0 16 164 139 1434 155 159
Z = black
very strong indication against black defendants
Odds ratios in 2 x 2 x K tables
• Simpson’s paradox: marginal association between X and Y is opposite to conditional associations between X and Y for each level of Z
• due to: very strong association between X and Z or Y and Z
Simpson’s paradox