Bayes’ Theorem
Let A and B1, . . . , Bk be events in a sample space Ω.
Inversion problem: given P(A|Bj) (and P(Bj)), find P(Bj|A)
Bayes’ Theorem:
P(Bj|A) = P(A|Bj) P(Bj) / P(A) = P(A|Bj) P(Bj) / ∑_{i=1}^k P(A|Bi) P(Bi)
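The inversion can be sketched numerically. The probabilities below are made up for illustration; the Bi partition the sample space, so the denominator is the law of total probability.

```python
# Discrete Bayes' theorem: recover P(Bj | A) from P(A | Bj) and P(Bj).
# The numbers are illustrative, not from the text.
p_B = [0.3, 0.5, 0.2]          # prior probabilities P(Bi), summing to 1
p_A_given_B = [0.9, 0.4, 0.1]  # conditional probabilities P(A | Bi)

# Law of total probability: P(A) = sum_i P(A | Bi) P(Bi)
p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))

# Bayes' theorem for each j
p_B_given_A = [pa * pb / p_A for pa, pb in zip(p_A_given_B, p_B)]
print(p_A)          # 0.49
print(p_B_given_A)  # posterior probabilities, summing to 1
```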
For continuous random variables X and Y, Bayes' Theorem is formulated
in terms of densities:
fX|Y(x|y) = fY|X(y|x) fX(x) / fY(y) = fY|X(y|x) fX(x) / ∫ fY|X(y|x) fX(x) dx
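The density version can be checked on a grid. This is a sketch with an assumed normal prior X ~ N(0, 1) and likelihood Y | X = x ~ N(x, 1) (not from the text); by conjugacy the exact posterior is X | Y = y ~ N(y/2, 1/2), which validates the grid computation.

```python
import numpy as np

# Continuous Bayes' theorem on a grid (illustrative sketch):
# prior X ~ N(0, 1), likelihood Y | X = x ~ N(x, 1).
y = 1.2
x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]
f_X = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
f_Y_given_X = np.exp(-(y - x)**2 / 2) / np.sqrt(2 * np.pi)

# denominator: fY(y) = ∫ fY|X(y|x) fX(x) dx, as a Riemann sum on the grid
f_Y = np.sum(f_Y_given_X * f_X) * dx
f_X_given_Y = f_Y_given_X * f_X / f_Y   # Bayes' theorem for densities

post_mean = np.sum(x * f_X_given_Y) * dx
print(post_mean)  # close to the exact posterior mean y/2 = 0.6
```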
Application to statistical inference:
Probabilistic model: f(y|θ), the distribution of Y for fixed θ
Statistical problem: given data y, make statements about θ
Likelihood: l(θ|y) = f(y|θ) (reflects the inversion problem)
Bayesian approach:
A Bayesian statistical (parametric) model consists of
f(y|θ), a parametric statistical model (likelihood function), and
π(θ), a prior distribution on the parameters.
The posterior distribution of the parameter θ is
π(θ|y) = fY|θ(y|θ) π(θ) / ∫_Θ fY|θ(y|θ) π(θ) dθ ∝ fY|θ(y|θ) π(θ)
The Bayesian modelling approach can be summarized by
posterior ∝ likelihood × prior.
Bayesian interpretation of probability
probability = (subjective) uncertainty
Bayesian Inference, Apr 20, 2004 - 1 -
Bayesian Inference
Example: Binomial distribution
Likelihood function
Y |θ ∼ Bin(n, θ)
Prior distribution
θ ∼ U(0, 1) = Beta(1, 1)
Posterior distribution
θ|Y ∼ Beta(1 + Y, 1 + n− Y )
Uncertainty about the parameter can be updated repeatedly when new data become available:
- take the current posterior distribution as the prior
- compute the new posterior distribution conditional on the new data
[Figure: five panels showing the posterior density of θ (x-axis θ ∈ [0, 1], y-axis posterior density) after successive updates with new data]
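The two updating steps can be sketched with the Beta-Binomial conjugacy from the example. The batch data below are made up; the point is that updating sequentially gives the same posterior as updating once with the pooled data.

```python
# Sequential Beta-Binomial updating (sketch; data are illustrative).
# Prior: theta ~ Beta(1, 1).  Each batch of Bernoulli trials updates
# (a, b) -> (a + successes, b + failures); the order does not matter.
a, b = 1.0, 1.0
batches = [(3, 10), (7, 10), (4, 10)]  # (successes, trials) per batch

for y, n in batches:
    a, b = a + y, b + (n - y)  # current posterior becomes the new prior

# Same result as one update with the pooled data: 14 successes in 30 trials
print(a, b)  # Beta(1 + 14, 1 + 16) = Beta(15, 17)
```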
The posterior distribution is used for inference about θ:
- posterior mean: E(θ|Y)
- posterior variance: var(θ|Y) = E((θ − E(θ|Y))² | Y)
- posterior confidence interval (credibility interval): θl, θr with ∫_{θl}^{θr} π(θ|Y) dθ = 1 − α
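For the Beta posterior of the binomial example these quantities have closed forms; the equal-tailed credibility interval can be read off the posterior quantile function. A sketch with illustrative data (Y = 14 successes in n = 30 trials), assuming SciPy is available:

```python
from scipy.stats import beta

# Posterior summaries for the binomial example: theta | Y ~ Beta(a, b)
# with a = 1 + Y, b = 1 + n - Y (illustrative data: Y = 14, n = 30).
a, b = 1 + 14, 1 + 30 - 14

post_mean = a / (a + b)                        # E(theta | Y)
post_var = a * b / ((a + b)**2 * (a + b + 1))  # var(theta | Y)

# Equal-tailed 95% credibility interval: P(theta_l < theta < theta_r | Y) = 0.95
alpha = 0.05
theta_l = beta.ppf(alpha / 2, a, b)
theta_r = beta.ppf(1 - alpha / 2, a, b)
print(post_mean, post_var)
print(theta_l, theta_r)
```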
Conjugate Priors
A mathematically convenient choice is a conjugate prior: the posterior
distribution belongs to the same parametric family as the prior distribution,
with different parameters:
Likelihood        Prior            Posterior
f(y|θ)            π(θ)             π(θ|y)

Normal            Normal           Normal
N(θ, σ²)          N(µ, τ²)         N((σ²µ + τ²y)/(σ² + τ²), σ²τ²/(σ² + τ²))

Poisson           Gamma            Gamma
Poisson(θ)        Γ(α, β)          Γ(α + y, β + 1)

Gamma             Gamma            Gamma
Γ(ν, θ)           Γ(α, β)          Γ(α + ν, β + y)

Binomial          Beta             Beta
Bin(n, θ)         Beta(α, β)       Beta(α + y, β + n − y)

Multinomial       Dirichlet        Dirichlet
Mk(θ1, …, θk)     D(α1, …, αk)     D(α1 + y1, …, αk + yk)

Normal            Gamma            Gamma
N(µ, 1/θ)         Γ(α, β)          Γ(α + 1/2, β + (µ − y)²/2)
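A table row can be verified numerically: the normalized product of likelihood and prior on a grid should match the tabulated posterior density. Below this is sketched for the Poisson-Gamma row with made-up hyperparameters; the Γ(α, β) entries use the rate parameterization, so SciPy's `scale` argument is 1/β.

```python
import numpy as np
from scipy.stats import gamma, poisson

# Check the Poisson-Gamma row: likelihood Poisson(theta), prior
# Gamma(alpha, beta) (rate parameterization), posterior Gamma(alpha + y, beta + 1).
alpha, beta_, y = 2.0, 1.5, 4

theta = np.linspace(1e-6, 30.0, 200001)
dtheta = theta[1] - theta[0]
prior = gamma.pdf(theta, alpha, scale=1 / beta_)
lik = poisson.pmf(y, theta)

unnorm = lik * prior
post_grid = unnorm / (unnorm.sum() * dtheta)   # normalize on the grid
post_exact = gamma.pdf(theta, alpha + y, scale=1 / (beta_ + 1))

err = np.max(np.abs(post_grid - post_exact))
print(err)  # small: the grid posterior matches the tabulated Gamma posterior
```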
Problems in the choice of prior:
- The conjugate priors might not reflect our uncertainty about θ correctly.
- In general, for non-conjugate priors the posterior distribution is not available in analytic form.
- It is difficult to describe uncertainty about θ in the form of a particular distribution. In particular, we might be uncertain about the parameters of the prior distribution (→ hierarchical modelling, empirical Bayesian methods).
Bayesian Analysis with Missing Data
Bayesian statistical model:
Data model:
f(Y |θ) complete-data likelihood
f(R|Y, ξ) missing-data mechanism
Prior distribution:
π(θ, ξ)
The posterior distribution of θ and ξ is
π(θ, ξ|Yobs, R) ∝ f(Yobs, R|θ, ξ) π(θ, ξ)
= ∫ f(Yobs, ymis, R|θ, ξ) π(θ, ξ) dymis
= ∫ f(Yobs, ymis|θ) f(R|Yobs, ymis, ξ) π(θ, ξ) dymis
If the data are missing at random (MAR), then
π(θ, ξ|Yobs, R) ∝ ∫ f(Yobs, ymis|θ) f(R|Yobs, ξ) π(θ, ξ) dymis
= ∫ f(Yobs, ymis|θ) dymis · f(R|Yobs, ξ) π(θ, ξ)
= f(Yobs|θ) f(R|Yobs, ξ) π(θ, ξ)
For inference on θ, we consider the marginal posterior distribution of θ
π(θ|Yobs, R) = ∫_Ξ π(θ, ξ|Yobs, R) dξ ∝ ∫_Ξ f(Yobs|θ) f(R|Yobs, ξ) π(θ, ξ) dξ
If the parameters are distinct in the sense that
π(θ, ξ) = π(θ) π(ξ)
then the marginal posterior distribution of θ satisfies
π(θ|Yobs, R) ∝ f(Yobs|θ) π(θ) · ∫_Ξ f(R|Yobs, ξ) π(ξ) dξ
It follows that
π(θ|Yobs, R) = f(Yobs|θ) π(θ) / ∫_Θ f(Yobs|θ) π(θ) dθ
and hence π(θ|Yobs, R) = π(θ|Yobs).
Result:
The missing data mechanism is ignorable for posterior inference about
the parameter θ if
- the data are missing at random (MAR), and
- the parameters θ and ξ are distinct, that is, π(θ, ξ) = π(θ) π(ξ).
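The ignorability result can be checked numerically on a grid. The example below is a sketch with made-up data: Bernoulli(θ) observations with 7 successes in 10 observed trials and 3 values missing completely at random with probability ξ, and flat priors so that π(θ, ξ) = π(θ) π(ξ). The marginal posterior of θ then coincides with the posterior that ignores R entirely.

```python
import numpy as np

# Numerical check of ignorability (illustrative sketch; data are made up).
theta = np.linspace(1e-3, 1 - 1e-3, 999)  # success probability of the data model
xi = np.linspace(1e-3, 1 - 1e-3, 999)     # probability that a value is missing

# Observed data: 7 successes in 10 observed Bernoulli trials, 3 values missing.
f_Yobs = theta**7 * (1 - theta)**3        # f(Yobs | theta), up to a constant
f_R = xi**3 * (1 - xi)**10                # f(R | Yobs, xi), up to a constant (MAR)

# Joint posterior on the grid under flat priors: it factorizes over (theta, xi)
joint = np.outer(f_Yobs, f_R)
marg_theta = joint.sum(axis=1)
marg_theta /= marg_theta.sum()            # marginal posterior pi(theta | Yobs, R)

ignores_R = f_Yobs / f_Yobs.sum()         # posterior pi(theta | Yobs), R ignored
err = np.max(np.abs(marg_theta - ignores_R))
print(err)  # zero up to floating-point error
```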