
Expectation-Maximization (EM)

Chapter 3 (Duda et al.) – Section 3.9

CS479/679 Pattern Recognition
Dr. George Bebis

Expectation-Maximization (EM)

• EM is an iterative method to perform ML estimation:

– Starts with an initial estimate for θ.

– Refines the current estimate iteratively to increase the likelihood of the observed data:

$p(D \mid \theta)$

Expectation-Maximization (EM)

• EM represents a general framework – it works best in situations where the data is incomplete (or can be thought of as being incomplete).

– Some creativity is required to recognize where the EM algorithm can be used.

– Standard method for estimating the parameters of Mixtures of Gaussians (MoG).

Incomplete Data

• Often, ML estimation cannot be applied directly because certain features cannot be measured.

• The EM algorithm is ideal for problems with unobserved (missing) data.

Example (Moon, 1996)

• Let $x = (x_1, x_2, x_3)$ with $x_1 + x_2 + x_3 = k$.

• Assume a trinomial distribution:

$p(x_1, x_2, x_3 \mid \theta) = \dfrac{k!}{x_1!\, x_2!\, x_3!}\, p_1^{x_1} p_2^{x_2} p_3^{x_3}$

where the cell probabilities $p_1, p_2, p_3$ depend on the unknown parameter $\theta$.
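To make the pmf concrete, here is a minimal numeric check in Python. The counts and cell probabilities below are placeholder values (the slide's parameterization of $p_1, p_2, p_3$ in terms of $\theta$ is not shown), and the library call is compared against the formula evaluated directly.

```python
from math import factorial
from scipy.stats import multinomial

# Trinomial pmf: p(x1,x2,x3 | theta) = k!/(x1! x2! x3!) p1^x1 p2^x2 p3^x3,
# with x1 + x2 + x3 = k. The values below are placeholders; in the example
# the cell probabilities would be functions of the unknown parameter theta.
k = 10
p = [0.2, 0.3, 0.5]   # assumed cell probabilities, summing to 1
x = [2, 3, 5]         # counts, summing to k

# Library evaluation and direct evaluation of the formula agree:
lib = multinomial.pmf(x, n=k, p=p)
coef = factorial(k) // (factorial(x[0]) * factorial(x[1]) * factorial(x[2]))
direct = coef * p[0]**x[0] * p[1]**x[1] * p[2]**x[2]
print(lib, direct)
```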

Example (Moon, 1996) (cont’d)

EM: Main Idea

• If x were available, we could use ML to estimate θ:

$\hat{\theta} = \arg\max_{\theta} \ln p(D_x \mid \theta)$

• Since x is not available, maximize instead the expectation of $\ln p(D_x \mid \theta)$ with respect to the unknown variables, given $D_y$ and an estimate of θ:

$Q(\theta; \theta^t) = E_{\text{unobserved}}\big[\ln p(D_x \mid \theta) \mid D_y, \theta^t\big]$

EM Steps

(1) Initialization
(2) Expectation
(3) Maximization
(4) Test for convergence

EM Steps (cont’d)

(1) Initialization Step: initialize the algorithm with a guess $\theta^0$.

(2) Expectation Step: performed with respect to the unobserved variables, using the current estimate of the parameters and conditioned upon the observations:

$Q(\theta; \theta^t) = E_{\text{unobserved}}\big[\ln p(D_x \mid \theta) \mid D_y, \theta^t\big]$

– When $\ln p(D_x \mid \theta)$ is a linear function of the unobserved variables, the expectation step is equivalent to computing $E_{\text{unobserved}}[x \mid D_y, \theta^t]$ and substituting it into $\ln p(D_x \mid \theta)$.

EM Steps (cont’d)

(3) Maximization Step: provides a new estimate of the parameters:

$\theta^{t+1} = \arg\max_{\theta} Q(\theta; \theta^t)$

(4) Test for Convergence: if $|\theta^{t+1} - \theta^t| < \epsilon$, stop; otherwise, go to Step 2.
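As a sketch only, the four steps map onto a generic loop like the following; `e_step`, `m_step`, and `theta0` are model-specific placeholders, not part of the slides.

```python
import numpy as np

def em(y, e_step, m_step, theta0, eps=1e-6, max_iter=200):
    """Generic EM loop mirroring the four steps above.

    e_step(y, theta) -> expected values of the unobserved quantities
    m_step(y, stats) -> new theta maximizing Q(theta; theta_t)
    Both callables (and theta0) are model-specific placeholders.
    """
    theta = np.asarray(theta0, dtype=float)          # (1) Initialization
    for _ in range(max_iter):
        stats = e_step(y, theta)                     # (2) Expectation
        theta_new = np.asarray(m_step(y, stats))     # (3) Maximization
        if np.max(np.abs(theta_new - theta)) < eps:  # (4) Convergence test
            return theta_new
        theta = theta_new
    return theta
```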

Example (Moon, 1996) (cont’d)

• Suppose some of the counts $x_i$ cannot be observed directly, while the complete-data likelihood is the trinomial pmf:

$p(x_1, x_2, x_3 \mid \theta) = \dfrac{k!}{x_1!\, x_2!\, x_3!}\, p_1^{x_1} p_2^{x_2} p_3^{x_3}$

Example (Moon, 1996) (cont’d)

Let’s look at the M-step for a minute before completing the E-step …

• Take the expected value of the complete-data log-likelihood $\ln p(x_1, x_2, x_3 \mid \theta)$ (the trinomial pmf above).

Example (Moon, 1996) (cont’d)

Let’s go back and complete the E-step now …

• We only need to estimate the unobserved counts: since the log-likelihood is linear in them, it suffices to compute their conditional expectations given the observed data and $\theta^t$.

Example (Moon, 1996) (cont’d)

(see Moon’s paper, page 53, for a proof)

Example (Moon, 1996) (cont’d)

• Initialization: choose $\theta^0$.

• Expectation Step: compute the conditional expectations of the unobserved counts given the data and $\theta^t$.

• Maximization Step: update the estimate $\theta^{t+1}$.

• Convergence Test: stop when $|\theta^{t+1} - \theta^t| < \epsilon$.

Example (Moon, 1996) (cont’d)

[Plot: the estimate $\theta^t$ over iterations $t$.]

Convergence properties of EM

• The solution depends on the initial estimate $\theta^0$.

• At each iteration, a value of θ is computed so that the likelihood function does not decrease.

• There is no guarantee that the algorithm will converge to a global maximum.

• The algorithm is guaranteed to be stable, i.e., there is no chance of "overshooting" or diverging from the maximum.


Mixture of 2D Gaussians - Example

Mixture Model

$p(x \mid \Theta) = \sum_{k=1}^{K} \pi_k\, p(x \mid \theta_k)$, where $\pi_k \ge 0$ and $\sum_{k=1}^{K} \pi_k = 1$

[Diagram: K component densities combined with mixing weights $\pi_1, \pi_2, \pi_3, \ldots, \pi_K$.]

Mixture of 1D Gaussians - Example

[Figure: a mixture of three 1D Gaussians with mixing weights $\pi_1 = 0.3$, $\pi_2 = 0.2$, $\pi_3 = 0.5$.]
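A small sketch of sampling from such a mixture: the weights come from the slide, while the means and standard deviations are assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mixing weights from the slide; the means and std. deviations below are
# assumed purely for illustration (the original slide shows only a figure).
pi    = np.array([0.3, 0.2, 0.5])
mu    = np.array([-4.0, 0.0, 3.0])   # assumed component means
sigma = np.array([1.0, 0.5, 1.5])    # assumed component std. deviations

# Ancestral sampling: pick a component with probability pi_k, then draw
# from that component's Gaussian.
n = 1000
z = rng.choice(3, size=n, p=pi)
x = rng.normal(mu[z], sigma[z])

print(np.bincount(z) / n)   # empirical weights, close to pi
```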

Mixture Parameters

Fitting a Mixture Model to a set of observations Dx

• Two fundamental problems:

(1) Estimate the number of mixture components K

(2) Estimate mixture parameters (πk , θk), k=1,2,…,K

Mixtures of Gaussians (see Chapter 10)

$p(x \mid \Theta) = \sum_{k=1}^{K} \pi_k\, p(x \mid \theta_k)$

where each component is a Gaussian:

$p(x \mid \theta_k) = N(x; \mu_k, \Sigma_k) = \dfrac{1}{(2\pi)^{d/2} |\Sigma_k|^{1/2}} \exp\!\left(-\tfrac{1}{2}(x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k)\right)$

• The parameters $\theta_k$ are $(\mu_k, \Sigma_k)$.
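As a quick sanity check, the component density can be evaluated numerically; the values of $\mu_k$ and $\Sigma_k$ below are assumed for illustration.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Assumed illustrative parameters for one component (d = 2).
mu_k    = np.array([0.0, 0.0])
Sigma_k = np.array([[1.0, 0.5],
                    [0.5, 2.0]])
x = np.array([0.5, -1.0])

# Library evaluation of N(x; mu_k, Sigma_k) ...
lib = multivariate_normal.pdf(x, mean=mu_k, cov=Sigma_k)

# ... and direct evaluation of the density formula above.
d = len(x)
diff = x - mu_k
coef = 1.0 / ((2 * np.pi) ** (d / 2) * np.linalg.det(Sigma_k) ** 0.5)
direct = coef * np.exp(-0.5 * diff @ np.linalg.inv(Sigma_k) @ diff)
print(lib, direct)   # the two values agree
```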

Mixtures of Gaussians (cont’d)

[Diagram: the K Gaussian components weighted by $\pi_1, \pi_2, \pi_3, \ldots, \pi_K$.]

Estimating Mixture Parameters Using ML – not easy!

• Directly maximizing $\ln p(D_x \mid \Theta) = \sum_i \ln \sum_k \pi_k\, p(x_i \mid \theta_k)$ is hard: the sum inside the logarithm couples the parameters and yields nonlinear equations.

Estimating Mixture Parameters Using EM: Case of Unknown Means

• Assumptions: the number of components K, the priors $\pi_k$, and the covariances $\Sigma_k$ are known; only the means $\mu_k$ are unknown.

• Observation: if we knew which component had generated each sample $x_i$, estimating the means would be an ordinary ML problem … but we don't!

Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)

• Introduce hidden or unobserved variables $z_i$ indicating which mixture component generated each sample $x_i$.

• Main steps using EM

Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)

• Expectation Step: $E(z_{ik})$ is just the probability that $x_i$ was generated by the k-th component:

$E(z_{ik}) = \dfrac{\pi_k\, p(x_i \mid \mu_k^t)}{\sum_{j=1}^{K} \pi_j\, p(x_i \mid \mu_j^t)}$

Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)

• Maximization Step: re-estimate each mean as the responsibility-weighted average of the samples:

$\mu_k^{t+1} = \dfrac{\sum_{i=1}^{n} E(z_{ik})\, x_i}{\sum_{i=1}^{n} E(z_{ik})}$
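A minimal sketch of this unknown-means case in Python (1D for brevity), assuming known, equal priors and a known common standard deviation; the synthetic data and initial guesses are illustrative only.

```python
import numpy as np
from scipy.stats import norm

def em_unknown_means(x, sigma, pi, mu0, eps=1e-6, max_iter=500):
    """EM for a 1D Gaussian mixture in which only the means are unknown.

    Assumes the priors pi and the common std. deviation sigma are known,
    matching the 'unknown means' case above. x: (n,) data; mu0: (K,) guess.
    """
    mu = np.asarray(mu0, dtype=float)
    for _ in range(max_iter):
        # E-step: E(z_ik) = pi_k N(x_i; mu_k) / sum_j pi_j N(x_i; mu_j)
        resp = pi * norm.pdf(x[:, None], loc=mu, scale=sigma)   # shape (n, K)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: mu_k = sum_i E(z_ik) x_i / sum_i E(z_ik)
        mu_new = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
        done = np.max(np.abs(mu_new - mu)) < eps
        mu = mu_new
        if done:
            break
    return mu

# Synthetic demo: two components with true means -2 and 3.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 300)])
print(em_unknown_means(x, sigma=1.0, pi=np.array([0.5, 0.5]), mu0=[0.0, 1.0]))
```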

Estimating Mixture Parameters Using EM: Case of Unknown Means (cont’d)

• Summary: alternate the E-step (compute $E(z_{ik})$) and the M-step (update $\mu_k$) until the estimates stop changing.

Estimating Mixture Parameters Using EM: General Case

• Need to review Lagrange Optimization first …

Lagrange Optimization

• Maximize $f(x)$ subject to the constraint $g(x) = 0$.

• Form the Lagrangian $L(x, \lambda) = f(x) + \lambda\, g(x)$, set $\nabla_x L = 0$ together with $g(x) = 0$, and solve for $x$ and $\lambda$: n+1 equations / n+1 unknowns.

Lagrange Optimization (cont’d)

• Example

Maximize $f(x_1, x_2) = x_1 x_2$ subject to the constraint $g(x_1, x_2) = x_1 + x_2 - 1 = 0$.

$L(x_1, x_2, \lambda) = f(x_1, x_2) + \lambda\, g(x_1, x_2) = x_1 x_2 + \lambda (x_1 + x_2 - 1)$

$\dfrac{\partial L}{\partial x_1} = x_2 + \lambda = 0$

$\dfrac{\partial L}{\partial x_2} = x_1 + \lambda = 0$

$x_1 + x_2 - 1 = 0$

3 equations / 3 unknowns, giving $x_1 = x_2 = 1/2$ and $\lambda = -1/2$.
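The example can be checked symbolically; this is a sketch using SymPy, not part of the original slides.

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam')

# Lagrangian for: maximize f = x1*x2 subject to g = x1 + x2 - 1 = 0
L = x1 * x2 + lam * (x1 + x2 - 1)

# Three equations (dL/dx1 = 0, dL/dx2 = 0, g = 0) in three unknowns.
sol = sp.solve([sp.diff(L, x1), sp.diff(L, x2), x1 + x2 - 1], [x1, x2, lam])
print(sol)   # {x1: 1/2, x2: 1/2, lam: -1/2}
```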

Estimating Mixture Parameters Using EM: General Case

• Introduce hidden or unobserved variables $z_i$ indicating which mixture component generated each sample $x_i$.

Estimating Mixture Parameters Using EM: General Case (cont’d)

• Expectation Step: compute the responsibilities using the current estimates of all parameters:

$E(z_{ik}) = \dfrac{\pi_k^t\, p(x_i \mid \theta_k^t)}{\sum_{j=1}^{K} \pi_j^t\, p(x_i \mid \theta_j^t)}$

Estimating Mixture Parameters Using EM: General Case (cont’d)

• Maximization Step: use Lagrange optimization to enforce the constraint $\sum_k \pi_k = 1$, which gives the updates:

$\pi_k^{t+1} = \dfrac{1}{n} \sum_{i=1}^{n} E(z_{ik})$

$\mu_k^{t+1} = \dfrac{\sum_i E(z_{ik})\, x_i}{\sum_i E(z_{ik})}$

$\Sigma_k^{t+1} = \dfrac{\sum_i E(z_{ik})\, (x_i - \mu_k^{t+1})(x_i - \mu_k^{t+1})^T}{\sum_i E(z_{ik})}$
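A minimal 1D sketch of the general case, combining the E-step responsibilities with the three updates above; the data, initialization, and fixed iteration count are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def em_gmm_1d(x, pi, mu, sigma, n_iter=100):
    """General-case EM for a 1D Gaussian mixture: the priors, means, and
    variances are all re-estimated, starting from the given initial values."""
    pi = np.asarray(pi, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = len(x)
    for _ in range(n_iter):
        # E-step: responsibilities E(z_ik)
        resp = pi * norm.pdf(x[:, None], loc=mu, scale=sigma)   # (n, K)
        resp /= resp.sum(axis=1, keepdims=True)
        Nk = resp.sum(axis=0)                                   # effective counts
        # M-step (the Lagrange-constrained updates above):
        pi = Nk / n
        mu = (resp * x[:, None]).sum(axis=0) / Nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)
    return pi, mu, sigma

# Synthetic demo: recover weights, means, and spreads of two components.
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-3, 0.8, 400), rng.normal(2, 1.5, 600)])
print(em_gmm_1d(x, pi=[0.5, 0.5], mu=[-1.0, 1.0], sigma=[1.0, 1.0]))
```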

Estimating Mixture Parameters Using EM: General Case (cont’d)

• Summary: iterate the E-step and M-step updates above until the parameter estimates converge.

Estimating the Number of Components K