Raleigh, NC North Carolina State University Department of … · A. A. Markov Anniversary Meeting...

Updating Markov Chains

Carl Meyer, Amy Langville

Department of Mathematics, North Carolina State University, Raleigh, NC

A. A. Markov Anniversary Meeting

June 13, 2006

Intro

Assumptions

Very large irreducible chain

— $m = O(10^9)$ number of states

— $Q_{m \times m}$ old transition matrix

— $\phi^T = (\phi_1, \phi_2, \ldots, \phi_m)$ old stationary distribution

Updates

Change some transition probabilities

Add or delete some states

— $P_{n \times n}$ new transition matrix (irreducible)

— $\pi^T = (\pi_1, \pi_2, \ldots, \pi_n)$ new distribution (unknown)

Use $\phi^T$ to compute $\pi^T$

Exact (Theoretical) Updating

Perturbation Formula

$P = Q - E \implies \pi^T = \phi^T - \epsilon^T$, where $\epsilon^T = \phi^T E Z (I + EZ)^{-1}$

Fundamental Matrix

$(I - Q)^{\#}$ (group inverse)

Requires $m = n$ (no states added or deleted)
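The perturbation formula can be checked numerically on a small chain. The following is only a sketch under stated assumptions: $Z$ is taken to be the group inverse $(I - Q)^{\#}$, computed through the standard identity $(I - Q)^{\#} = (I - Q + e\phi^T)^{-1} - e\phi^T$, and the helper names `stationary` and `group_inverse`, the random 5-state chain, and the choice of perturbed rows are all illustrative and not taken from the talk.

```python
import numpy as np

def stationary(P):
    """Stationary row vector of an irreducible stochastic matrix P."""
    n = P.shape[0]
    # Stack pi^T (I - P) = 0 with the normalization sum(pi) = 1.
    M = np.vstack([(np.eye(n) - P).T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return pi

def group_inverse(Q, phi):
    """(I - Q)^# computed as (I - Q + e phi^T)^{-1} - e phi^T."""
    n = Q.shape[0]
    e_phiT = np.outer(np.ones(n), phi)
    return np.linalg.inv(np.eye(n) - Q + e_phiT) - e_phiT

rng = np.random.default_rng(0)
n = 5
Q = rng.random((n, n))
Q /= Q.sum(axis=1, keepdims=True)        # old chain

P = Q.copy()
P[[0, 3]] += 0.3 * rng.random((2, n))    # perturb two rows
P /= P.sum(axis=1, keepdims=True)        # new chain, same size (m = n)

E = Q - P                                # so that P = Q - E
phi = stationary(Q)
Z = group_inverse(Q, phi)
eps = phi @ E @ Z @ np.linalg.inv(np.eye(n) + E @ Z)

pi_formula = phi - eps                   # pi^T = phi^T - eps^T
pi_direct = stationary(P)                # reference value computed from P
print(np.abs(pi_formula - pi_direct).max())
```

The formula needs the full $n \times n$ matrix $Z$, which is why it serves as a theoretical reference rather than a practical method for chains with billions of states.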

One Row At A Time (Sherman–Morrison)

$p_i^T = q_i^T - \delta^T \implies \pi^T = \phi^T - \epsilon^T$

$$\epsilon^T = \left( \frac{\phi_i}{1 + \delta^T A^{\#}_{*i}} \right) \delta^T A^{\#}, \qquad A^{\#} = (I - Q)^{\#}$$

$$(I - P)^{\#} = A^{\#} + e\epsilon^T \left[ A^{\#} - \gamma I \right] - \frac{A^{\#}_{*i}\,\epsilon^T}{\phi_i}, \qquad \gamma = \frac{\epsilon^T A^{\#}_{*i}}{\phi_i}$$

Not Practical For Large Problems
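A small numerical check of the one-row update is sketched below. It assumes the sign convention $p_i^T = q_i^T - \delta^T$, writes `A_sharp[:, i]` for the column $A^{\#}_{*i}$, and reuses the identity $(I - Q)^{\#} = (I - Q + e\phi^T)^{-1} - e\phi^T$ to get a reference group inverse. The helper names and the random test chain are illustrative assumptions.

```python
import numpy as np

def stationary(P):
    """Stationary row vector of an irreducible stochastic matrix P."""
    n = P.shape[0]
    M = np.vstack([(np.eye(n) - P).T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return pi

def group_inverse(Q, phi):
    """(I - Q)^# computed as (I - Q + e phi^T)^{-1} - e phi^T."""
    n = Q.shape[0]
    e_phiT = np.outer(np.ones(n), phi)
    return np.linalg.inv(np.eye(n) - Q + e_phiT) - e_phiT

rng = np.random.default_rng(2)
n, i = 5, 2                               # update row i only
Q = rng.random((n, n))
Q /= Q.sum(axis=1, keepdims=True)

p_new = rng.random(n)
p_new /= p_new.sum()
delta = Q[i] - p_new                      # p_i^T = q_i^T - delta^T
P = Q.copy()
P[i] = p_new

phi = stationary(Q)
A_sharp = group_inverse(Q, phi)
col_i = A_sharp[:, i]                     # column i of A^#

# Updated stationary distribution.
eps = phi[i] / (1.0 + delta @ col_i) * (delta @ A_sharp)
pi_updated = phi - eps

# Updated group inverse.
gamma = eps @ col_i / phi[i]
e = np.ones(n)
P_sharp = (A_sharp
           + np.outer(e, eps) @ (A_sharp - gamma * np.eye(n))
           - np.outer(col_i, eps) / phi[i])

print(np.abs(pi_updated - stationary(P)).max())
print(np.abs(P_sharp - group_inverse(P, pi_updated)).max())
```

Both printed discrepancies should be at rounding level. The cost is dominated by storing and updating the dense $n \times n$ group inverse, which matches the remark that the approach is not practical for large problems.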

Restarted Power Method

$x_{j+1}^T = x_j^T P$ with $x_0^T = \phi^T$ (assume aperiodic)

— Requires $m = n$ (no states added or deleted)

— Requires $\phi^T \approx \pi^T$ (generally means $P \approx Q$)

— Asymptotic rate of convergence: $R = -\log_{10} |\lambda_2|$. Need about $1/R$ iterations to eventually gain one additional significant digit of accuracy

Example:

• Suppose $\mathrm{digit}_1(\phi^T) = \mathrm{digit}_1(\pi^T)$ (the old and new distributions agree in the first significant digit)

• Want 12 digits of accuracy

• RPM: about $11/R$ iterations

• Start from scratch: about $12/R$ iterations

• Only 8.3% reduction ($1/12 \approx 0.083$)

A Little Better, But Not Great
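The comparison between restarting from $\phi^T$ and starting from scratch can be tried on a toy chain. The sketch below is illustrative only: the 50-state random chain, the size of the perturbation, the uniform "from scratch" start, and the 1-norm stopping test are all assumptions, and the iteration counts on such a small dense chain say nothing about web-scale behavior.

```python
import numpy as np

def power_method(P, x0, tol=1e-12, max_iter=10_000):
    """Iterate x^T <- x^T P until successive iterates differ by < tol (1-norm)."""
    x = x0 / x0.sum()
    for j in range(1, max_iter + 1):
        x_next = x @ P
        if np.abs(x_next - x).sum() < tol:
            return x_next, j
        x = x_next
    return x, max_iter

rng = np.random.default_rng(1)
n = 50
Q = rng.random((n, n)) + 0.01
Q /= Q.sum(axis=1, keepdims=True)            # old chain (positive, so aperiodic)
phi, _ = power_method(Q, np.ones(n))         # old stationary distribution

P = Q.copy()
P[:3] += 0.5 * rng.random((3, n))            # change a few transition probabilities
P /= P.sum(axis=1, keepdims=True)            # new chain with m = n

pi_restart, iters_restart = power_method(P, phi)          # x_0 = phi
pi_scratch, iters_scratch = power_method(P, np.ones(n))   # x_0 = uniform
print("restarted:", iters_restart, "from scratch:", iters_scratch)
```

Because both runs share the same asymptotic rate $R$, a better starting vector only removes the iterations needed for the digits that are already correct.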

Censoring

Partition (not necessarily NCD)

$$
P_{n \times n} =
\begin{array}{c|cccc}
 & G_1 & G_2 & \cdots & G_k \\ \hline
G_1 & P_{11} & P_{12} & \cdots & P_{1k} \\
G_2 & P_{21} & P_{22} & \cdots & P_{2k} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
G_k & P_{k1} & P_{k2} & \cdots & P_{kk}
\end{array}
$$

Censored Chains

Records state of process only when chain visits states in $G_i$.

Visits to states outside of $G_i$ are ignored.

Censored Transition Matrices

$C_i = P_{ii} + P_{i\star}(I - P_i^{\star})^{-1} P_{\star i}$ (Stochastic Complements)
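The stochastic complement can be computed directly for a small chain. In the sketch below, $P_{i\star}$ is read as the rows of $G_i$ restricted to columns outside $G_i$, $P_{\star i}$ as the columns of $G_i$ restricted to rows outside $G_i$, and $P_i^{\star}$ as the principal block on the states outside $G_i$; this reading, the helper names, and the random 6-state chain are assumptions made for illustration.

```python
import numpy as np

def stationary(P):
    """Stationary row vector of an irreducible stochastic matrix P."""
    n = P.shape[0]
    M = np.vstack([(np.eye(n) - P).T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return pi

def stochastic_complement(P, idx):
    """C = P_ii + P_i* (I - P_i^*)^{-1} P_*i for the states listed in idx."""
    n = P.shape[0]
    idx = np.asarray(idx)
    rest = np.setdiff1d(np.arange(n), idx)
    P_ii = P[np.ix_(idx, idx)]
    P_i_rest = P[np.ix_(idx, rest)]
    P_rest_i = P[np.ix_(rest, idx)]
    P_rest = P[np.ix_(rest, rest)]
    # Solve (I - P_rest) X = P_rest_i instead of forming the inverse.
    X = np.linalg.solve(np.eye(len(rest)) - P_rest, P_rest_i)
    return P_ii + P_i_rest @ X

rng = np.random.default_rng(3)
n = 6
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)

G1 = [0, 1, 2]                        # hypothetical group of states
C1 = stochastic_complement(P, G1)
s1 = stationary(C1)                   # censored distribution: s^T C = s^T

pi = stationary(P)
print(C1.sum(axis=1))                 # each row sums to 1
print(s1, pi[G1] / pi[G1].sum())      # the two vectors agree
```

The censored distribution equals the corresponding block of $\pi^T$ rescaled to sum to one, which is the fact the aggregation step relies on.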

Aggregation

Censored Distributions

$s_i^T C_i = s_i^T$

Aggregation Matrix

$$
A_{k \times k} =
\begin{bmatrix}
s_1^T P_{11} e & \cdots & s_1^T P_{1k} e \\
\vdots & \ddots & \vdots \\
s_k^T P_{k1} e & \cdots & s_k^T P_{kk} e
\end{bmatrix},
\qquad
e = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}
$$

Aggregated Distribution

$\alpha^T A = \alpha^T$, $\alpha^T = (\alpha_1, \alpha_2, \ldots, \alpha_k)$

Aggregation Theorem

$\pi^T = (\pi_1^T \mid \pi_2^T \mid \cdots \mid \pi_k^T) = (\alpha_1 s_1^T \mid \alpha_2 s_2^T \mid \cdots \mid \alpha_k s_k^T)$
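The aggregation theorem can be verified end to end on a small example. The sketch below uses exact censored distributions obtained from stochastic complements; the three-group partition, the helper names, and the random chain are illustrative assumptions.

```python
import numpy as np

def stationary(P):
    """Stationary row vector of an irreducible stochastic matrix P."""
    n = P.shape[0]
    M = np.vstack([(np.eye(n) - P).T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return pi

def stochastic_complement(P, idx):
    """C = P_ii + P_i* (I - P_i^*)^{-1} P_*i for the states listed in idx."""
    rest = np.setdiff1d(np.arange(P.shape[0]), idx)
    X = np.linalg.solve(np.eye(len(rest)) - P[np.ix_(rest, rest)],
                        P[np.ix_(rest, idx)])
    return P[np.ix_(idx, idx)] + P[np.ix_(idx, rest)] @ X

rng = np.random.default_rng(4)
n = 6
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)

groups = [[0, 1], [2, 3, 4], [5]]     # hypothetical partition G_1, G_2, G_3
k = len(groups)

# Censored distributions s_i^T C_i = s_i^T.
s = [stationary(stochastic_complement(P, g)) for g in groups]

# Aggregation matrix with entries s_i^T P_ij e.
A = np.empty((k, k))
for i, gi in enumerate(groups):
    for j, gj in enumerate(groups):
        A[i, j] = s[i] @ P[np.ix_(gi, gj)].sum(axis=1)

alpha = stationary(A)                 # aggregated distribution

# Assemble pi^T = (alpha_1 s_1^T | ... | alpha_k s_k^T).
pi_agg = np.empty(n)
for i, gi in enumerate(groups):
    pi_agg[gi] = alpha[i] * s[i]

print(np.abs(pi_agg - stationary(P)).max())
```

In this exact form the censored distributions are as expensive to obtain as $\pi^T$ itself, so the value of the theorem lies in what happens when the $s_i^T$ are replaced by cheap approximations.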

Partitioning

Intuition

Update relatively small number of states in large sparse chain

— Effects are primarily local

— Most stationary probabilities not significantly affected.

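State Space Partition

$S = G \cup \bar{G}$, $G = \{\text{states likely to be most affected, newly added states}\}$

— $g = |G| \ll |\bar{G}|$

Induced Matrix Partition

$$
P_{n \times n} =
\begin{array}{c|cc}
 & G & \bar{G} \\ \hline
G & P_{11} & P_{12} \\
\bar{G} & P_{21} & P_{22}
\end{array}
=
\begin{bmatrix}
p_{11} & \cdots & p_{1g} & P_{1\star} \\
\vdots & \ddots & \vdots & \vdots \\
p_{g1} & \cdots & p_{gg} & P_{g\star} \\
P_{\star 1} & \cdots & P_{\star g} & P_{22}
\end{bmatrix}
$$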

Specialized Aggregation

Censored Transition Matrices

$C_1 = \cdots = C_g = [1]_{1 \times 1}$, $C_{g+1} = P_{22} + P_{21}(I - P_{11})^{-1}P_{12}$

Censored Distributions

$s_1^T = \cdots = s_g^T = 1$, $s_{g+1}^T = s_{g+1}^T C_{g+1}$

Aggregation Matrix

$$
A =
\begin{bmatrix}
P_{11} & P_{12}e \\
s_{g+1}^T P_{21} & 1 - s_{g+1}^T P_{21} e
\end{bmatrix}_{(g+1) \times (g+1)}
$$

Aggregated Distribution

$\alpha^T = (\alpha_1, \ldots, \alpha_g, \alpha_{g+1})$

Aggregation Theorem

$\pi^T = (\pi_1, \ldots, \pi_g \mid \bar{\pi}^T) = (\alpha_1, \ldots, \alpha_g \mid \alpha_{g+1} s_{g+1}^T)$

Old Distribution (reordered)

$\phi^T = (\phi_1, \phi_2, \ldots \mid \bar{\phi}^T)$

The Assumption

$\bar{\phi}^T \approx \bar{\pi}^T \implies \bar{\phi}^T \approx \alpha_{g+1} s_{g+1}^T \implies s_{g+1}^T \approx \bar{\phi}^T / (\bar{\phi}^T e)$

Approximate $s_{g+1}^T$ $\longrightarrow$ approximate $A$ $\longrightarrow$ approximate $\alpha^T$

Updated Distribution

$\pi^T \approx (\alpha_1, \ldots, \alpha_g \mid \alpha_{g+1} s_{g+1}^T)$

Summary

Reorder & Partition Updated State Space

$S = G \cup \bar{G}$, $G = \{\text{New + Most affected}\}$, $\bar{G} = \{\text{Less affected}\}$

Use Less Affected Components From Old Distribution

$\bar{\phi}^T \leftarrow$ components from $\phi^T$ corresponding to states in $\bar{G}$

Approximate Censored Distribution

$s^T \leftarrow \bar{\phi}^T / (\bar{\phi}^T e)$

Form Approximate Aggregation Matrix

$$
A \leftarrow
\begin{bmatrix}
P_{11} & P_{12}e \\
s^T P_{21} & 1 - s^T P_{21} e
\end{bmatrix}_{(g+1) \times (g+1)}
$$

Compute Approximate Aggregated Distribution

$\alpha^T \leftarrow (\alpha_1, \ldots, \alpha_g, \alpha_{g+1})$

Approximate Updated Distribution

$\pi^T \leftarrow (\alpha_1, \ldots, \alpha_g \mid \alpha_{g+1} s^T)$

Iterate? No! At A Fixed Point

Iterative Aggregation

Move Off Of Fixed Point With Power Step

1. $s^T \leftarrow \bar{\phi}^T / (\bar{\phi}^T e)$

2. $A \leftarrow \begin{bmatrix} P_{11} & P_{12}e \\ s^T P_{21} & 1 - s^T P_{21} e \end{bmatrix}_{(g+1) \times (g+1)}$

3. $\alpha^T \leftarrow (\alpha_1, \ldots, \alpha_g, \alpha_{g+1})$

4. $\pi^T \leftarrow (\alpha_1, \ldots, \alpha_g \mid \alpha_{g+1} s^T)$

5. $\psi^T \leftarrow \pi^T P$ (Also makes progress toward convergence when aperiodic)

6. If $\|\psi^T - \pi^T\| < \tau$ then quit; else $s^T \leftarrow \bar{\psi}^T / (\bar{\psi}^T e)$ and return to step 2

Theorem

If $C = P_{22} + P_{21}(I - P_{11})^{-1}P_{12}$ is aperiodic, then the iteration is convergent for all partitions $S = G \cup \bar{G}$

— $|\lambda_2(C)|$ determines rate of convergence
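The iteration above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the states are taken to be already reordered so that the first `g` indices form $G$, the small aggregated chain is solved with a dense least-squares call, the stopping test uses the 1-norm, and the function name `iad_update` and the random test chain are hypothetical.

```python
import numpy as np

def stationary(P):
    """Stationary row vector of an irreducible stochastic matrix P."""
    n = P.shape[0]
    M = np.vstack([(np.eye(n) - P).T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return pi

def iad_update(P, phi_bar, g, tol=1e-10, max_iter=1000):
    """Iterative aggregation; states 0..g-1 form G, the rest form G-bar."""
    P11, P12, P21 = P[:g, :g], P[:g, g:], P[g:, :g]
    s = phi_bar / phi_bar.sum()              # approximate censored distribution
    for k in range(1, max_iter + 1):
        A = np.empty((g + 1, g + 1))         # approximate aggregation matrix
        A[:g, :g] = P11
        A[:g, g] = P12.sum(axis=1)           # P12 e
        A[g, :g] = s @ P21                   # s^T P21
        A[g, g] = 1.0 - A[g, :g].sum()       # 1 - s^T P21 e
        alpha = stationary(A)                # aggregated distribution
        approx = np.concatenate([alpha[:g], alpha[g] * s])
        psi = approx @ P                     # power step moves off the fixed point
        if np.abs(psi - approx).sum() < tol:
            return psi, k
        s = psi[g:] / psi[g:].sum()          # refreshed censored distribution
    return psi, max_iter

rng = np.random.default_rng(5)
n, g = 30, 5
Q = rng.random((n, n))
Q /= Q.sum(axis=1, keepdims=True)            # old chain
phi = stationary(Q)

P = Q.copy()
P[:g] += rng.random((g, n))                  # the first g states are the affected ones
P /= P.sum(axis=1, keepdims=True)            # updated chain

pi_iad, iters = iad_update(P, phi[g:], g)
print(iters, np.abs(pi_iad - stationary(P)).max())
```

Only the $(g+1) \times (g+1)$ aggregated chain is solved exactly in each pass; the large block enters through the products $s^T P_{21}$ and $\pi^T P$, which keeps the per-iteration cost close to that of one power step when $g$ is small.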

Google's PageRank

Random Walk On WWW Link Structure

$$
H_{ij} =
\begin{cases}
1/(\text{total \# outlinks from page } P_i) & \text{if } P_i \to P_j, \\
0 & \text{otherwise}
\end{cases}
$$

Google Matrix

$P = \alpha(H + E) + (1 - \alpha)F$

— $(H + E)$ & $F$ are stochastic; $\mathrm{rank}(E) = \mathrm{rank}(F) = 1$

— $0 < \alpha < 1$

— PageRank $= \pi^T$

Power Law Distribution

If ordered by magnitude $\pi(1) \ge \pi(2) \ge \cdots \ge \pi(n)$, then

— $\pi(i) \approx \alpha i^{-k}$ for $k \approx 2.109$ [Donato, Laura, Leonardi, 2002] [Pandurangan, Raghavan, & Upfal, 2004]

— Relatively few large states (i.e., important sites)
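A tiny Google matrix can be assembled as follows. The slides only state that $H + E$ and $F$ are stochastic with $\mathrm{rank}(E) = \mathrm{rank}(F) = 1$; the specific choices here (a uniform vector `v`, $E$ replacing the zero rows of dangling pages by `v`, $F = e v^T$, $\alpha = 0.85$, and the four-page link graph) are common conventions adopted as assumptions for illustration.

```python
import numpy as np

# Hypothetical link graph: page -> pages it links to. Page 3 has no outlinks.
links = {0: [1, 2], 1: [2], 2: [0], 3: []}
n = len(links)

H = np.zeros((n, n))
for i, outs in links.items():
    for j in outs:
        H[i, j] = 1.0 / len(outs)            # 1 / (# outlinks from page i)

v = np.full(n, 1.0 / n)                      # assumed uniform vector
dangling = (H.sum(axis=1) == 0).astype(float)
E = np.outer(dangling, v)                    # rank one: fixes the zero rows of H
F = np.outer(np.ones(n), v)                  # rank one: F = e v^T
alpha = 0.85                                 # assumed value with 0 < alpha < 1

P = alpha * (H + E) + (1 - alpha) * F        # Google matrix

pi = v.copy()
for _ in range(200):                         # plain power method
    pi = pi @ P

print(pi)                                    # PageRank vector pi^T
print(np.argsort(-pi))                       # pages ordered by decreasing rank
```

Sorting the entries of $\pi^T$ in decreasing order, as in the last line, is the ordering in which the power law behavior described on the slide is observed for real web graphs.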

“L” Curves

[Figure: six “L”-curve panels, one per dataset: censorship, movies, mathworks, abortion, genetics, CA]

Experiments

The Updates

# Nodes Added = 3

# Nodes Removed = 50

# Links Added = 10 (Different values have little effect on results)

# Links Removed = 20

Stopping Criterion

1-norm of residual $< 10^{-10}$

Movies

Power Method

Iterations  Time
17          .40

Iterative Aggregation

|G|   Iterations  Time
5     12          .39
10    12          .37
15    11          .36
20    11          .35
25    11          .31
50    9           .31
100   9           .33
200   8           .35
300   7           .39
400   6           .47

nodes = 451 links = 713

Censorship

Power Method

Iterations  Time
38          1.40

Iterative Aggregation

|G|   Iterations  Time
5     38          1.68
10    38          1.66
15    38          1.56
20    20          1.06
25    20          1.05
50    10          .69
100   8           .55
200   6           .53
300   6           .65
400   5           .70

nodes = 562 links = 736

MathWorks

Power Method

Iterations  Time
54          1.25

Iterative Aggregation

|G|   Iterations  Time
5     53          1.18
10    52          1.29
15    52          1.23
20    42          1.05
25    20          1.13
50    18          .70
100   16          .70
200   13          .70
300   11          .83
400   10          1.01

nodes = 517 links = 13,531

Abortion

Power Method

Iterations  Time
106         37.08

Iterative Aggregation

|G|   Iterations  Time
5     109         38.56
10    105         36.02
15    107         38.05
20    107         38.45
25    97          34.81
50    53          18.80
100   13          5.18
250   12          5.62
500   6           5.21
750   5           10.22
1000  5           14.61

nodes = 1,693 links = 4,325

Genetics

Power Method

Iterations  Time
92          91.78

Iterative Aggregation

|G|   Iterations  Time
5     91          88.22
10    92          92.12
20    71          72.53
50    25          25.42
100   19          20.72
250   13          14.97
500   7           11.14
1000  5           17.76
1500  5           31.84

nodes = 2,952 links = 6,485

California

Power Method

Iterations  Time
176         5.85

Iterative Aggregation

|G|   Iterations  Time
500   19          1.12
1000  15          .92
1250  20          1.04
1500  14          .90
2000  13          1.17
5000  6           1.25

nodes = 9,664 links = 16,150

“L” Curves

[Figure: the six “L”-curve panels again, each with a marked point: censorship 100, movies 50, mathworks 50, abortion 250, genetics 250, CA 2000]

Quadratic Extrapolation

nodes = 10,000 links = 101,118 [Kamvar, Haveliwala, Manning, Golub, 2003]

[Figure: convergence plot on a log10 scale versus iteration (0 to 180), with curves for the power method, the power method with quad. extrap., IAD, and IAD with quad. extrap.]

Timings

            Iterations  Time (sec)  |G|
Power       162         9.69
Power+Quad  81          5.93
IAD         21          2.22        2000
IAD+Quad    16          1.85        2000

nodes = 10,000 links = 101,118

Conclusion

Iterative aggregation shows promise for updating Markov chains,

especially for those having power law distributions

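Leveling Off Point

$\pi(i) \approx \alpha i^{-k}$

$\left| \dfrac{d\pi(i)}{di} \right| \approx \epsilon$ for some user-defined tolerance $\epsilon$

$i_{\text{level}} \approx \left( \dfrac{k\alpha}{\epsilon} \right)^{1/(k+1)}$

Perhaps better: $i_{\text{level}} \approx f(n) \left( \dfrac{k\alpha}{\epsilon} \right)^{1/(k+1)}$

For WWW: $g_{\text{opt}} \approx f(n) \left[ \dfrac{2.109\,\alpha}{\epsilon} \right]^{1/3.109}$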
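The leveling-off estimate follows from a one-line calculation. The derivation below is a sketch that treats the rank index $i$ as a continuous variable, which is an assumption made only so that the derivative is defined; the scaling factor $f(n)$ is left unspecified, as on the slide.

```latex
\begin{align*}
\pi(i) &\approx \alpha\, i^{-k}
  &&\text{(power law for the sorted probabilities)} \\
\left|\frac{d\pi(i)}{di}\right| &\approx k\,\alpha\, i^{-(k+1)}
  &&\text{(slope of the ``L'' curve at position } i\text{)} \\
k\,\alpha\, i_{\text{level}}^{-(k+1)} &= \epsilon
  &&\text{(slope has dropped to the tolerance } \epsilon\text{)} \\
i_{\text{level}} &= \left(\frac{k\,\alpha}{\epsilon}\right)^{1/(k+1)}
  &&\text{(solve for } i_{\text{level}}\text{)} \\
k \approx 2.109 \;&\Longrightarrow\;
i_{\text{level}} \approx \left(\frac{2.109\,\alpha}{\epsilon}\right)^{1/3.109}
  &&\text{(exponent } 1/(k+1) = 1/3.109\text{)}
\end{align*}
```

States ranked before $i_{\text{level}}$ sit on the steep part of the curve, which is the motivation for placing them in $G$.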
