Transcript of Chapter 7: PageRank

Page 1

Chapter 7 PageRank

Angsheng Li

Institute of Software, Chinese Academy of Sciences

Advanced Algorithms, UCAS

1 April 2017

Page 2

Outline

1. Background

2. Web graph

3. Google’s matrix

4. Teleportation

5. Personalised vector

6. Sensitivity

7. Proofs

8. Local algorithms

9. Exercises

Page 3

The new phenomena

Brin and Page, 1995 - 1998

1. The current-generation search engine

2. Billions of queries every day

3. What is the principle behind it?

4. How good is the current-generation search engine?

Page 4

The graph

• Massive directed graph

• Nodes: webpages

• Directed edges (hyperlinks), including inlinks and outlinks

• The question: Rank the web pages by importance.

Page 5

The PageRank thesis

A page is important if it is pointed to by many important pages.

Page 6

Brin and Page, 1998

Established the equation of the PageRank thesis. The PageRank of a page P_i, written r(P_i), is the sum of the PageRanks of all the pages pointing to P_i, that is,

r(P_i) = Σ_{P_j ∈ B_i} r(P_j)/|P_j|,    (1)

• B_i: the set of pages pointing to P_i,

• |P_j|: the number of outlinks from page P_j.

Page 7

Recurrence of the PageRank

r_{k+1}(P_i) = Σ_{P_j ∈ B_i} r_k(P_j)/|P_j|,
r_0(P_i) = 1/n.    (2)

The stationary solution of the recursive equation (2) gives rise to the PageRank of a graph G.
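As a concrete illustration, here is a minimal Python sketch of the recurrence in Equation (2) on a toy graph; the graph, tolerance, and iteration cap are illustrative choices, not from the slides.

```python
# A direct transcription of Equation (2): r_{k+1}(P_i) = sum over
# in-neighbours P_j of r_k(P_j)/|P_j|, starting from r_0 = 1/n.

def pagerank_recurrence(out_links, tol=1e-12, max_iter=1000):
    nodes = list(out_links)
    n = len(nodes)
    r = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        nxt = {v: 0.0 for v in nodes}
        for j, outs in out_links.items():
            if outs:                      # a dangling node contributes nothing
                share = r[j] / len(outs)  # r_k(P_j) / |P_j|
                for i in outs:
                    nxt[i] += share
        if sum(abs(nxt[v] - r[v]) for v in nodes) < tol:
            return nxt
        r = nxt
    return r

# Example: a small strongly connected directed graph, given by outlinks.
graph = {1: [2, 3], 2: [3], 3: [1]}
print(pagerank_recurrence(graph))
```

On graphs with sinks this iteration leaks probability mass, which is exactly the problem the matrix S and teleportation address below.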

Page 8

Matrix representation

H_{ij} = 1/|P_i| if there is an edge from node i to node j, and 0 otherwise.    (3)

|P_i|: the number of outlinks from node i. H = (H_{ij}) is the PageRank matrix of G.

Page 9

PageRank solution

Let π^T be a 1 × n vector. Set

π^{(k+1)T} = π^{(k)T} H,
π^{(0)T} = (1/n) e^T,    (4)

where e^T = (1, 1, · · · , 1). For equation (4), we require:

• convergence and the interpretation of the solution

• uniqueness of the solution

• invariance with respect to the choice of π^{(0)}

• the number of iterations needed for convergence

Page 10

Rank sinks

[Figure: a three-node graph in which nodes 1 and 2 both point into node 3, which has no outlinks.]

All the PageRank goes to node 3.

Page 11

Matrix S

To solve the sink problem, define a vector a by

a_i = 1 if node i has no outgoing links, and 0 otherwise.    (5)

Definition. Define

S = H + (1/n) a e^T,

where e^T = (1, 1, · · · , 1).

Intuition: if node i has no outgoing link, then from node i the random walk jumps to any node uniformly at random. S is the transition probability matrix of a Markov chain.

Page 12

Google’s matrix G

Definition. Define Google's matrix by

G = αS + (1 − α)J,

where J_{ij} = 1/n.

• J is called the teleportation matrix.

• 1 − α is called the teleportation parameter.

Page 13

Expander

Recall: if G is a graph with λ = λ(G) < 1, then for A = A_G,

A = (1 − λ)J + λC,

for some C with ‖C‖ ≤ 1. We thus know that Google's matrix is an expander. However, the parameter α is chosen arbitrarily. Of course, α determines the spectral gap of the graph.

Page 14

Properties of G - I

(1) G is stochastic.
It is a convex combination of two stochastic matrices S and J.

(2) G is irreducible.
Because of J, every page is directly connected to every other page.

(3) G is aperiodic.
G_{ii} > 0: every node has a self-loop.

(4) G is primitive.
There exists a k such that G^k > 0, because G is an expander. There is a unique π^T such that

‖pG^l − π^T‖ ≈ 0

for a small l. (Hence the power method works.)

Page 15

Properties of G - II

(5) G is a rank-one update of H:

G = αS + (1 − α)(1/n) e e^T
  = α(H + (1/n) a e^T) + (1 − α)(1/n) e e^T
  = αH + ((α/n) a + ((1 − α)/n) e) e^T.    (6)

• H is sparse.

• (α/n) a + ((1 − α)/n) e is dense, but it is only a single vector (a rank-one term).

(6) G is artificial, due to the choice of α.
G may not well reflect the real-world matrix H.

Page 16

Computation of πT

Power method:

π^{(k+1)T} = π^{(k)T} G
           = α π^{(k)T} S + ((1 − α)/n) π^{(k)T} e e^T
           = α π^{(k)T} H + (α π^{(k)T} a + (1 − α)) e^T/n.    (7)

Suppose that 1, λ_2, · · · , λ_n are the eigenvalues of G with 1 > |λ_2| ≥ · · · ≥ |λ_n|. Then

G = G_1 + λ_2 G_2 + · · · + λ_n G_n,

where

– G_i² = G_i,

– for i ≠ j, G_i G_j = 0.

Then

G^l = G_1 + λ_2^l G_2 + · · · + λ_n^l G_n.

Since |λ_2| < 1, G^l quickly converges to G_1. Furthermore, for any probability vector p, pG^l converges to π^T.
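The rank-one update in property (5) is what makes this practical: one multiplies by the sparse H only and adds a scalar correction. A Python sketch of Equation (7) follows; the toy matrix, α = 0.85, and the stopping rule are illustrative choices, not from the slides.

```python
import numpy as np

def pagerank_power(H, a, alpha=0.85, tol=1e-12):
    """Iterate Equation (7). H is the row-substochastic matrix of
    Equation (3); a is the 0/1 dangling-node vector of Equation (5)."""
    n = H.shape[0]
    pi = np.full(n, 1.0 / n)   # pi^(0)T = (1/n) e^T
    while True:
        # alpha * pi H  +  (alpha * pi a + (1 - alpha)) e^T / n
        new = alpha * (pi @ H) + (alpha * (pi @ a) + (1 - alpha)) / n
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new

# Toy example: node 1 (0-indexed) is dangling.
H = np.array([[0.0, 0.5, 0.5],
              [0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
a = np.array([0.0, 1.0, 0.0])
print(pagerank_power(H, a))
```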

Page 17

λ(G)

Lemma. For the Google matrix G = αS + (1 − α)J,

|λ_2(G)| ≤ α.

Page 18

λ(G) again

Lemma. If the spectrum of the stochastic matrix S is {1, λ_2, · · · , λ_n}, then the spectrum of the Google matrix G = αS + (1 − α)ev^T is

{1, αλ_2, · · · , αλ_n},

where v^T is the personalised vector.
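The lemma is easy to check numerically. The sketch below (random S, random v, and α = 0.85 are all arbitrary choices) compares the eigenvalues of G with α times the eigenvalues of S; apart from the single eigenvalue 1, the two lists agree.

```python
import numpy as np

rng = np.random.default_rng(0)
n, alpha = 5, 0.85
S = rng.random((n, n)); S /= S.sum(axis=1, keepdims=True)  # row-stochastic S
v = rng.random(n); v /= v.sum()                            # personalised vector
G = alpha * S + (1 - alpha) * np.outer(np.ones(n), v)      # G = aS + (1-a) e v^T

print(np.sort_complex(np.linalg.eigvals(G)))          # {1} plus alpha*spec(S)
print(alpha * np.sort_complex(np.linalg.eigvals(S)))  # compare the rest
```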

Page 19

Proofs - I

Since S is stochastic, (1, e) is an eigenpair of S. Let Q = (e X) be a nonsingular matrix that has the eigenvector e as its first column. Set

Q^{-1} = [ y^T ; Y^T ].    (8)

Then:

Q^{-1} Q = [ y^T e, y^T X ; Y^T e, Y^T X ] = [ 1, 0 ; 0, I ].    (9)

Page 20

Proofs - II

Similarly,

Q^{-1} S Q = [ y^T S e, y^T S X ; Y^T S e, Y^T S X ] = [ 1, y^T S X ; 0, Y^T S X ].    (10)

This implies that Y^T S X contains the remaining eigenvalues of S, i.e., λ_2, · · · , λ_n. In addition,

Q^{-1} G Q = [ 1, α y^T S X + (1 − α) v^T X ; 0, α Y^T S X ].    (11)

The eigenvalues of G are

1, αλ_2, · · · , αλ_n.

Since |λ_2| ≤ 1, |αλ_2| ≤ α.

Page 21

The role of α

G = (1 − α)J + αS.

If α is small, then 1 − α is large, and G is basically an artificial random graph, failing to reflect the real-world matrix S. If α is large (approaching 1), then

• the stationary distribution may fail to be unique

• even if there is a stationary distribution, it is hard to compute

• the power method fails

Google's choice: α = 0.85.

Page 22

Personalised PageRank

For a personalised probability vector v^T,

G = αS + (1 − α)ev^T.

The power method works as before. The stationary distribution is the personalised PageRank.

Significance: real applications.

Page 23

The stationary distribution

Theorem. The PageRank π^T(α) of G_α is

π^T(α) = (1/Σ_{i=1}^n D_i(α)) · (D_1(α), D_2(α), · · · , D_n(α)),

where D_i(α) is the i-th principal minor determinant of order n − 1 in I − G_α. Furthermore, every D_i(α) is differentiable in α.

Proof. By definition.
Page 24

Differential

Theorem. If π^T(α) = (π_1(α), π_2(α), · · · , π_n(α)), then

1. For each j,

|dπ_j(α)/dα| ≤ 1/(1 − α).

2.

‖dπ^T(α)/dα‖_1 ≤ 2/(1 − α).

• If α is small, then the PageRank π^T(α) is not sensitive.

• If α is large, then the upper bounds 1/(1 − α) and 2/(1 − α) both approach infinity.

Page 25

Representation

Theorem.

dπ^T(α)/dα = −v^T(I − S)(I − αS)^{-2}.
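A sketch of the computation, assuming the standard closed form π^T(α) = (1 − α)v^T(I − αS)^{-1} for the personalised PageRank (this identity is not derived on these slides). Differentiating, with d/dα (I − αS)^{-1} = (I − αS)^{-1} S (I − αS)^{-1}:

```latex
\frac{d\pi^{T}(\alpha)}{d\alpha}
  = -v^{T}(I-\alpha S)^{-1}
    + (1-\alpha)\,v^{T}(I-\alpha S)^{-1} S\,(I-\alpha S)^{-1}
  = v^{T}\bigl[-(I-\alpha S) + (1-\alpha)S\bigr](I-\alpha S)^{-2}
  = -v^{T}(I-S)(I-\alpha S)^{-2}.
```

The last two steps use −(I − αS) + (1 − α)S = −(I − S) and the fact that I − S commutes with (I − αS)^{-1}.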

Page 26

Sensitive to H

1.

dπ^T(h_{ij})/dh_{ij} = α π_i (e_j^T − v^T)(I − αS)^{-1}.

2.

(I − αS)^{-1} → ∞

as α goes to 1. Therefore, if α ≈ 1, then π^T is sensitive to small changes of the matrix H.
Page 27

Sensitive to vT

dπ^T(v^T)/dv^T = (1 − α + α Σ_{i∈D} π_i)(I − αS)^{-1},

where D is the set of nodes that have no outgoing links. As before, as α goes to 1, (I − αS)^{-1} goes to ∞.

Page 28

Summary of sensitivity

If α ≈ 1, then

1. computing π^T(α) is hard, since the power method fails;

2. π^T(α) is sensitive to perturbations of H;

3. π^T(α) is sensitive to the personalised vector v^T.

Google's tradeoff:

α = 0.85

Page 29

Proof of upper bounds - I

Theorem. If π^T(α) = (π_1(α), π_2(α), · · · , π_n(α)), then

1. For each j,

|dπ_j(α)/dα| ≤ 1/(1 − α).

2.

‖dπ^T(α)/dα‖_1 ≤ 2/(1 − α).

π^T(α) is a probability vector, so

Σ_{i=1}^n π_i(α) = 1,

giving

π^T(α)e = 1, where e^T = (1, 1, · · · , 1).
Page 30

Proof of upper bounds - II

By definition, π^T(α) = π^T(α)G(α) = π^T(α)(αS + (1 − α)ev^T). Differentiating,

dπ^T(α)/dα = π^T(α)(S − ev^T)(I − αS)^{-1}.    (12)

For (1): for every real vector x with x^T ⊥ e, i.e., Σ x_i = 0, and for every real column vector y,

|x^T y| = |Σ_{i=1}^n x_i y_i| ≤ ‖x^T‖_1 · (y_max − y_min)/2.    (13)

By Equation (12),

dπ_j(α)/dα = π^T(α)(S − ev^T)(I − αS)^{-1} e_j.

Page 31

Proof of upper bounds - III

Since π^T(α)(S − ev^T)e = 0, set x^T = π^T(α)(S − ev^T) and y = (I − αS)^{-1}e_j. By Inequality (13),

|dπ_j(α)/dα| ≤ ‖π^T(α)(S − ev^T)‖_1 · (y_max − y_min)/2.

Since ‖π^T(α)(S − ev^T)‖_1 ≤ 2,

|dπ_j(α)/dα| ≤ y_max − y_min.

Since (I − αS)^{-1} ≥ 0, we have y_min ≥ 0. Since (I − αS)e = (1 − α)e, we get (I − αS)^{-1}e = (1 − α)^{-1}e, i.e., every row of (I − αS)^{-1} sums to (1 − α)^{-1}. Hence

y_max ≤ max_{i,j} [(I − αS)^{-1}]_{ij} ≤ 1/(1 − α),

and (1) follows.

Page 32

Proof of upper bounds - IV

For (2):

‖dπ^T(α)/dα‖_1 = ‖π^T(α)(S − ev^T)(I − αS)^{-1}‖_1
              ≤ ‖π^T(α)(S − ev^T)‖_1 · ‖(I − αS)^{-1}‖_∞
              ≤ 2 · 1/(1 − α) = 2/(1 − α).    (14)

Page 33

Conductance

Given a graph G = (V, E) and S ⊂ V, the conductance of S in G is

Φ(S) = |E(S, S̄)| / min{vol(S), vol(S̄)}.

The conductance of G is

Φ = min{Φ(S) : |S| ≤ n/2}.

Page 34

Push(u)

Andersen, Chung and Lang, FOCS, 2006. Define an operator Push(u):

1. p(u) ← p(u) + α r(u)

2. r(u) ← (1 − α) r(u)/2

3. For each v with v ∼ u, set

r(v) ← r(v) + (1 − α) r(u)/(2 d(u)).

Page 35

Approximate PageRank

Given a node v:

1. Set p = 0, r(v) = 1, and r(u) = 0 for all u ≠ v.

2. While there is a u with r(u) ≥ ε d(u), apply Push(u).

3. Otherwise, output p and r.
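A sketch of Push(u) and the approximate-PageRank loop together, for an undirected graph. Here α plays the teleportation role, as in the slides' operator; the queue-based scheduling, ε, and the example graph are implementation choices, not from the slides.

```python
from collections import deque

def approximate_pagerank(adj, v, alpha=0.15, eps=1e-4):
    p = {u: 0.0 for u in adj}
    r = {u: 0.0 for u in adj}
    r[v] = 1.0
    queue = deque([v])                  # candidates with r(u) >= eps * d(u)
    while queue:
        u = queue.popleft()
        if r[u] < eps * len(adj[u]):
            continue
        ru = r[u]
        p[u] += alpha * ru              # 1. p(u) <- p(u) + alpha r(u)
        r[u] = (1 - alpha) * ru / 2     # 2. r(u) <- (1 - alpha) r(u) / 2
        for w in adj[u]:                # 3. r(v) <- r(v) + (1-a) r(u)/(2 d(u))
            r[w] += (1 - alpha) * ru / (2 * len(adj[u]))
            if r[w] >= eps * len(adj[w]):
                queue.append(w)
        if r[u] >= eps * len(adj[u]):
            queue.append(u)
    return p, r

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
p, r = approximate_pagerank(adj, 0)
print(sorted(p.items(), key=lambda t: -t[1]))
```

Each push moves α·r(u) ≥ α·ε·d(u) of residual mass into p, and the total residual is at most 1, so the loop terminates.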

Page 36

ACL local algorithm

1. Compute the approximate PageRank from a given input vertex v.

2. Rank the pages in decreasing order of the normalised PageRank p_v/d(v). Suppose that v_1, v_2, · · · , v_l is listed such that

p_{v_1}/d(v_1) ≥ p_{v_2}/d(v_2) ≥ · · · ≥ p_{v_l}/d(v_l).

3. (Pruning) Take an initial segment of the list as a community associated with the given input v: let j be such that

Φ(X_j) = min{Φ(X_i) | 1 ≤ i ≤ l},

where Φ(X) is the conductance of X in G and X_k = {v_1, · · · , v_k}; small conductance means few edges leave the set. Output X_j (a sketch of this sweep appears below).

Page 37

Question for local algorithm

For every query Q, we rank the set of answers for the query by PageRank; however, the resulting list is far too long. The question is to determine a short list of ranks as the output of the query. Still open.

Page 38

The great idea

• The PageRank thesis

• The teleportation parameter 1 − α. This is a great idea, which may be used in many other areas, such as learning and data processing. The essence of the idea is to make sure that the ranking matrix is a well-defined stochastic process, so that PageRank exists and can be computed. We may also regard the introduction of 1 − α as amplifying noise, playing a role similar to that of noise in error-correcting codes.

• Google’s success: Making big money by randomness

Page 39

A grand challenge

• What is the principle for determining α? Is there a metric ofnetworks which determines the optimum α?

• What are principles for structuring the unstructured andnoisy data?

• Making money by connection and interaction???

Page 40

Reference

1. Amy N. Langville and Carl D. Meyer, Google's PageRank and Beyond: The Science of Search Engine Ranking, Princeton University Press, 2006.

2. R. Andersen, F. Chung, and K. Lang, Local graph partitioning using PageRank vectors, FOCS, 2006.

Page 41

Natural rank

The natural rank based on the structural information theory isthe answer.

Page 42

Exercise 1

Let X_1, · · · , X_n be independent random variables such that X_i is equal to 1 with probability 1 − δ and equal to 0 with probability δ. Let X = Σ_{i=1}^n X_i (mod 2). Prove that

Pr[X = 1] = 1/2 + (1 − 2δ)^n/2, if n is odd,
Pr[X = 1] = 1/2 − (1 − 2δ)^n/2, if n is even.

Significance?

Page 43

Exercise 1 - proof 1

Let Y_i = (−1)^{X_i} and Y = Π_{i=1}^n Y_i. Assume n is odd, and let Pr[X = 1] = α. Then

E[Y] = Pr[X = 0] − Pr[X = 1] = 1 − 2α.

Since the X_i, and hence the Y_i, are independent,

E[Y] = (−1 + 2δ)^n = −(1 − 2δ)^n    (n odd).

Therefore −(1 − 2δ)^n = 1 − 2α, giving

α = 1/2 + (1 − 2δ)^n/2.

Page 44

Exercise 1 - proof 2

Proof.

(1 − 2δ)^n = ((1 − δ) − δ)^n = Σ_{i=0}^n C(n, i) (1 − δ)^i (−δ)^{n−i}.

Case 1: n odd. Grouping the terms by the parity of i,

(1 − 2δ)^n = Pr[X = 1] − Pr[X = 0], so Pr[X = 1] = 1/2 + (1/2)(1 − 2δ)^n.

Case 2: n even. Then (1 − 2δ)^n = Pr[X = 0] − Pr[X = 1], so

Pr[X = 1] = 1/2 − (1/2)(1 − 2δ)^n.

Page 45

Exercise 2

Prove that if there exists a δ-density distribution H such that Pr_{x ∈_R H}[C(x) = f(x)] ≤ 1/2 + ε for every circuit C of size at most s, with s ≤ √(ε²δ2^n/100), then there exists a subset I ⊆ {0,1}^n of size at least (δ/2)2^n such that

Pr_{x ∈_R I}[C(x) = f(x)] ≤ 1/2 + 2ε

for every circuit C of size at most s.

Page 46

Exercise 2 - Proof

Some problems? Leave this to Mingji.

Page 47

Exercise 3

1. Let f : F → F be any function. Suppose integer d ≥ 0 and ε > 2√(d/|F|). Prove that there are at most 2/ε degree-d polynomials that agree with f on at least an ε fraction of its coordinates. Significance?

2. Prove that if Q(x, y) is a bivariate polynomial over some field F and P(x) is a univariate polynomial over F such that Q(x, P(x)) is the zero polynomial, then Q(x, y) = (y − P(x))A(x, y) for some polynomial A(x, y).

Page 48

Exercise 3 - proof 1

Suppose that P_1, P_2, · · · , P_l are all the degree-d polynomials that agree with f on at least an ε fraction of coordinates. For each i, define a vector v_i by

v_i(j) = 1 if P_i(j) = f(j), and 0 otherwise,

for every j ∈ F. Then for every i,

‖v_i‖_1 ≥ ε · m,

where m = |F|, and

εm ≤ ⟨v_i, v_i⟩ ≤ m.

Page 49

Exercise 3 - proof 2

Set

v = Σ_{i=1}^l v_i.

Then

⟨v, v⟩ = Σ_{i=1}^l ⟨v_i, v_i⟩ + Σ_{i≠j} ⟨v_i, v_j⟩ ≤ l·m + (l² − l)d,

since two distinct degree-d polynomials agree on at most d points, so ⟨v_i, v_j⟩ ≤ d for i ≠ j.

Page 50

Exercise 3 - proof 3

And, by the Cauchy–Schwarz inequality,

⟨v, v⟩ = Σ_{k∈F} (v(k))²
       = Σ_k (Σ_{i=1}^l v_i(k))²
       ≥ (Σ_k Σ_i v_i(k))²/m
       = (Σ_i Σ_k v_i(k))²/m
       ≥ (lεm)²/m.

Page 51

Exercise 3 - proof 4

Combining the two bounds gives l²ε²m ≤ lm + (l² − l)d; dividing by lm and rearranging,

l ≤ (1 − d/m)/(ε² − d/m)

for ε > √(d/m).

Page 52

Exercise 3 - proof 5

For ε > 2√(d/m), we show l ≤ 2/ε. By inclusion–exclusion, the fraction of coordinates covered by the agreement sets is at least

ε + (ε − d/m) + · · · + (ε − (l − 1)d/m),

provided ε − (l − 1)d/m ≥ 0, and this fraction is at most 1. Solving this, we have

l ≤ 2/ε.

Page 53

Exercise 3 - proof 6 (part 2)

Regard Q(x, y) as a polynomial in y with coefficients that are polynomials in x. Divide Q(x, y) by y − P(x), which is linear (and monic) in the variable y, giving

Q(x, y) = (y − P(x))A(x, y) + R(x).

By the assumption,

Q(x, P(x)) = R(x) ≡ 0.

Page 54

Exercise 4

Linear codes. We say that an ECC E : {0,1}^n → {0,1}^m is linear if for every x, x′ ∈ {0,1}^n, E(x + x′) = E(x) + E(x′) (componentwise addition modulo 2). A linear ECC can be seen as an m × n matrix A such that E(x) = Ax, thinking of x as a column vector.

1. Prove that the distance of a linear ECC is equal to the minimum, over all nonzero x ∈ {0,1}^n, of the fraction of 1s in E(x).

2. Prove that for every δ > 0, there exists a linear ECC E : {0,1}^n → {0,1}^m for m = Ω(n)/(1 − H(δ)) with distance δ.

3. Prove that for some δ > 0, there is an ECC E : {0,1}^n → {0,1}^{poly(n)} of distance δ with polynomial-time encoding and decoding algorithms.

Page 55

Exercise 4 - proof 1

Let A be an m × n 0-1 matrix which defines a linear ECC. The distance of A is

δ = min_{x ≠ x′} (1/m) · |{i | y_i ≠ y′_i}|,

where

y_i = a_{i,1}x_1 + a_{i,2}x_2 + · · · + a_{i,n}x_n

and

y′_i = a_{i,1}x′_1 + a_{i,2}x′_2 + · · · + a_{i,n}x′_n.

By linearity, y − y′ = E(x − x′) with x − x′ ≠ 0, so this is

δ = min_{x ≠ 0} (1/m) · |{i | y_i = 1}|.

Page 56

Exercise 4 - proof 2

Remove the condition that E is linear. Given a vector y ∈ {0,1}^m, define the δ-ball of y, denoted B^δ_y, to be the set of vectors z ∈ {0,1}^m such that the distance between y and z is less than δ. Then

|B^δ_y| ≤ C(m, δ·m) = o(1) · 2^{H(δ)·m}.

In increasing order, for each x ∈ {0,1}^n, we define E(x) to be a y ∈ {0,1}^m such that B^δ_y is disjoint from all the δ-balls associated with the codewords of x′ < x. Suppose that m ≥ n/(1 − H(δ)). Then the definition above never gets stuck, since {0,1}^m has room for at least 2^n disjoint δ-balls.

Page 57

Exercise 4 - proof 3

Consider now the linear ECC. Each linear ECC is given by an m × n matrix A. Two approaches:

Case 1. Consider a random matrix A. With nonzero probability, A is such an ECC.

Case 2. Count the number of linear ECCs that have distance < δ.

Page 58

Exercise 4 - proof 4

Consider the first approach. Let A be a random m × n matrix. We say that x = (x_1, x_2, · · · , x_n) is a witness showing that A has distance < δ if x ≠ 0 and there are < δm many j such that y_j = 1, where

y_j = a_{j1}x_1 + · · · + a_{jn}x_n.

For each j, define

Y_j = 1 if y_j = 1, and 0 otherwise.

Let

Y = Σ_{j=1}^m Y_j.

Page 59

Exercise 4 - proof 5

Then, for any fixed x ≠ 0 and each j,

E[Y_j] = 1/2, so E[Y] = μ = m/2.

Clearly, all the Y_j are independent. By the Chernoff bound, for ε = 1 − 2δ,

Pr[Y < δm] = Pr[Y < (1 − ε)μ] ≤ [e^{−ε}/(1 − ε)^{(1−ε)}]^{m/2} ≤ 2^{−c·m}

for some constant c.

Page 60

Exercise 4 - proof 6

By the union bound over the 2^n − 1 possible witnesses, the probability that A has a witness for distance < δ is at most

2^{−(c·m − n)},

which is ≈ 0 if

m = Ω(n).

Page 61

Exercise 4 - proof 7

Consider the Reed–Solomon code

RS : F^n → F^m.

It is an ECC with distance δ_1 = 1 − n/m. For every x = (a_0, a_1, · · · , a_{n−1}) ∈ F^n,

RS(x) = (z_0, z_1, · · · , z_{m−1}),

where

z_j = Σ_{i=0}^{n−1} a_i j^i,   j ∈ F.
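A sketch of the evaluation map z_j = Σ_i a_i j^i, written over a prime field F_p for simplicity; the slides work over a general finite field.

```python
def rs_encode(a, p):
    """Encode a = (a_0, ..., a_{n-1}) in F_p^n as the evaluations of
    the polynomial sum_i a_i x^i at every point j of F_p."""
    return [sum(ai * pow(j, i, p) for i, ai in enumerate(a)) % p
            for j in range(p)]

# Two distinct degree-(n-1) polynomials agree on at most n-1 points,
# which is where the distance about 1 - n/m comes from.
print(rs_encode([1, 2, 3], p=7))
```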

Page 62

Exercise 4 - proof 8

Let |F| = 2^k. Then each element f ∈ F is interpreted as an element of GF(2^k). For x ∈ F^n, we interpret it as an element of {0,1}^{k·n}. We encode RS(x) by

WH(z_0), WH(z_1), · · · , WH(z_{m−1}),

where WH is the Walsh–Hadamard code. This is an ECC from {0,1}^{k·n} to {0,1}^{m·2^k}. Choose k such that m · 2^k is a polynomial in k · n.

Page 63

Exercise 5

1. Recall the spectral norm of a matrix A, written ‖A‖, is the maximum of ‖Av‖_2 over unit vectors v. Let A be symmetric stochastic, i.e., A = A^T, and every row and column of A has nonnegative entries summing up to 1. Prove that ‖A‖ ≤ 1.

2. Let A, B be symmetric stochastic matrices. Prove that λ(A + B) ≤ λ(A) + λ(B).

3. Let A, B be two n × n matrices.
(a) Prove that ‖A + B‖ ≤ ‖A‖ + ‖B‖.
(b) Prove that ‖AB‖ ≤ ‖A‖ · ‖B‖.

Page 64

Exercise 5 - proof

For 1. First,

‖A‖ ≤ n².

Second, for every such A:

• A² is a symmetric stochastic matrix;

• ‖A²‖ ≥ ‖A‖²;

• hence, if there were an A with ‖A‖ = 1 + α for some α > 0, repeated squaring would produce a symmetric stochastic matrix B with ‖B‖ unbounded, contradicting the first bound.

Page 65

Exercise 6

Let G be an (n, d, λ)-expander graph, and let B be a set of vertices of size at most βn for 0 < β < 1. Let X_1, X_2, · · · , X_k be a random walk of k steps in G, starting from a vertex X_1 chosen uniformly at random.

1. Prove that for every subset I ⊆ [k],

Pr[(∀i ∈ I)[X_i ∈ B]] ≤ ((1 − λ)√β + λ)^{|I|−1}.

2. Conclude that if |B| < n/100 and λ < 1/100, then the probability that there exists a subset I ⊆ [k] such that |I| > k/10 and X_i ∈ B for all i ∈ I is at most 2^{−k/100}.

3. Show how to convert every BPP algorithm that uses m coins and decides a language L with probability 0.99 into an algorithm B that uses m + O(k) coins and decides the language L with probability 1 − 2^{−k}.

Page 66

Exercise 6: Proof - I

For each i, 1 ≤ i ≤ k, let B_i be the event X_i ∈ B. For I ⊆ [k], let I = {j_1 < j_2 < · · · < j_i}. Then:

Pr[∧_{i∈I} B_i] = Pr[B_{j_1}] · Pr[B_{j_2} | B_{j_1}] · · · · · Pr[B_{j_i} | B_{j_1}, · · · , B_{j_{i−1}}].    (15)

Define B̂ to be the linear transformation from R^n to R^n that keeps the values indexed by B. That is, for u = (u_1, u_2, · · · , u_n), define

(B̂u)_i = u_i if i ∈ B, and 0 otherwise.

Page 67

Exercise 6: Proof - II

For every probability vector p:

(i) the coordinates of B̂p sum to the probability that a vertex chosen according to p lies in B;

(ii) the normalised B̂p is the distribution of p conditioned on the event that the vertex is in B.

Page 68

Exercise 6: Proof - III

Let p^i be the distribution of X_{j_i} conditioned on the events B_{j_1}, · · · , B_{j_i}, and write 1 for the uniform initial distribution. Then:

p^1 = (1/Pr[B_{j_1}]) · B̂1,

p^2 = (1/(Pr[B_{j_2} | B_{j_1}] Pr[B_{j_1}])) · B̂AB̂1,

p^i = (1/(Pr[B_{j_i} | B_{j_{i−1}}, · · · , B_{j_1}] · · · Pr[B_{j_1}])) · (B̂A)^{i−1}B̂1.

Hence,

Pr[B_{j_1}] · · · Pr[B_{j_i} | B_{j_{i−1}}, · · · , B_{j_1}] · p^i = (B̂A)^{i−1}B̂1.

Page 69

Exercise 6: Proof - IV

Pr[∧_{j∈I} B_j] = Pr[B_{j_1}] · · · Pr[B_{j_i} | B_{j_{i−1}}, · · · , B_{j_1}] = ‖(B̂A)^{i−1}B̂1‖_1.

Let A = (1 − λ)J + λC for some C with ‖C‖ ≤ 1. Then B̂A = (1 − λ)B̂J + λB̂C. Noting:

(i) ‖B̂1‖_2 ≤ √β ‖1‖_2;

(ii) ‖B̂J‖ ≤ √β, ‖B̂‖ ≤ 1, ‖B̂C‖ ≤ 1;

(iii) ‖B̂A‖ ≤ (1 − λ)√β + λ.

Therefore,

‖(B̂A)^{i−1}B̂1‖_1 ≤ ‖(B̂A)^{i−1}B̂1‖_2 · √n ≤ ((1 − λ)√β + λ)^{i−1}.    (16)

Page 70

Exercise 7

(1) Give a probabilistic polynomial-time algorithm that, given a 3CNF formula φ with exactly three distinct variables in each clause, outputs an assignment satisfying at least a 7/8 fraction of φ's clauses.

(2) Give a deterministic polynomial-time algorithm with the same approximation guarantee as in (1) above.

(3) Show a polynomial-time algorithm that, given a satisfiable 2CSP instance φ over a binary alphabet with m clauses, outputs a satisfying assignment for φ.

(4) Show a deterministic poly(n, 2^q)-time algorithm that, given a qCSP instance φ over a binary alphabet with m clauses, outputs an assignment satisfying m/2^q of the constraints of φ.

Page 71

Exercise 7 - proof

Easy

Page 72

Exercise 8

(5) Suppose that G = (V, E) is an (n, d, λ)-expander. Show that for any S ⊂ V of size ≤ n/2, the following holds:

Pr_{(u,v)∈_R E}[u ∈ S ∧ v ∈ S] ≤ (|S|/n)(1/2 + λ/2).

Page 73

Exercise 8 - proof- 1

Pr_{(u,v)∈_R E}[u ∈ S ∧ v ∈ S] = Pr[u ∈ S] · Pr[v ∈ S | u ∈ S].

Clearly,

Pr[u ∈ S] = s/n,

where s = |S|. Recall the expander mixing lemma: for any X and Y,

|e(X, Y) − vol(X) · vol(Y)/vol(G)| ≤ λ √(vol(X) · vol(Y)).

Page 74

Exercise 8 - proof -2

For X = Y = S, using the lemma,

Pr[v ∈ S | u ∈ S] ≤ (1/2)(1 + λ).