Significant scales in community structure

Post on 19-Jun-2015

Presentation at ECCS 2013, Barcelona, September 17, 2013

V.A. Traag (1,2), G. Krings (3), P. Van Dooren (4)

(1) KITLV, Leiden, the Netherlands

(2) e-Humanities, KNAW, Amsterdam, the Netherlands

(3) Real Impact, Brussels, Belgium

(4) UCL, Louvain-la-Neuve, Belgium

September 17, 2013

Community Detection

Constant Potts Model (CPM)

• Minimize $H(\gamma) = -\sum_{ij} (A_{ij} - \gamma)\,\delta(\sigma_i, \sigma_j) = -\sum_c (e_c - \gamma n_c^2)$

• Resolution-limit-free

• Internal density $p_c > \gamma$

• Density between communities $p_{cd} < \gamma$
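The community form of the Hamiltonian can be evaluated directly from an edge list. The following is a minimal sketch, assuming an undirected simple graph, with e_c taken as the number of edges inside community c and n_c as its number of nodes. The function name `cpm_hamiltonian` and the toy graph are illustrative, not part of the talk.

```python
from collections import Counter

def cpm_hamiltonian(edges, membership, gamma):
    """Community form from the slide: H = -sum_c (e_c - gamma * n_c**2).

    Assumes an undirected simple graph given as a list of (u, v) pairs
    and a dict mapping each node to its community label.
    """
    n_c = Counter(membership.values())  # nodes per community
    e_c = Counter()                     # internal edges per community
    for u, v in edges:
        if membership[u] == membership[v]:
            e_c[membership[u]] += 1
    return -sum(e_c[c] - gamma * n_c[c] ** 2 for c in n_c)

# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {v: "all" for v in range(6)}

h_split = cpm_hamiltonian(edges, split, gamma=0.2)
h_merged = cpm_hamiltonian(edges, merged, gamma=0.2)
print(h_split, h_merged)
```

At γ = 0.2 the two-triangle split has the lower H (about −2.4 against 0.2 for the merged partition), so minimizing H prefers the split at this resolution.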

How to choose γ?

Resolution profile

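[Figure: resolution profile. Horizontal axis: γ, log scale from 10^-3 to 10^0. Vertical axis: log scale from 10^3 to 10^6. Curves: N and E.]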

Significance

How significant is a partition?

Significance

[Figure: three panels labelled E = 14; E = 9 (fixed partition); E = 11 (better partition).]

• Not: the probability of finding E edges within a given partition.

• But: the probability of finding a partition with E edges.

Subgraph probability

Decompose partition

• Probability of finding a partition with E edges.

• Probability of finding communities with $e_c$ edges.

• Asymptotic estimate

• Probability for a subgraph of $n_c$ nodes with density $p_c$:

$\Pr(S(n_c, p_c) \subseteq G(n, p)) \approx \exp\left[-n_c^2 D(p_c \parallel p)\right]$

Significance

• Probability for all communities: $\Pr(\sigma) \approx \prod_c \exp\left[-n_c^2 D(p_c \parallel p)\right]$.

• Significance: $S(\sigma) = -\log \Pr(\sigma) = \sum_c n_c^2 D(p_c \parallel p)$.
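The significance formula can be evaluated from community sizes and internal edge counts alone. The sketch below rests on assumptions the slide does not state: D is taken to be the Kullback-Leibler divergence between two Bernoulli distributions, densities are taken as edges divided by the number of node pairs, and singleton communities are skipped. The function names are illustrative.

```python
import math

def binary_kl(q, p):
    """D(q || p) between Bernoulli(q) and Bernoulli(p), with 0 * log(0) = 0.

    Assumes 0 < p < 1 and 0 <= q <= 1.
    """
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total

def significance(sizes, internal_edges, n, m):
    """S = sum_c n_c**2 * D(p_c || p), as written on the slide.

    sizes[i] and internal_edges[i] describe community i; n and m are the
    node and edge counts of the whole graph. Densities use the number of
    node pairs (assumption): p_c = e_c / C(n_c, 2), p = m / C(n, 2).
    """
    p = m / (n * (n - 1) / 2)
    s = 0.0
    for n_c, e_c in zip(sizes, internal_edges):
        if n_c < 2:
            continue  # a singleton has no pairs; treated as contributing 0 (assumption)
        p_c = e_c / (n_c * (n_c - 1) / 2)
        s += n_c ** 2 * binary_kl(p_c, p)
    return s

# Toy graph (hypothetical): two 4-cliques joined by one edge, n = 8, m = 13.
s_split = significance([4, 4], [6, 6], n=8, m=13)  # each clique is a community
s_one = significance([8], [13], n=8, m=13)         # everything in one community
print(s_split, s_one)
```

Putting all nodes in one community gives p_c = p, so D vanishes and the significance is zero. The clique partition is denser than the graph as a whole and scores higher.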

Significance

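[Figure: significance across the resolution profile. Horizontal axis: γ, log scale from 10^-3 to 10^0. Vertical axis: log scale from 10^3 to 10^6. Curves: N, E and S.]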
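One natural use of a significance curve over γ is to keep the resolution at which significance peaks. The sketch below illustrates that idea on a toy graph, under stated assumptions: exhaustive enumeration of all partitions stands in for a real optimiser (feasible only for a handful of nodes, and not the efficient scan the talk refers to), D is taken as the Bernoulli Kullback-Leibler divergence, densities use node pairs, and singletons contribute nothing. The graph, the γ grid and all names are illustrative.

```python
import math
from collections import Counter

# Toy graph (hypothetical): two triangles joined by a single edge.
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
N = 6

def partitions(n):
    """Yield every partition of range(n) as a membership list
    (restricted growth strings, so each partition appears once)."""
    def extend(prefix, k):
        if len(prefix) == n:
            yield list(prefix)
            return
        for label in range(k + 1):
            prefix.append(label)
            yield from extend(prefix, max(k, label + 1))
            prefix.pop()
    yield from extend([], 0)

def community_stats(membership):
    """Return (nodes per community, internal edges per community)."""
    n_c = Counter(membership)
    e_c = Counter()
    for u, v in EDGES:
        if membership[u] == membership[v]:
            e_c[membership[u]] += 1
    return n_c, e_c

def cpm(membership, gamma):
    """H = -sum_c (e_c - gamma * n_c**2)."""
    n_c, e_c = community_stats(membership)
    return -sum(e_c[c] - gamma * n_c[c] ** 2 for c in n_c)

def binary_kl(q, p):
    """Bernoulli KL divergence with 0 * log(0) = 0 (assumed form of D)."""
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total

def significance(membership):
    """S = sum_c n_c**2 * D(p_c || p); singletons skipped (assumption)."""
    n_c, e_c = community_stats(membership)
    p = len(EDGES) / (N * (N - 1) / 2)
    total = 0.0
    for c, size in n_c.items():
        if size < 2:
            continue
        p_c = e_c[c] / (size * (size - 1) / 2)
        total += size ** 2 * binary_kl(p_c, p)
    return total

all_partitions = list(partitions(N))
profile = []
for gamma in (0.02, 0.2, 0.8):
    # Brute-force CPM optimum at this resolution.
    best = min(all_partitions, key=lambda part: cpm(part, gamma))
    profile.append((gamma, best, significance(best)))

best_gamma, best_partition, best_sig = max(profile, key=lambda row: row[2])
print(best_gamma, best_partition, best_sig)
```

On this graph the lowest γ merges everything, the highest γ isolates every node, and the intermediate γ recovers the two triangles, which is also where significance is largest.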

Benchmark

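[Figure: benchmark results, panel title "n = 5000, Small". Horizontal axis: µ, from 0 to 1. Upper panel: NMI, ticks from 0.25 to 1. Lower panel: S/S∗ from 0 to 1, with curves S∗ and 〈S〉. Methods compared: CPM+Sig, Significance, Modularity, Infomap, OSLOM.]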

Conclusions

• Scan γ efficiently.

• Significance applicable in all methods.

• Correct comparison to random graph.

Traag, Krings, Van Dooren. Significant scales in community structure. arXiv:1306.3398

Thank you! Questions?

e-mail: vincent@traag.net twitter: @vtraag