Scalable Algorithms in the Age of Big Data and Network...

75
Shang-Hua Teng USC Scalable Algorithms in the Age of Big Data and Network Sciences

Transcript of Scalable Algorithms in the Age of Big Data and Network...

Page 1: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Shang-Hua Teng

USC

Scalable Algorithms in the Age of Big

Data and Network Sciences

Page 2: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Asymptotic Complexity

O( f(n) )

Ω( g(n) )

Θ( h(n) )

for problems with massive input

Page 3: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Characterization of Efficient Algorithms

Polynomial Time

O(nc) for a constant c

Page 4: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

• Tera Web pages

• unbounded amount of Web logs

• billions of variables

• billions of transistors.

Big Data and Massive Graphs

Page 5: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

• Tera Web pages

• unbounded amount of Web logs

• billions of variables

• billions of transistors.

Happy Asymptotic World for Theoreticians

Big Data and Massive Graphs

Page 6: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Efficient Algorithms for Big Data

Quadratic time algorithms could be too slow!!!!

Page 7: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Modern Notion of

Algorithmic Efficiency

Page 8: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Therefore, more than ever before, it is not just

desirable, but essential, that efficient

algorithms should be scalable. In other words,

their complexity should be nearly linear or

sub-linear with respect to the problem size.

Thus, scalability — not just polynomial-time

computability — should be elevated as the

central complexity notion for characterizing

efficient computation.

Page 9: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Big Data and Scalable Algorithms

A Practical Match Made in the Digital Age

Page 10: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Big Data and Scalable Algorithms

Page 11: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Big Data and Scalable Algorithms

Page 12: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Big Data and Scalable Algorithms

• Nearly-Linear Time Algorithms

• Sub-Linear Time Algorithms

Page 13: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Algorithmic Paradigms: Scorecard

• Greedy often scalable (limited applications)

• Dynamic Programming usually not scalable (even when applicable)

• Divide-and-Conquer sometimes scalable

• Mathematical Programming rarely scalable

• Branch-and-Bound hardly scalable

• Multilevel Methods mostly scalable (lack of proofs)

• Local Search and Simulated Annealing can be scalable

Page 14: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Examples:

Scalable Geometry Algorithms

Sorting O(n logn)

Nearest neighbors

Delaunay Triangulation/3D convex hull

Fixed Dimensional Linear Programming O(n)

e–net in fixed-dimensional VC space

Page 15: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Examples:

Scalable Graph Algorithms

Breadth-First Search O(|V|+|E|)

Depth-First Search

Shortest Path Tree

Minimum Spanning Tree

Planarity Testing

Bi-connected components

Topological sorting

Sparse matrix vector product

Page 16: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Examples:

Scalable Numerical Algorithms

N-Body simulation O(n)

Sparse matrix vector product

FFT/Multiplication O(n log n)

Multilevel algorithms

Multigrid

Page 17: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

We need more provably-good

scalable algorithms for network

analysis, data mining, and

machine learning (in real-time

applications)

Page 18: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Scalable Methodology: Talk Outline

• Scalable Primitives and Reduction

– The Laplacian Paradigm

• Electrical Flows & Maximum Flows; Spectral Approximation; Tutte’s

embedding and Machine Learning

• Scalable Technologies:

– Spectral Graph Sparsification

• Sparse Newton’s Method and Sampling from Graphical Models

– Computing Without the Whole Data: Local Exploration and

Advanced Sampling

• Significant PageRanks

• Challenges: Computation over Dense/High-Dimensional

Models with Succinct/Sparse Representations• Social Influence; high order clustering;

Page 19: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Scalable Primitives and Reduction

Algorithm Design is like Building a Software Library

Scalable Reduction: Once scalable algorithms are

developed, they can be used as primitives or

subroutines for designing new scalable algorithms.

Page 20: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Laplacian Primitive

Solve A x = b, where A is a weighted Laplacian

matrix

Page 21: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Laplacian Primitive

Solve A x = b, where A is a weighted Laplacian

matrixA is Laplacian matrix: symmetric

non-positive off diagonal

row sums = 0

Isomorphic to weighed graphs

1 2

34

Page 22: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Compute in time

For symmetric diagonally dominant (SDD) A, any b

Scalable Laplacian Solvers

(Spielman-Teng)

Greatly improved by Koutis-Miller-Peng, Kelner-Orecchia-Sidford-

Zhu, …, Lee, Peng, and Spielman, to essentially O(m log (1/ε))

Page 23: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

The Laplacian Paradigm

To apply the Laplacian Paradigm to solve a problem

defined on massive networks or big matrices, we

attempt to reduce the computational and optimization

problem to one or more linear algebraic or spectral

graph-theoretical problems.

Beyond scalable Laplacian solvers

Page 24: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Scalable Tutte’s Embedding

Learning from labeled data on directed graphs

[Zhou-Huang-Schölkopf]

Page 25: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

For Laplacian A, is vector vT1 = 0 such that

Scalable Spectral Approximation

Can find v using inverse power method, in time

Approximate Fiedler Vector

Page 26: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Theorem: Constant degree graph G, Fiedler value λ:

scalable computation of a cut of conductance

Scalable Cheeger Cut

O

Page 27: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Scalable Electrical Flows

Electrical potentials: L φ= χst

in time Õ(m log ε-1)

Page 28: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Undirected Maximum Flow

Previously Best: O(m3/2) [Even-Tarjan 75]

Page 29: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Maximum Flow

(Christiano-Kelner-Mądry-Spielman-Teng)

Iterative Electrical Flows: φ= L-1 χst : Õ(m4/3 ε-3)

Previously Best: Õ(min(m3/2 , m n2/3)) [Goldberg-Rao]

Page 30: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Path to Scalable Maximum Flow

Previously Best: Õ(min(m3/2 , m n2/3)) [Goldberg-Rao]

Iterative Electrical Flows: φ= L-1 χst : Õ(m4/3 ε-3)

Scalable: Sherman; Kelner-Lee-Orecchia-Sidford, Peng

Page 31: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Applications of The Laplacian Paradigm

• Electrical flow computation

• Spectral approximation

• Tutte’s embedding

• Learning from labeled data on a directed graph [Zhou-Huang-Schölkopf]

• Cover time approximation [Ding-Lee-Peres]

• Maximum flows and minimum cuts [Christiano-Kelner-Madry-Spielman-Teng]

• Elliptic finite-element solver [Boman-Hendrickson-Vavasis]

• Rigidity solver [Shklarski-Toledo; Daitch-Spielman]

• Image processing [Koutis-Miller-Tolliver]

• Effective resistances of weighted graphs [Spielman-Srivastava]

• Generation of random spanning trees [Madry-Kelner]

• Generalized lossy flows [Daitch-Spielman]

• Geometric means [Miller-Pachocki]

• …

Page 32: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Scalable Techniques

• Algebraic Formulation of Network Problems

• Spectral Sparsification of Matrices and Networks

• Computing without the Whole Data: Local

Exploration of Networks

Page 33: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

For a graph G (with Laplacian L), a sparsifier is a graph

(with Laplacian ) with at most edges s.t.

Graph Spectral Sparsifiers

Improved by Batson, Spielman, and Srivastava

Page 34: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Sampling From Graphical Models

Joint probability distribution of n-dimensional

random variables x

Graphical model for

local dependencies

Sampling according to the model

Page 35: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Gibbs’ Markov Chain Monte Carlo

Process

Locally resample each variable, conditioned on the values

of its graphical neighbors

• In limit, exact mean and covariance

[Hammersley-Clifford]

• Easy to implement

• Many iterations

• Sequential

Page 36: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

A Holy Grail Sampling Question

Characterization of graphical models that have scalable

parallel sampling algorithms with poly-logarithmic depth?

Page 37: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Gaussian Markov Random Fields

• Precision matrix – symmetric positive definite

• Potential vector

• Goal: Sampling from Gaussian distribution N(Λ-1 h , Λ-1)

Page 38: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

GMRF with H-Precision Matrices

Johnson-Saunderson-Willsky (NIPS 2013)

DΛD is SDD

If the precision matrix Λ is (generalized) diagonally

dominant, then Hogwild Gibbs distributed sampling

process converges

Page 39: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

• Time complexity:

• Parallel complexity:

• Randomness complexity:

Scalable Parallel Gaussian Sampling?

It remains open even if the precision matrix is symmetric diagonally dominant (SDD).

Page 40: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

A Numerical Program for

Gaussian Sampling

1. Find the mean:

m = Λ-1 h

2. Compute an inverse square-root factor:

CCT = Λ-1

3. Sampling:

generate standard Gaussian variables z

x = C z + m

Page 41: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Canonical Inverse Square-Root

= C CT

Page 42: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Canonical Inverse Square-Root

Focus on normalized Laplacian:

I-X

= C CT

Page 43: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Newton’s Method

Page 44: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Newton’s Method

Page 45: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Sparse Newton’s Method

Page 46: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Sparse Newton Chain

Page 47: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Random-Walk Polynomials and

Sparsification

Page 48: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Path Sampling

e

e

Scalable Sparsification of Random-

Walk Polynomials

Page 49: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

• Time complexity:

• Parallel complexity:

• Randomness complexity:

Scalable Parallel Gaussian Sampling

for H-Precision Matrices

Cheng-Cheng-Liu-Peng-Teng (COLT 2015)

Page 50: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Scalable Sparse Newton’s Method

Page 51: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Scalable Matrix Roots

Nick Higham at Brain Davies’ 65 Birthday: An email from a

power company regarding the usage of electricity networks

“I have an Excel spreadsheet containing the transition

matrix of how a company’s [Standard & Poor’s] credit

rating charges from on year to the next. I’d like to be

working in eighths of a year, so the aim is to find the eighth

root of the matrix.”

Page 52: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Family of Scalable Techniques

• Algebraic Formulation of Network Problems

• Spectral Sparsification of Matrices and Networks

• Computing without the Whole Data: Local

Exploration of Networks

Page 53: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Local Network Algorithms

Page 54: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Local Network Algorithms

Page 55: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Local Network Algorithms

Page 56: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Local Network Algorithms

Page 57: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Local Network Algorithms

Page 58: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

PageRank

• PageRank: Stationary Distribution

of the Markov Process:

– Probability 1-a: random walk

on the network

– Probability a: random restarting

– Stationary Distribution:

Page 59: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Significant PageRank without Explore

the Entire Network?

Input: G, 1 ≤ Δ ≤ n, and c>1

Output: Identify a subset S ⊆ V containing:

• all nodes of PageRank at least Δ

• no nodes with PageRank less than Δ/c

O(n/Δ) time algorithm?

Page 60: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Personalized PageRank Matrix

Personalized PageRank

Page 61: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Scalable Local Personalized PageRank

Jed-Widom

Andersen-Chung-Lang

O(dmax/e)

Fogaras-Racz-Csalogany-Sarlos

Borgs-Brautbar-Chayes-Teng

O(log n/(er2))

Page 62: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

An Abstract Problem: Vector Sum

Input: v (an unknown vector from [0,1]n)

1 ≤ Δ ≤ n (a threshold value)

Query Model: ?(v,i,e)

Cost: 1/e

Output: Is sum(v) more than Δ or less than Δ/2?

Question: O(n/Δ) cost algorithm?

Page 63: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Riemann EstimatorBorgs-Brautbar-Chayes-Teng

Page 64: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Scalable Methodology: Talk Outline

• Scalable Primitives and Reduction

– The Laplacian Paradigm

• Electrical Flows & Maximum Flows; Spectral Approximation; Tutte’s

embedding and Machine Learning

• Scalable Technologies:

– Spectral Graph Sparsification

• Sparse Newton’s Method and Sampling from Graphical Models

– Computing Without the Whole Data: Local Exploration and

Advanced Sampling

• Significant PageRanks

Page 65: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

• Computation over dense models with

succinct/sparse representations• High order clustering;

• Computation over high-dimensional models

with succinct/sparse representations• Social Influence;

• Computation over incomplete data• ML

Challenges

Page 66: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Clustering Based on Personalized Page-

Rank Matrix

suspiciousness measure (Hooi-Song-Beutel-Shah-Shin-Faloutsos)

Open Question: scalable 2-approximation?

Page 67: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Reversed Diffusion Structure and

Process

Page 68: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Influence Through Social Networks

Page 69: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Reversed Diffusion Process

• Scalable Influence Maximization

• Borgs, Brautbar, Chayes, Lucier

• Tang, Shi, Xiao

• Scalable Shapley Centrality of Social Influence

• Chen and Teng

Page 70: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Given a vertex v of interest in a massive network

find a provably-good cluster near v

in time O(cluster size)

Local Graph Clustering

Spielman-Teng

Open Question: What other clustering problems can it

be extended to?

High order clustering?

Page 71: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and
Page 72: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and
Page 73: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Big Data and Network Sciences:

Going Beyond Graph Theory

• Set Functions

• Distributions

• Dynamics

• Multilayer Networks

Page 74: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Thanks

• Coauthors: Dan Spielman; Paul Christiano,

Jon Kelner, Aleksander Mądry, Dehua Cheng,

Yu Cheng, Yan Liu, Richard Peng, Christian

Borgs, Michael Brautbar, Jennifer Chayes,

Wei Chen

• Funding: NSF (large) and Simons Foundation

(curiosity-driven investigator award)

Page 75: Scalable Algorithms in the Age of Big Data and Network ...helper.ipam.ucla.edu/publications/bdcws4/bdcws4_15684.pdfShang-Hua Teng USC Scalable Algorithms in the Age of Big Data and

Shang-Hua Teng

USC

Scalable Algorithms for Big Data and

Network Sciences: Characterization,

Primitives, and Techniques