
Simplicial Complexes and Their Applications

Abhineet Agarwal, Noah Goss

February 2019

1 A Review of Simplices

Definition : Let S be a discrete set. An abstract simplicial complex is a collection X of finite subsets of S that is closed under restriction: for every σ ∈ X, every subset of σ is also in X.

Let’s recall some properties of simplicial complexes:

• Every σ ∈ X is called a simplex.

• A k-simplex is a simplex σ ∈ X with |σ| = k + 1.

• The faces of a k-simplex σ are the simplices corresponding to the subsets of σ.
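The closure condition above can be checked mechanically. The following is a minimal sketch, assuming simplices are stored as Python frozensets of vertex labels and that the empty simplex is left out; the helper names `closure` and `is_simplicial_complex` are illustrative and do not come from the lecture.

```python
from itertools import combinations

def closure(generators):
    """Return every non-empty face of the given simplices, as frozensets."""
    faces = set()
    for simplex in generators:
        vertices = sorted(simplex)
        for size in range(1, len(vertices) + 1):
            for face in combinations(vertices, size):
                faces.add(frozenset(face))
    return faces

def is_simplicial_complex(candidate):
    """Closed under restriction: each proper non-empty subset of a simplex is present."""
    candidate = {frozenset(s) for s in candidate}
    return all(
        frozenset(face) in candidate
        for simplex in candidate
        for size in range(1, len(simplex))
        for face in combinations(simplex, size)
    )

# One triangle {a, b, c} glued to one edge {c, d}.
X = closure([{"a", "b", "c"}, {"c", "d"}])
dimension = max(len(s) for s in X) - 1   # a k-simplex has k + 1 vertices
```

Here `X` contains 4 vertices, 4 edges and 1 triangle. Removing the vertex `{"a"}` while keeping the edge `{"a", "b"}` breaks closure, which `is_simplicial_complex` detects.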

Definition : Let u_0, u_1, ..., u_k be k+1 affinely independent points in R^{k+1}, i.e. points such that u_1 − u_0, ..., u_k − u_0 are linearly independent. We can topologize a simplicial complex X as the quotient space built from topological simplices. The standard k-simplex can then be defined as

$$ \Delta^k = \left\{ x_0 u_0 + x_1 u_1 + \dots + x_k u_k \;:\; \sum_{i=0}^{k} x_i = 1,\ x_i \ge 0 \right\} \tag{1} $$

Definition : An independence complex, JO, is an abstract simplicial complex defined on the vertex set O whose k-simplices are collections of k+1 independent objects.

Applications of these independence complexes can be seen in the form of

• Linearly independent vectors

• Linearly independent Differential Equations

• Statistical independence of random variables


2 Vietoris-Rips Complexes and Point Clouds

It will be useful for the rest of this lecture to start by defining the concepts of homotopy equivalence and of convex hulls.

Definition : Two continuous functions from one topological space to another are called homotopic if one can be continuously deformed into the other. Two spaces X and Y are called homotopy equivalent if there exist continuous maps f : X → Y and g : Y → X such that f ◦ g is homotopic to the identity mapping Id_Y and g ◦ f is homotopic to the identity mapping Id_X.

Definition : The convex hull of a set X of points in R^n is the smallest convex set that contains X. Recall that a set is convex if it contains the line segment joining every pair of its points.

Figure 1: In the plane R^2, a very useful way of thinking about the convex hull of a set of points is as a rubber band wrapped around the bordering elements

Definition : Consider a discrete subset Q ⊂ R^n sampled from a submanifold (recall that every manifold is locally homeomorphic to a Euclidean space). We call this discrete subset a point cloud. One could reasonably ask whether there is some method by which we can reconstruct a picture of our original submanifold given only the data contained in our point cloud.

Definition : The Vietoris-Rips complex of scale ε defined on Q, VR_ε(Q), is the simplicial complex whose simplices are the finite collections of points in Q of pairwise distance ≤ ε. (Fun fact: the eponymous Austrian mathematician Leopold Vietoris lived to 110!)

Now that we have constructed an abstract simplicial complex from our point cloud, it is worth asking how useful this construct is for gleaning information about our original submanifold. Some immediate questions and concerns are:

• What is a good choice of ε for our point cloud?

• What time complexity would an algorithm computing Vietoris-Rips complexes for a point cloud have? (Exponential in the worst case: O(2^{|Q|}), since every subset of Q may be a simplex.)

• Although our point cloud lives in R^n, for sufficiently dense clusters of points we may construct simplices of dimension much greater than n.
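To make the definition and the last two concerns concrete, here is a brute-force sketch of the Vietoris-Rips construction. It assumes points are coordinate tuples and caps the simplex dimension with a hypothetical `max_dim` parameter; it is a direct transcription of the definition, not an efficient algorithm.

```python
from itertools import combinations
import math

def vietoris_rips(points, eps, max_dim=2):
    """Simplices of VR_eps(points), as tuples of point indices, up to dimension max_dim."""
    n = len(points)
    # close[i][j] is True when points i and j are within distance eps.
    close = [[math.dist(points[i], points[j]) <= eps for j in range(n)] for i in range(n)]
    simplices = [(i,) for i in range(n)]            # 0-simplices
    for k in range(1, max_dim + 1):                 # k-simplices have k + 1 vertices
        for candidate in combinations(range(n), k + 1):
            if all(close[i][j] for i, j in combinations(candidate, 2)):
                simplices.append(candidate)
    return simplices

# Corners of the unit square in R^2.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
small = vietoris_rips(square, eps=1.0)              # sides only: a hollow square
large = vietoris_rips(square, eps=1.5, max_dim=3)   # diagonals included
```

At ε = 1 the complex has 4 vertices and 4 edges. At ε = 1.5 all pairs are close, so the complex contains the 3-simplex `(0, 1, 2, 3)` even though the points live in R^2. The loop over `combinations` is what makes the uncapped construction exponential in the number of points.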

We can define the projection map S : VR_ε(Q) → R^n by taking the vertices (0-simplices) of VR_ε(Q) to the corresponding points of Q and taking each k-simplex to the convex hull of its associated vertices in Q. The image of this projection S in R^n defines the shadow of our Vietoris-Rips complex.

Can we generally expect our shadow to be homeomorphic to our Vietoris-Rips complex? Unfortunately not, since the domain of our projection will typically have higher dimension than its image in R^n. Homotopy equivalence is not something we can expect in general either. Nevertheless, Vietoris-Rips complexes have applications in topological computation and are valuable for characterizing data with large holes.

3 The Cech Complex

One method for extracting topological characteristics of a submanifold from a point cloud, which better avoids the appearance of unwanted higher-dimensional features, is the Cech complex.

Definition : Given a point cloud Q, the Cech complex C_ε is the simplicial complex built on Q as follows: a k-simplex in C_ε is a collection of k+1 distinct elements x_i of Q such that the balls of diameter ε centered on the x_i have a non-empty common intersection.


Figure 2: A Vietoris-Rips complex defined in R^2 for a point cloud. Red points are 0-simplices, edges are 1-simplices, light blue triangles are 2-simplices, and dark blue regions are 3-simplices

One can immediately see that the algorithm for constructing Cech complexes will be more computationally intensive than its Vietoris-Rips counterpart, so the question should be asked: is there any advantage to characterizing our point cloud with Cech complexes? Short answer: yes, because the Nerve Lemma guarantees a homotopy equivalence between the union of the balls and the Cech complex.

Definition : Given a collection U = {U_α} of compact subsets of a topological space X, we can construct the nerve N(U) as follows: the k-simplices contained in N(U) correspond to non-empty intersections of k+1 distinct elements in U.

Definition : A topological space X is contractible if the identity map on X is homotopic to some constant map. More intuitively, a topological space X is contractible if it can be continuously shrunk to a point in space.

The Nerve Lemma : If U is a finite collection of open contractible subsets of X with all non-empty intersections of sub-collections of U contractible, then N(U) is homotopy equivalent to ⋃_α U_α.

Figure 3: An example of the Cech complex constructed from points sampled from a circle in the plane

Figure 4: Which of the following spaces are contractible?

Theorem : For all ε > 0, C_ε ⊂ VR_ε ⊂ C_{2ε}. Thus, if we know we can piece together a solid picture of our submanifold from the point cloud using the Cech complexes at scales ε and 2ε, then we know that the Vietoris-Rips complex at scale ε will be a good way of topologically characterizing our data as well.
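A small numerical sketch shows that the first inclusion can be strict. It assumes closed balls and uses the fact that balls of diameter ε around a set of points share a common point exactly when the smallest ball enclosing the points has radius at most ε/2. The helpers `enclosing_radius`, `in_cech` and `in_rips` are illustrative and handle at most three points.

```python
import math
from itertools import combinations

def enclosing_radius(pts):
    """Radius of the smallest ball containing 1, 2 or 3 points."""
    if len(pts) == 1:
        return 0.0
    if len(pts) == 2:
        return math.dist(pts[0], pts[1]) / 2
    a, b, c = sorted(math.dist(p, q) for p, q in combinations(pts, 2))
    if a * a + b * b <= c * c:
        return c / 2                     # right, obtuse or degenerate: longest side is a diameter
    s = (a + b + c) / 2
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))   # Heron's formula
    return a * b * c / (4 * area)        # acute triangle: circumradius

def in_cech(pts, eps):
    """The balls of diameter eps around pts have a common point."""
    return enclosing_radius(pts) <= eps / 2

def in_rips(pts, eps):
    """All pairwise distances are at most eps."""
    return all(math.dist(p, q) <= eps for p, q in combinations(pts, 2))

# Equilateral triangle with side 1; its circumradius is 1/sqrt(3), about 0.577.
tri = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
eps = 1.1
results = (in_cech(tri, eps), in_rips(tri, eps), in_cech(tri, 2 * eps))
```

For this triangle `results` is `(False, True, True)`: the 2-simplex is missing from C_ε, present in VR_ε, and present again in C_{2ε}, consistent with the chain of inclusions.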


4 Computational Applications

Now that we have defined abstract simplicial complexes, we turn to the computational applications of simplices. One of these areas is machine learning. Broadly, in machine learning we have the following task: we are given some training data and some test data that we would like to classify. For example, given people's heights and weights, we would like to classify each person as male or female. The point of the training data is to train our machine learning algorithm so that it performs well when handed test data.

We can formalise this as follows: we are given labelled training data

$$ (\vec{x}_1, y_1), \dots, (\vec{x}_n, y_n) \tag{2} $$

We use this training data to learn a function f : X → Y that predicts the correct label when given test data.

4.1 Linearly Separable Data and Perceptron

One such data set we can consider is the linearly separable data set shown in the picture below.

Figure 5: Linearly Separable Data Set

Since there exists a linear boundary separating the two classes, we can classify our data using the function

$$ g(\vec{x}) = \vec{w} \cdot \vec{x} + b, \tag{3} $$

whose zero set g(x) = 0 is the decision boundary.


The linear classifier is f = sign(g(x)). Therefore, learning f means learning w such that we classify our training data correctly. We can do this using the perceptron algorithm as follows:

Perceptron Algorithm

Given: labelled training data S = {(x_1, y_1), ..., (x_n, y_n)}

Step 1: Initialize w_0 = 0.

For t = 1, 2, 3, ...: if there exists (x_i, y_i) such that sign(w_{t−1} · x_i) ≠ y_i, then do the following:

$$ \vec{w}_t = \vec{w}_{t-1} + y_i \vec{x}_i \tag{4} $$

Terminate when no mistake remains.
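The algorithm can be sketched in a few lines of Python. This version makes some assumptions the notes leave open: labels are ±1, there is no bias term b (the boundary passes through the origin, as in update (4)), a score of exactly 0 counts as a mistake, and a hypothetical `max_updates` cap guards against data that is not separable.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def perceptron(samples, max_updates=10_000):
    """samples: list of (x, y) with x a tuple of floats and y in {-1, +1}.
    Returns (w, number_of_updates) using the update w <- w + y * x."""
    w = [0.0] * len(samples[0][0])
    updates = 0
    while updates < max_updates:
        # First misclassified sample, or None when every sample is correct.
        mistake = next(((x, y) for x, y in samples if y * dot(w, x) <= 0), None)
        if mistake is None:
            return w, updates
        x, y = mistake
        w = [wi + y * xi for wi, xi in zip(w, x)]
        updates += 1
    raise RuntimeError("no separating w found within max_updates")

# Hypothetical toy data, separable by a line through the origin.
data = [((2.0, 1.0), 1), ((1.0, 3.0), 1), ((-1.0, -2.0), -1), ((-3.0, -1.0), -1)]
w, updates = perceptron(data)
```

On this toy data a single update from the first sample already separates all four points, so `w` ends up equal to that sample.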

Claim: The perceptron algorithm makes at most T = (R/γ)^2 mistakes, where γ is the distance from the closest point to the boundary and R = max_i ||x_i||.

Figure 6: Perceptron Algorithm

Pf: Since the data is linearly separable, there exists a unit vector w* defining a separating hyperplane with margin γ, i.e. y_i (w* · x_i) ≥ γ for all i. Suppose the perceptron algorithm makes a mistake on (x, y) in iteration t. Then

$$ \vec{w}_t \cdot \vec{w}^* = (\vec{w}_{t-1} + y\vec{x}) \cdot \vec{w}^* \ge \vec{w}_{t-1} \cdot \vec{w}^* + \gamma \tag{5} $$


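Because a mistake means y (w_{t−1} · x) ≤ 0, and ||y x|| = ||x|| ≤ R, we also have

$$ \|\vec{w}_t\|^2 = \|\vec{w}_{t-1} + y\vec{x}\|^2 = \|\vec{w}_{t-1}\|^2 + \|y\vec{x}\|^2 + 2y(\vec{w}_{t-1} \cdot \vec{x}) \le \|\vec{w}_{t-1}\|^2 + R^2 \tag{6} $$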

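Therefore, after T rounds (starting from w_0 = 0):

$$ T\gamma \le \vec{w}_T \cdot \vec{w}^* \le \|\vec{w}_T\| \, \|\vec{w}^*\| \le R\sqrt{T} \tag{7} $$

where the last step uses ||w*|| = 1 and ||w_T||^2 ≤ T R^2 from (6).

Therefore: T ≤ (R/γ)^2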
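The bound can be sanity-checked numerically. The sketch below assumes ±1 labels, no bias term, and a hand-picked unit vector `w_star` that separates a hypothetical toy data set; any separating unit vector gives a valid (possibly loose) bound.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def count_updates(samples):
    """Run the perceptron from w = 0 and return the number of updates (mistakes).
    Loops forever on non-separable data, so only call it on separable samples."""
    w = [0.0] * len(samples[0][0])
    updates = 0
    while True:
        wrong = [(x, y) for x, y in samples if y * dot(w, x) <= 0]
        if not wrong:
            return updates
        x, y = wrong[0]
        w = [wi + y * xi for wi, xi in zip(w, x)]
        updates += 1

samples = [((1.0, 2.0), 1), ((2.0, 1.0), -1), ((-1.0, -2.0), -1), ((-2.0, -1.0), 1)]
w_star = (-1 / math.sqrt(2), 1 / math.sqrt(2))          # assumed unit separator
gamma = min(y * dot(w_star, x) for x, y in samples)      # margin with respect to w_star
R = max(math.hypot(*x) for x, _ in samples)              # largest norm of any x
bound = (R / gamma) ** 2
updates = count_updates(samples)
```

Here γ = 1/√2 and R = √5, so the bound is (R/γ)^2 = 10, while the run itself needs only 2 updates.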

4.2 Non-Linearly Separable Data and Simplices

Data is rarely linearly separable; usually we require complex decision boundaries for classification, as shown below:

Figure 7: NonLinear Decision Boundary for Classification

How do we classify this data? Clearly, finding a linear separating boundary is impossible. But what if there existed a representation space in which we could linearly separate our data? It turns out there is: let's simply map our data points onto the vertices of a simplex! Consider the mapping:

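$$ \varphi(\vec{x}_i) = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} \tag{8} $$

where the single 1 sits in position i, so that φ(x_i) is the i-th vertex of the standard simplex.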


Then our linear separation boundary simply becomes:

$$ \vec{w}^* = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \tag{9} $$

Therefore, we can map our data to another representation space. The geometry of our points becomes a simplex, and by working in this space, our data is now linearly separable! Of course, this is not the only mapping (it may not even be the best mapping), but the point is that this is one example of a mapping to an interesting geometry that lets us work in a higher-dimensional space better suited to our algorithms.
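A tiny sketch makes the separation explicit. It assumes labels in {−1, +1} and a hypothetical label list; `embed` is an illustrative name for the map φ of equation (8).

```python
def embed(i, n):
    """phi(x_i): the i-th standard basis vector of R^n, a vertex of the standard simplex."""
    return [1.0 if j == i else 0.0 for j in range(n)]

labels = [1, -1, -1, 1, -1]              # hypothetical labels y_1, ..., y_n
n = len(labels)
w_star = [float(y) for y in labels]      # w* = (y_1, ..., y_n), as in equation (9)

# w* . phi(x_i) picks out exactly the i-th coordinate of w*, which is y_i.
scores = [sum(w * p for w, p in zip(w_star, embed(i, n))) for i in range(n)]
predictions = [1 if s > 0 else -1 for s in scores]
```

Every training point is classified correctly, whatever the labels are. Note that φ is defined only on the n training points, so this construction by itself says nothing about how to embed unseen test data.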

5 Strategy Complexes

Before we formally define strategy complexes, let us consider the following example. Imagine an ambulance rescue in an old, complicated city, perhaps after an earthquake. In the figure below, we have an ambulance A trying to reach a patient X, with a variety of paths available: for example, A → X directly, or A → B → C. We can represent the strategy described as the solid triangle in the figure. In other words, this is a non-deterministic graph, but it is also a simplex! The vertices of the simplex are the individual actions to be executed at any particular location.

Figure 8: Strategy Complex

We can also augment the graph above to include more actions:


Figure 9: Augmented Strategy Complex

Note that we do not want to include the cycle B to C, since it is not a valid strategy to reach X: if we did, we could be caught in an infinite cycle. Now that we have this example in mind, we can formally define strategy complexes and non-deterministic graphs.

Definition : A nondeterministic graph G = (V,A) is a set of states V and a collection of (non-deterministic) actions A. Each a ∈ A is a nonempty set of directed edges {(v, u_1), (v, u_2), ...} with v and all u_i in V. We refer to v as a's source and to each u_i as a non-deterministic target of a. G is acyclic if no possible path v_0, ..., v_k in V, with each edge (v_i, v_{i+1}) belonging to some action in A, has v_0 = v_j for some j ≥ 1.

Definition : Given a non-deterministic graph G = (V,A), we let ∆G be the simplicial complex whose simplices are the acyclic collections of actions B ⊂ A. If V = ∅ then we also let ∆G = ∅. We refer to ∆G as G's strategy complex and to every simplex in ∆G as a (non-deterministic) strategy.

Examples: Consider the following graph.

Figure 10: The directed graph on the left defines the strategy complex on the right


Note that while you might naively expect this directed graph to generate a tetrahedron, because there are four actions you can take, this is not the case. Two of the actions, 1→2 and 2→1, could give rise to a cycle in the graph, so no simplex of the strategy complex can contain both of these actions. The complex is in fact generated by two triangles.

The two triangles, as well as three of the five edges in the complex (including the central edge, consisting of the actions {1→3, 2→3}), constitute strategies for attaining state 3 in the graph. Also note that a strategy complex may consist of strategies for a variety of goals. For instance, the top left edge of the complex in Figure 10, comprising the actions {1→3, 1→2}, is simply the strategy that says move away from 1. Similarly, the edge below that is the strategy that says move away from 2.

For contrast, consider the following figure.

Figure 11: The graph on the left

It contains two actions, one each at state 1 and state 2. Each action has two nondeterministic outcomes. The two actions cannot appear together as a simplex since, depending on the actual nondeterministic transitions at runtime, these actions could cause cycling in the graph between states 1 and 2. As a result, the strategy complex consists of two isolated vertices, representing the two strategies “move away from state 1” and “move away from state 2.” In particular, there are no strategies guaranteed to attain state 3 from the other two states.
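Both examples can be reproduced by brute force. The sketch below reads "acyclic collection of actions" as: the directed graph formed by all edges of the chosen actions has no directed cycle. The figures are not reproduced here, so the second graph is an assumption based on the description above: one action at state 1 with targets {2, 3} and one at state 2 with targets {1, 3}. The action names are illustrative.

```python
from itertools import combinations

def has_cycle(edges):
    """True if the directed graph given by `edges` contains a directed cycle."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    state = {}                                   # node -> "open" (on the DFS stack) or "done"
    def visit(u):
        state[u] = "open"
        for v in succ.get(u, ()):
            if state.get(v) == "open" or (v not in state and visit(v)):
                return True
        state[u] = "done"
        return False
    return any(u not in state and visit(u) for u in list(succ))

def strategy_complex(actions):
    """actions: dict name -> set of edges. Returns all acyclic non-empty sub-collections."""
    names = sorted(actions)
    simplices = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            edges = set().union(*(actions[a] for a in subset))
            if not has_cycle(edges):
                simplices.append(subset)
    return simplices

# Deterministic graph of Figure 10: four single-edge actions.
fig10 = strategy_complex({"1->2": {(1, 2)}, "1->3": {(1, 3)},
                          "2->1": {(2, 1)}, "2->3": {(2, 3)}})
triangles = [s for s in fig10 if len(s) == 3]

# Assumed nondeterministic graph of Figure 11: two actions with two outcomes each.
fig11 = strategy_complex({"a1": {(1, 2), (1, 3)}, "a2": {(2, 1), (2, 3)}})
```

For the first graph the enumeration finds 4 vertices, 5 edges and 2 triangles, and no simplex contains both 1→2 and 2→1. For the second it finds only the two isolated vertices `("a1",)` and `("a2",)`.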

In fact, it was proved that any directed graph that can be written as the disjoint union of its strongly connected components generates a strategy complex homotopy equivalent to a sphere of dimension n−k−1, where n is the number of states in the graph and k is the number of strongly connected components. All other directed graphs produce contractible strategy complexes. Recall that this means, roughly, that the strategy complex can be shrunk to a point.


5.1 Loopback Graphs and Complexes

Let us modify the graph in Figure 10 by adding artificial deterministic transitions from state 3 to each of states 1 and 2. We call these added transitions loopback actions.

Figure 12: A loopback graph and loopback complex. The two triangular endcaps outlined in dashed red are not part of the complex, since each gives rise to a cycle.

The complex in this case looks roughly like a polygonal cylinder. The complex is homotopic to a circle (we can also see this by applying the previous statement). Now we also add loopback actions to the non-deterministic graph from earlier.

Figure 13: A loopback graph and loopback complex.

The complex is shown to the right, and it is homotopic to a point! This gives us a slight hint of a later theorem.

In fact, no matter how complicated the non-deterministic graph, if we add all loopback actions that transition from state s to the remaining states, then the resulting loopback complex will always be homotopic either to a sphere or to a point. A sphere tells us that there is a strategy guaranteed to attain state s from all states in the graph; a point tells us no such strategy exists.

Definitions : Let G = (V,A) be a nondeterministic graph and suppose s ∈ V is some desired stop state. We make the following definitions:

• G contains a complete guaranteed strategy for attaining s if there is some acyclic set of actions B ⊂ A such that B contains at least one action with source v for every v ∈ V \ {s}. Observe that any possible path in the graph (V,B) that terminates at some v_k ≠ s may be extended to a path converging at s. B is a complete guaranteed strategy for attaining s.

• Define G←s to be the non-deterministic graph identical to G except that the actions with source s have been discarded and replaced by (|V|−1) loopback actions, each consisting of a single edge from s to some v, with v ranging over V \ {s}.

• ∆G←s is the strategy complex associated with G←s

Theorem: Let G = (V,A) be a non-deterministic graph and s ∈ V. If G contains a complete guaranteed strategy for attaining s, then ∆G←s is homotopy equivalent to the sphere S^{n−2}, with n = |V|. Otherwise, ∆G←s is contractible.
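As a consistency check on the theorem, one can build the loopback graph for the two small examples with s = 3 and compute the Euler characteristic of the resulting complex. This is only a necessary condition (the circle S^1 has Euler characteristic 0, a contractible space has 1), not a proof of homotopy type. The sketch assumes the same reading of acyclicity as before (the union of all edges of the chosen actions has no directed cycle) and the same assumed edge sets for the nondeterministic example; all function and action names are illustrative.

```python
from itertools import combinations

def is_acyclic(edges):
    """Repeatedly drop edges that lead into a node with no outgoing edge."""
    edges = set(edges)
    while edges:
        sources = {u for u, _ in edges}
        remaining = {(u, v) for u, v in edges if v in sources}
        if remaining == edges:
            return False          # every edge can be followed forever: a cycle exists
        edges = remaining
    return True

def loopback(actions, states, s):
    """G<-s: discard actions with source s, add one single-edge action s -> v per v != s."""
    kept = {name: es for name, es in actions.items() if all(u != s for u, _ in es)}
    for v in states:
        if v != s:
            kept[f"{s}->{v} (loopback)"] = {(s, v)}
    return kept

def euler_characteristic(actions):
    """Alternating count of the simplices of the strategy complex, by dimension."""
    names = sorted(actions)
    chi = 0
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if is_acyclic(set().union(*(actions[a] for a in subset))):
                chi += (-1) ** (size - 1)
    return chi

states = [1, 2, 3]
deterministic = {"1->2": {(1, 2)}, "1->3": {(1, 3)}, "2->1": {(2, 1)}, "2->3": {(2, 3)}}
nondeterministic = {"a1": {(1, 2), (1, 3)}, "a2": {(2, 1), (2, 3)}}

chi_det = euler_characteristic(loopback(deterministic, states, 3))
chi_nondet = euler_characteristic(loopback(nondeterministic, states, 3))
```

The deterministic graph has the complete guaranteed strategy {1→3, 2→3}, and its loopback complex has 6 vertices, 12 edges and 6 triangles, giving Euler characteristic 0, as expected for S^{n−2} = S^1 with n = 3. The nondeterministic graph has no such strategy, and its loopback complex is a path on 4 vertices with Euler characteristic 1, as expected for a contractible complex.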
