Hodge theory and persistent homology
Persistent homology is a great way to get information about “approximate topology” of some space. The standard way to approach it is through filtrations of topological spaces. Say we have a simplicial complex with a filtration function . This filtration gives each simplex of a filtration level or weight. We then have a sequence of inclusions for each pair of weight values in the filtration. This leads to a sequence of maps , which turn out to decompose nicely in the sense that there exists a basis for each so that for every the matrix representing is diagonal: each basis element of is either mapped to zero or to a basis element of .
In essence, we can represent the way the homology of evolves by a collection of bars, where a bar represents a homology class that persists across a certain distance in the filtration of . Bars which persist for a long time then represent, perhaps, features of the filtered space which are close to being homology classes.
Another way to approach approximate topology of a weighted simplicial complex is with discrete Hodge theory. Here we start with a weighted simplicial complex ; the weights on -simplices give a natural inner product structure on . This means that we can produce adjoints of the boundary maps . The Hodge Laplacian of is the graded operator . On , this restricts to . The main theorem of discrete Hodge theory is that . The -cochains in are canonical representatives for homology classes of , and the eigenvectors of with eigenvalues close to can be seen as representing -chains which are almost homology classes.
I’ve wanted a way to connect these two perspectives for a while now. I recently discovered a paper that provides a straightforward way of relating them. The paper is Persistent Homology and Floer-Novikov Theory by Michael Usher and Jun Zhang. On their way to building a connection between persistence and Floer theory, the authors develop a new way of looking at persistent homology that makes the analogy with Hodge theory much more evident.
The key is to express persistent homology in terms of a non-Archimedean singular value decomposition of the boundary operator. To see what this means, we’ll have to go through a little bit of background. Fortunately, we don’t need to go into all the generality that Usher and Zhang do.
Definition: A valuation on a field is a map such that
- iff ,
- , and
- .
These conditions make into a (multiplicative) norm or absolute value on : iff , , and . Indeed, this is what you might call an ultrametric absolute value. Every field has the trivial valuation with if and otherwise, and this is the only valuation we will actually need.
Definition: A non-Archimedean normed vector space over is a vector space together with a filtration function such that
- iff ,
- for , and
- .
Taking we get iff , , and . Again, this is a non-Archimedean or ultrametric-type norm on .
A simplicial complex with a filtration function has a natural non-Archimedean norm on its space of -chains , where has the trivial valuation. This norm is given by the filtration function for with .
Definition: Let be a non-Archimedean normed vector space. Subspaces and of are orthogonal if for all , . That is, . Compare this to orthogonality for an inner product space, where we have when and are orthogonal. Similarly, a tuple of vectors is orthogonal if for all choices of .
Not all non-Archimedean normed vector spaces admit an orthogonal basis. However, if one exists, any basis can be orthogonalized via a sort of Gram-Schmidt algorithm. This orthogonalized basis is not unique.
We’re finally at the point where we can define a singular value decomposition. Let and be non-Archimedean normed vector spaces with orthogonal bases, and let be a linear map of rank . A singular value decomposition of is a choice of orthogonal ordered bases for , for such that
- is an orthogonal ordered basis for
- is an orthogonal ordered basis for
- for
- .
The last condition is what determines the ordering for the singular values. If we write this in terms of norms, this says that . The singular values should be the inverses of these, i.e., .
Usher and Zhang show that these non-Archimedean SVDs exist, and can be computed via a Gaussian elimination-like algorithm. Now let’s interpret this when , with the norm given by the simplicial filtration. We get bases of and of , and the basis has elements corresponding to -cycles which are either the boundary of some -chain, or are not. The -cycles which are never killed by a boundary are those for , and the others have , and hence are paired with basis elements of . A singular value less than 1 corresponds to a pair of a -cycle and a -chain bounded by , where the is born at an earlier filtration level than . In persistent homology terms, we have a bar of length . The for correspond to bars starting at and extending to infinity, because they are not the boundary of any -chain, no matter how large its norm.
So we’ve arrived at the remarkable fact that the non-Archimedean SVD of contains all the information in the barcode for the -dimensional persistent homology of the filtered complex . We’re nearly to the connection with the Hodge-theoretic version of approximate homology—all we need to do is recall the connection between the eigendecomposition of and the singular value decomposition of .
Since contains , the eigenvalues of split into three parts: the set of zero eigenvalues, the nonzero eigenvalues of , and the nonzero eigenvalues of . We define the up-Laplacian and the down-Laplacian . The eigenvalues of are the squares of the singular values of ; restricting to just means we ignore the subspace spanned by the nonzero eigenvalues of . Returning to the interpretation of the SVD, we see that the eigenvectors of with nonzero eigenvalues correspond to -cycles which are the boundaries of certain -chains; the eigenvalues tell us about the relative sizes of the cycles and the chains that they bound. In particular, an eigenvector with small eigenvalue corresponds to a -cycle which is only filled in by a large -chain.
What about the other nonzero eigenvalues, the ones that come from ? I think the best way to look at these is to switch to cohomology. Eigenvectors of form an orthogonal basis of -cocycles which are paired with an orthogonal basis of -cochains that they cobound. The eigenvalues are the relative norms of the cocycles and the cochains that kill them. A small eigenvalue represents a cocycle which almost fails to be a coboundary—it requires a lot of “energy” to produce it as a coboundary, perhaps.
The eigendecomposition of the full Hodge Laplacian then contains representatives of the genuine homology classes of , as , as well as approximate homology (eigenvectors of ) and approximate cohomology (eigenvectors of ) of .
In any event, we have a formal analogy between discrete Hodge theory and persistent homology by way of two types of singular value decomposition. One might interpret the distinction between the two in terms of the way the size of chains are measured—Hodge theory explicitly works with norms to measure size, while persistent homology feels more like using an norm (although this is not exactly the case). But both approaches try to find “small” -cycles that are killed by “large” -chains. This relationship should be helpful in developing a deeper understanding and interpretation of both persistence and Hodge theory.