Persistent homology is a great way to get information about “approximate topology” of some space. The standard way to approach it is through filtrations of topological spaces. Say we have a simplicial complex XX with a filtration function f:XRf: X \to \mathbb{R}. This filtration gives each simplex of XX a filtration level or weight. We then have a sequence of inclusions iab:Xa=f1((,a])f1((,b])=Xbi_a^b: X^a = f^{-1}((-\infty,a]) \to f^{-1}((-\infty, b]) = X^b for each pair of weight values aba \leq b in the filtration. This leads to a sequence of maps Hk(iab):Hk(Xa)Hk(Xb)H_k (i_a^b): H_k(X^a) \to H_k(X^b), which turn out to decompose nicely in the sense that there exists a basis for each Hk(Xa)H^k(X^a) so that for every aba \leq b the matrix representing HkiabH_k i_a^b is diagonal: each basis element of Hk(Xa)H^k(X^a) is either mapped to zero or to a basis element of Hk(Xb)H^k(X^b).

In essence, we can represent the way the homology of XaX^a evolves by a collection of bars, where a bar represents a homology class that persists across a certain distance in the filtration of XX. Bars which persist for a long time then represent, perhaps, features of the filtered space which are close to being homology classes.

Another way to approach approximate topology of a weighted simplicial complex is with discrete Hodge theory. Here we start with a weighted simplicial complex XX; the weights on kk-simplices give a natural inner product structure on Ck(X;R)C_k(X;\mathbb{R}). This means that we can produce adjoints of the boundary maps k\partial_k. The Hodge Laplacian of XX is the graded operator Δ=(+)2=+\Delta = (\partial + \partial^\ast)^2 = \partial \partial^\ast + \partial^\ast \partial. On CkC_k, this restricts to Δk=k+1k+1+kk\Delta_k = \partial_{k+1}\partial_{k+1}^\ast + \partial_k^\ast\partial_k. The main theorem of discrete Hodge theory is that kerΔkHk(X;R)\ker \Delta_k \simeq H_k(X;\mathbb{R}). The kk-cochains in kerΔk\ker \Delta_k are canonical representatives for homology classes of XX, and the eigenvectors of Δk\Delta_k with eigenvalues close to 00 can be seen as representing kk-chains which are almost homology classes.

I’ve wanted a way to connect these two perspectives for a while now. I recently discovered a paper that provides a straightforward way of relating them. The paper is Persistent Homology and Floer-Novikov Theory by Michael Usher and Jun Zhang. On their way to building a connection between persistence and Floer theory, the authors develop a new way of looking at persistent homology that makes the analogy with Hodge theory much more evident.

The key is to express persistent homology in terms of a non-Archimedean singular value decomposition of the boundary operator. To see what this means, we’ll have to go through a little bit of background. Fortunately, we don’t need to go into all the generality that Usher and Zhang do.

Definition: A valuation on a field KK is a map ν:KR{}\nu: K \to \mathbb{R} \cup \{\infty\} such that

  1. ν(x)=\nu(x) = \infty iff x=0x = 0,
  2. ν(xy)=ν(x)+ν(y)\nu(xy) = \nu(x) + \nu(y), and
  3. ν(x+y)min(ν(x),ν(y))\nu(x+y) \geq \min(\nu(x),\nu(y)).

These conditions make N(x)=exp(ν(x))N(x) = \exp(- \nu(x)) into a (multiplicative) norm or absolute value on KK: exp(ν(x))=0\exp(-\nu(x)) = 0 iff ν(x)=\nu(x) = \infty, N(xy)=N(x)N(y)N(xy) = N(x)N(y), and N(x+y)max(N(x),N(y))N(x)+N(y)N(x+y) \leq \max(N(x),N(y)) \leq N(x) + N(y). Indeed, this is what you might call an ultrametric absolute value. Every field has the trivial valuation with ν(x)=\nu(x) = \infty if x=0x = 0 and ν(x)=0\nu(x) = 0 otherwise, and this is the only valuation we will actually need.

Definition: A non-Archimedean normed vector space over KK is a vector space CC together with a filtration function :CR{}\ell: C \to \mathbb{R} \cup \{-\infty\} such that

  1. (x)=\ell(x) = -\infty iff x=0x = 0,
  2. (λx)(x)ν(λ)\ell(\lambda x) - \ell(x) - \nu(\lambda) for λK\lambda \in K, and
  3. (x+y)max((x),(y))\ell(x + y) \leq \max(\ell(x),\ell(y)).

Taking v=exp((v))\|{v}\| = \exp( \ell(v)) we get x=0\|x\| = 0 iff x=0x = 0, λx=N(λ)x\|\lambda x\| = N(\lambda)\|x\|, and x+ymax(x,y)\|x+y\| \leq \max(\|x\|,\|y\|). Again, this is a non-Archimedean or ultrametric-type norm on CC.

A simplicial complex XX with a filtration function f:XRf: X \to \mathbb{R} has a natural non-Archimedean norm on its space of kk-chains Ck(X;K)C_k(X;K), where KK has the trivial valuation. This norm is given by the filtration function (iaiσi)=maxf(σi)\ell(\sum_{i} a_i \sigma_i) = \max f(\sigma_i) for ii with ai0a_i \neq 0.

Definition: Let CC be a non-Archimedean normed vector space. Subspaces VV and WW of CC are orthogonal if (v+w)=max((v),(w))\ell(v+w) = \max(\ell(v),\ell(w)) for all vVv \in V, wWw \in W. That is, v+w=max(v,w)\|v + w\| = \max(\|v\|,\|w\|). Compare this to orthogonality for an inner product space, where we have v+w2=v2+w2\|v + w\|^2 = \|v\|^2 + \|w\|^2 when vv and ww are orthogonal. Similarly, a tuple of vectors (v1,,vr)(v_1,\ldots,v_r) is orthogonal if (λivi)=maxi(λiwi)\ell(\sum \lambda_i v_i) = \max_i (\lambda_i w_i) for all choices of λ1,,λrK\lambda_1,\ldots,\lambda_r \in K.

Not all non-Archimedean normed vector spaces admit an orthogonal basis. However, if one exists, any basis can be orthogonalized via a sort of Gram-Schmidt algorithm. This orthogonalized basis is not unique.

We’re finally at the point where we can define a singular value decomposition. Let CC and DD be non-Archimedean normed vector spaces with orthogonal bases, and let A:CDA: C \to D be a linear map of rank rr. A singular value decomposition of AA is a choice of orthogonal ordered bases (y1,,yn)(y_1,\ldots,y_n) for CC, (x1,,xn)(x_1,\ldots,x_n) for DD such that

  1. (yr,,yn)(y_r,\ldots,y_n) is an orthogonal ordered basis for kerA\ker A
  2. (x1,,xr)(x_1,\ldots,x_r) is an orthogonal ordered basis for im A\text{im } A
  3. Ayi=xiA y_i = x_i for 1ir1 \leq i \leq r
  4. C(y1)D(x1)C(yr)D(xr)\ell_C(y_1) - \ell_D(x_1) \geq \cdots \geq \ell_C(y_r)- \ell_D(x_r).

The last condition is what determines the ordering for the singular values. If we write this in terms of norms, this says that y1/x1yr/xr\|y_1\|/\|x_1\| \geq \cdots \geq \|y_r\|/\|x_r\|. The singular values should be the inverses of these, i.e., σi=xi/yi\sigma_i = \|x_i\|/\|y_i\|.

Usher and Zhang show that these non-Archimedean SVDs exist, and can be computed via a Gaussian elimination-like algorithm. Now let’s interpret this when A=k+1:Ck+1(X;K)Zk(X;K)=kerkA = \partial_{k+1}: C_{k+1}(X;K) \to Z_k(X;K) = \ker \partial_k, with the norm given by the simplicial filtration. We get bases YY of Ck+1(X;K)C_{k+1}(X;K) and XX of Zk(X;K)Z_k(X;K), and the basis XX has elements corresponding to kk-cycles which are either the boundary of some (k+1)(k+1)-chain, or are not. The kk-cycles which are never killed by a boundary are those xix_i for i>ri > r, and the others have iri \leq r, and hence are paired with basis elements yiy_i of Ck+1C_{k+1}. A singular value σi\sigma_i less than 1 corresponds to a pair of a kk-cycle xix_i and a (k+1)(k+1)-chain yiy_i bounded by xix_i, where the xix_i is born at an earlier filtration level than yiy_i. In persistent homology terms, we have a bar of length (yi)(xi)\ell(y_i) - \ell(x_i). The xix_i for i>ri > r correspond to bars starting at (xi)\ell(x_i) and extending to infinity, because they are not the boundary of any (k+1)(k+1)-chain, no matter how large its norm.

So we’ve arrived at the remarkable fact that the non-Archimedean SVD of k+1\partial_{k+1} contains all the information in the barcode for the kk-dimensional persistent homology of the filtered complex XX. We’re nearly to the connection with the Hodge-theoretic version of approximate homology—all we need to do is recall the connection between the eigendecomposition of Δk\Delta_k and the singular value decomposition of k+1\partial_{k+1}.

Since kerk\ker \partial_k contains im k+1\text{im } \partial_{k+1}, the eigenvalues of Δk\Delta_k split into three parts: the set of zero eigenvalues, the nonzero eigenvalues of k+1k+1\partial_{k+1}\partial_{k+1}^*, and the nonzero eigenvalues of kk\partial_{k}^*\partial_k. We define the up-Laplacian Δk+=k+1k+1\Delta_k^+ = \partial_{k+1}\partial_{k+1}^* and the down-Laplacian Δk=kk\Delta_k^- = \partial_{k}^*\partial_k. The eigenvalues of Δk+\Delta_k^+ are the squares of the singular values of k+1\partial_{k+1}; restricting to kerk=Zk\ker \partial_k = Z_k just means we ignore the subspace spanned by the nonzero eigenvalues of Δk\Delta_k^-. Returning to the interpretation of the SVD, we see that the eigenvectors of Δk+\Delta_k^+ with nonzero eigenvalues correspond to kk-cycles which are the boundaries of certain (k+1)(k+1)-chains; the eigenvalues tell us about the relative sizes of the cycles and the chains that they bound. In particular, an eigenvector with small eigenvalue corresponds to a kk-cycle which is only filled in by a large (k1)(k-1)-chain.

What about the other nonzero eigenvalues, the ones that come from Δk\Delta_k^-? I think the best way to look at these is to switch to cohomology. Eigenvectors of Δk\Delta_k^- form an orthogonal basis of kk-cocycles which are paired with an orthogonal basis of (k1)(k-1)-cochains that they cobound. The eigenvalues are the relative norms of the cocycles and the cochains that kill them. A small eigenvalue represents a cocycle which almost fails to be a coboundary—it requires a lot of “energy” to produce it as a coboundary, perhaps.

The eigendecomposition of the full Hodge Laplacian Δk\Delta_k then contains representatives of the genuine homology classes of XX, as kerΔk\ker \Delta_k, as well as approximate homology (eigenvectors of Δk+\Delta_k^+) and approximate cohomology (eigenvectors of Δk\Delta_k^-) of XX.

In any event, we have a formal analogy between discrete Hodge theory and persistent homology by way of two types of singular value decomposition. One might interpret the distinction between the two in terms of the way the size of chains are measured—Hodge theory explicitly works with 2\ell^2 norms to measure size, while persistent homology feels more like using an \ell^\infty norm (although this is not exactly the case). But both approaches try to find “small” kk-cycles that are killed by “large” (k+1)(k+1)-chains. This relationship should be helpful in developing a deeper understanding and interpretation of both persistence and Hodge theory.