Finding the Homology of Submanifolds with High Confidence from Random Samples

Partha Niyogi; Stephen Smale; Shmuel Weinberger. 12 November, 2004.
Communicated by Partha Niyogi.


Recently there has been a lot of interest in geometrically motivated approaches to data analysis in high dimensional spaces. We consider the case where data is drawn from sampling a probability distribution that has support on or near a submanifold of Euclidean space. We show how to "learn" the homology of the submanifold with high confidence. We discuss an algorithm to do this and provide learning-theoretic complexity bounds. Our bounds are obtained in terms of a condition number that limits the curvature and nearness to self-intersection of the submanifold. We are also able to treat the question where the data is noisy and lies near rather than on the submanifold in question.

Original Document

The original document is available in Postscript (uploaded 12 November, 2004 by Partha Niyogi).