Geometric structure of graph Laplacian embeddings
Abstract
We analyze the spectral clustering procedure for identifying coarse structure in a data set x₁,…,x_n, and in particular study the geometry of graph Laplacian embeddings which form the basis for spectral clustering algorithms. More precisely, we assume that the data are sampled from a mixture model supported on a manifold M embedded in ℝ^d, and pick a connectivity length-scale ε>0 to construct a kernelized graph Laplacian. We introduce a notion of a well-separated mixture model which only depends on the model itself, and prove that when the model is well separated, with high probability the embedded data set concentrates on cones that are centered around orthogonal vectors. Our results are meaningful in the regime where ε=ε(n) is allowed to decay to zero at a slow enough rate as the number of data points grows. This rate depends on the intrinsic dimension of the manifold on which the data is supported.
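As an illustration of the kind of embedding the abstract describes, the following is a minimal sketch and not the authors' construction. It assumes a Gaussian kernel with length-scale `eps`, the symmetric normalized Laplacian, and a toy two-component mixture in the plane; the function name `laplacian_embedding` and all parameter values are hypothetical choices made for the example.

```python
import numpy as np


def laplacian_embedding(points, eps, n_components):
    """Embed points using the lowest eigenvectors of a kernelized graph Laplacian.

    Assumptions (not taken from the paper): Gaussian kernel weights and the
    symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    """
    # Pairwise squared Euclidean distances, clipped to avoid tiny negatives.
    sq_norms = np.sum(points**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    sq_dists = np.maximum(sq_dists, 0.0)

    # Kernel weights at connectivity length-scale eps, with no self-loops.
    weights = np.exp(-sq_dists / (2.0 * eps**2))
    np.fill_diagonal(weights, 0.0)

    # Symmetric normalized graph Laplacian.
    inv_sqrt_deg = 1.0 / np.sqrt(weights.sum(axis=1))
    normalized = inv_sqrt_deg[:, None] * weights * inv_sqrt_deg[None, :]
    laplacian = np.eye(len(points)) - normalized

    # eigh returns eigenvalues in ascending order; keep the lowest ones.
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvalues[:n_components], eigenvectors[:, :n_components]


# Toy data: two well-separated Gaussian components in the plane (an assumption).
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
cluster_b = rng.normal(loc=[5.0, 0.0], scale=0.3, size=(50, 2))
points = np.vstack([cluster_a, cluster_b])

eigenvalues, embedding = laplacian_embedding(points, eps=0.5, n_components=2)

# Cosine similarity between embedded points: values near 1 within a component
# and near 0 across components indicate concentration around orthogonal directions.
unit_rows = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)
cosines = unit_rows @ unit_rows.T
```

In this toy setting the embedded points from each component line up along one direction and the two directions are close to orthogonal, which is the qualitative picture the abstract refers to. The choice of kernel and normalization here is only one common option.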
Additional Information
© 2021 Nicolás García Trillos, Franca Hoffmann, Bamdad Hosseini. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Submitted 8/19; Revised 11/20; Published 3/21. The authors would like to thank Ulrike von Luxburg for pointing them to the paper Schiebinger et al. (2015) which was the starting point of this work. NGT was supported by NSF grant DMS 1912802. FH was partially supported by Caltech's von Karman postdoctoral instructorship. BH is supported in part by a postdoctoral fellowship granted by Natural Sciences and Engineering Research Council of Canada.
Attached Files
Published - 19-683.pdf
Submitted - 1901.10651.pdf
Files
Name | Size |
---|---|
md5:577d7dd6ebe49e02f57cccdfef166731 | 565.4 kB |
md5:38734e04fa14de93b35b491ab585afe0 | 684.8 kB |
Additional details
- Eprint ID
- 102184
- Resolver ID
- CaltechAUTHORS:20200331-074327697
- NSF
- DMS-1912802
- Caltech
- Natural Sciences and Engineering Research Council of Canada (NSERC)
- Created
- 2020-03-31 (created from EPrint's datestamp field)
- Updated
- 2023-06-02 (created from EPrint's last_modified field)