Data mining a large digital sky survey: from the challenges to the scientific results
Abstract
The analysis and efficient scientific exploration of the Digital Palomar Observatory Sky Survey (DPOSS) represent a major technical challenge. The input data set consists of 3 Terabytes of pixel information and contains a few billion sources. We describe some of the specific scientific problems posed by the data, including searches for distant quasars and clusters of galaxies, and the data-mining techniques we are exploring to address them. Machine-assisted discovery methods may become essential for the analysis of such multi-Terabyte data sets. New and future approaches involve unsupervised classification and clustering analysis in the Giga-object data space, including various Bayesian techniques. In addition to searches for known types of objects in this database, these techniques may also offer the possibility of discovering previously unknown, rare types of astronomical objects.
Additional Information
© 1997 Society of Photo-optical Instrumentation Engineers (SPIE). This work was supported in part by funds from NASA, the Norris Foundation, and the NSF PYI award AST-9157412. We acknowledge the efforts of the POSS-II team at Palomar and the digitization team at STScI. N. Weir and U. Fayyad made important initial contributions to this project. We also thank J. Kennefick, J. Darling, and V. Desai for their contributions to the quasar search project. The DPOSS work at Caltech is a part of the CRONA international collaboration.
Attached Files
Published - 98.pdf
Files
Name | Size |
---|---|
98.pdf (md5:5ccaad70963c19467ad7f5c3c9a4b6e9) | 690.2 kB |
Additional details
- Eprint ID: 87747
- Resolver ID: CaltechAUTHORS:20180711-101525969
- Funders: NASA; Kenneth T. and Eileen L. Norris Foundation; NSF (AST-9157412)
- Created: 2018-07-11 (from EPrint's datestamp field)
- Updated: 2021-11-15 (from EPrint's last_modified field)
- Series Name: Proceedings of SPIE
- Series Volume or Issue Number: 3164