Published October 30, 1997 | Published
Book Section - Chapter Open

Data mining a large digital sky survey: from the challenges to the scientific results

Abstract

The analysis and efficient scientific exploration of the Digital Palomar Observatory Sky Survey represent a major technical challenge. The input data set consists of 3 Terabytes of pixel information and contains a few billion sources. We describe some of the specific scientific problems posed by the data, including searches for distant quasars and clusters of galaxies, and the data-mining techniques we are exploring in addressing them. Machine-assisted discovery methods may become essential for the analysis of such multi-Terabyte data sets. New and future approaches involve unsupervised classification and clustering analysis in the Giga-object data space, including various Bayesian techniques. In addition to the searches for known types of objects in this database, these techniques may also offer the possibility of discovering previously unknown, rare types of astronomical objects.

Additional Information

© 1997 Society of Photo-optical Instrumentation Engineers (SPIE). This work was supported in part by funds from NASA, the Norris Foundation, and the NSF PYI award AST-9157412. We acknowledge the efforts of the POSS-II team at Palomar and the digitization team at STScI. N. Weir and U. Fayyad made important initial contributions to this project. We also thank J. Kennefick, J. Darling, and V. Desai for their contributions to the quasar search project. The DPOSS work at Caltech is a part of the CRONA international collaboration.

Attached Files

Published - 98.pdf

Files

98.pdf (690.2 kB)
md5:5ccaad70963c19467ad7f5c3c9a4b6e9

Additional details

Created: August 19, 2023
Modified: January 14, 2024