Published August 20, 2012
Journal Article Open

CLaSPS: A New Methodology for Knowledge Extraction from Complex Astronomical Data Sets

Abstract

In this paper, we present the Clustering-Labels-Score Patterns Spotter (CLaSPS), a new methodology for determining correlations among astronomical observables in complex data sets, based on the application of distinct unsupervised clustering techniques. The novelty of CLaSPS is the criterion used to select the optimal clusterings, which is based on a quantitative measure of the degree of correlation between the cluster memberships and the distribution of a set of observables, the labels, not employed for the clustering. CLaSPS has been developed primarily as a tool to tackle the complexity of the massive multi-wavelength astronomical data sets produced by the federation of data from modern automated astronomical facilities. In this paper, we discuss the application of CLaSPS to two simple astronomical data sets, both composed of extragalactic sources with photometric observations at different wavelengths from large-area surveys. The first data set, CSC+, is composed of optical quasars spectroscopically selected in the Sloan Digital Sky Survey data, observed in the X-rays by Chandra, and with multi-wavelength observations in the near-infrared, optical, and ultraviolet spectral intervals. One of the results of the application of CLaSPS to the CSC+ is the re-identification of a well-known correlation between the α_OX parameter and the near-ultraviolet color, in a subset of CSC+ sources with relatively small values of the near-ultraviolet colors. The other data set consists of a sample of blazars for which photometric observations in the optical, mid-, and near-infrared are available, complemented, for a subset of the sources, by Fermi γ-ray data. The main results of the application of CLaSPS to these data sets have been the discovery of a strong correlation between the multi-wavelength color distribution of blazars and their optical spectral classification into BL Lac objects and flat-spectrum radio quasars, and of a peculiar pattern followed by blazars in the WISE mid-infrared color space. This pattern and its physical interpretation have been discussed in detail in other papers by one of the authors.
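
The selection criterion described in the abstract (ranking candidate clusterings by how strongly cluster membership correlates with labels that were not used for the clustering) can be illustrated with a small sketch. This is not the CLaSPS score itself, which the abstract does not define; Cramér's V is swapped in here as a generic measure of association, and the function names, clustering names, and toy data are all hypothetical. The sketch also assumes the labels have already been reduced to discrete classes (for example, by binning a continuous observable).

```python
from collections import Counter
from math import sqrt


def cramers_v(clusters, labels):
    """Association between cluster ids and label classes, in [0, 1].

    Stand-in for a clustering-vs-labels score; NOT the score used by CLaSPS.
    """
    n = len(clusters)
    if n == 0 or n != len(labels):
        raise ValueError("clusters and labels must be non-empty and of equal length")
    joint = Counter(zip(clusters, labels))  # observed counts per (cluster, label)
    row = Counter(clusters)                 # cluster sizes
    col = Counter(labels)                   # label class sizes
    k = min(len(row), len(col))
    if k < 2:
        return 0.0  # a single cluster or a single label class carries no association
    chi2 = 0.0
    for c, n_c in row.items():
        for l, n_l in col.items():
            expected = n_c * n_l / n
            observed = joint.get((c, l), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


def best_clustering(candidates, labels):
    """Score every candidate clustering against the labels and return the top one."""
    scores = {name: cramers_v(c, labels) for name, c in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: six sources with a spectral class used only as a label.
labels = ["BLLac", "BLLac", "BLLac", "FSRQ", "FSRQ", "FSRQ"]
# Hypothetical cluster assignments from two different clustering runs.
candidates = {
    "kmeans_k2": [0, 0, 0, 1, 1, 1],
    "hierarchical_k2": [0, 1, 0, 1, 0, 1],
}
best, scores = best_clustering(candidates, labels)
print(best, scores)
```

In this toy case the first clustering separates the two label classes perfectly and is selected, while the second mixes them and scores lower. The key design point mirrored from the abstract is that the labels never enter the clustering step; they are used only afterwards to rank the clusterings.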

Additional Information

© 2012 American Astronomical Society. Received 2012 February 22; accepted 2012 June 11; published 2012 July 31. R. D'Abrusco acknowledges the financial support of the US Virtual Astronomical Observatory, which is sponsored by the National Science Foundation and the National Aeronautics and Space Administration. We acknowledge partial support by NASA Contract NAS-39073 (CXC). The CLaSPS method is implemented in R (R Development Core Team 2012), a free, open-source statistical environment developed under the GNU GPL (http://www.r-project.org/). TOPCAT and STILTS (http://www.star.bris.ac.uk/~mbt/topcat/) (Taylor 2005) were extensively used for the preparation and manipulation of the tabular data in this work.

Attached Files

Published - 0004-637X_755_2_92.pdf

Files

0004-637X_755_2_92.pdf (4.5 MB)
md5:f8029a55fa4face6609738c8f3891faa

Additional details

Created: September 14, 2023
Modified: October 23, 2023