Published December 2021 | Supplemental Material + Accepted Version + Published
Journal Article Open

A method for finding anomalous astronomical light curves and their analogues

Abstract

Our understanding of the Universe has profited from deliberate targeted studies of known phenomena, as well as from serendipitous unexpected discoveries, such as the discovery of a complex variability pattern in the direction of KIC 8462852 (Boyajian's star). Upcoming surveys such as the Vera C. Rubin Observatory Legacy Survey of Space and Time will explore the parameter space of astrophysical transients at all time-scales, and offer the opportunity to discover even more extreme examples of unexpected phenomena. We investigate strategies to identify novel objects and to contextualize them within large time-series data sets in order to facilitate the discovery of new classes of objects as well as the physical interpretation of their anomalous nature. We develop a method that combines tree-based and manifold-learning algorithms for anomaly detection in order to perform two tasks: 1) identify and rank anomalous objects in a time-domain data set; and 2) group those anomalies according to their similarity in order to identify analogues. We achieve the latter by combining an anomaly score from a tree-based method with a manifold-learning dimensionality reduction strategy. Clustering in the reduced space allows for the successful identification of anomalies and analogues. We also assess the impact of pre-processing and feature engineering schemes and investigate the astrophysical nature of the objects that our models identify as anomalous by augmenting the Kepler data with Gaia colour and luminosity information. We find that multiple models, used in combination, are a promising strategy to identify novel light curves and light curve families.
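The two tasks described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes each light curve has already been reduced to a numeric feature vector, and it substitutes scikit-learn's `IsolationForest` for the tree-based scorer, `TSNE` for the manifold-learning step, and `DBSCAN` for the clustering. The function name `rank_and_group` and all parameter values (`n_top`, `eps`, `perplexity`) are hypothetical choices made for illustration only.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import IsolationForest
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def rank_and_group(features, n_top=20, eps=5.0, random_state=0):
    """Rank objects by anomaly score, then group the top-ranked ones.

    features: array of shape (n_objects, n_features), one row per light curve.
    Returns (ranked, score, embedding, labels).
    """
    X = StandardScaler().fit_transform(features)

    # Task 1: tree-based anomaly score. score_samples is higher for
    # more normal points, so negate it: larger means more anomalous.
    forest = IsolationForest(n_estimators=200, random_state=random_state)
    score = -forest.fit(X).score_samples(X)
    ranked = np.argsort(score)[::-1]  # most anomalous first

    # Manifold-learning reduction of all objects to two dimensions.
    tsne = TSNE(n_components=2, perplexity=30, init="pca",
                random_state=random_state)
    embedding = tsne.fit_transform(X)

    # Task 2: cluster only the top-ranked anomalies in the reduced
    # space. Objects sharing a label are candidate analogues; a label
    # of -1 means no close neighbour was found. eps is in embedding
    # units and is an arbitrary default that needs tuning per data set.
    top = ranked[:n_top]
    labels = DBSCAN(eps=eps, min_samples=2).fit_predict(embedding[top])
    return ranked, score, embedding, labels
```

A call such as `ranked, score, embedding, labels = rank_and_group(features)` returns the ranking over all objects together with cluster labels for the first `n_top` entries of `ranked`. The score is used here only to select which objects are clustered; other ways of combining the score with the embedding are equally plausible readings of the abstract.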

Additional Information

© 2021 The Author(s). Published by Oxford University Press on behalf of Royal Astronomical Society. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model). Accepted 2021 September 8. Received 2021 September 8; in original form 2020 September 15. We thank the referee for a very detailed report that made this article significantly better. We thank the organizers and participants of the Detecting the Unexpected workshop that took place at STScI in 2017. The ideas for this work came from a hack during that workshop and have also produced other papers. In particular, we thank Lucianne Walkowicz for a continuous exchange of ideas and for proposing the original hack. We also thank Dalya Baron for useful insight about the use of the URF method. We thank the original hackers' team, which included Kelle Cruz and Umaa Rebbapragada. The authors acknowledge the support of the Vera C. Rubin Observatory Legacy Survey of Space and Time Transient and Variable Stars Science Collaboration (TVS SC), of which most of the authors are members and which provided opportunities for collaboration and exchange of ideas and knowledge. This paper includes data collected by the Kepler mission and obtained from the MAST data archive at the Space Telescope Science Institute (STScI). Funding for the Kepler mission is provided by the NASA Science Mission Directorate. STScI is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-26555. This work has made use of data from the European Space Agency (ESA) Gaia (https://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. This work made use of several Python modules, including: (i) numpy (Harris et al. 2020); (ii) matplotlib (Hunter 2007); (iii) scikit-learn (Pedregosa et al. 2011); and (iv) seaborn (Waskom et al. 2017).

Data availability. The data underlying this article were accessed from the Mikulski Archive for Space Telescopes (MAST), at https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html. The derived data generated in this research can be accessed from the GitHub repository https://github.com/kushaltirumala/WaldoInSky.

Attached Files

Published - stab2588.pdf

Accepted Version - 2009.06760.pdf

Supplemental Material - stab2588_supplemental_file.zip

Files

stab2588_supplemental_file.zip
Files (30.1 MB)
md5:f9ad3e83070ba90daa6a03516206a8ce (455.6 kB)
md5:e8875b61450d1d53b6fee148b9374602 (19.2 MB)
md5:557e2f830db3e04af21dca70ba7a7000 (10.4 MB)

Additional details

Created: August 20, 2023
Modified: October 23, 2023