Published September 17, 2019
Journal Article Open

Anomaly detection using Deep Autoencoders for the assessment of the quality of the data acquired by the CMS experiment

Abstract

The certification of CMS experiment data as usable for physics analysis is a crucial task that ensures the quality of all physics results published by the collaboration. Currently, certification is conducted by human experts; it is labor intensive and based on the scrutiny of distributions integrated over several hours of data taking. This contribution focuses on the design and prototype of an automated certification system that assesses data quality on a per-luminosity-section basis (i.e., per 23 seconds of data taking). Anomalies caused by detector malfunctions or sub-optimal reconstruction are difficult to enumerate a priori and occur rarely, which makes classical supervised classification methods such as feedforward neural networks difficult to use. We base our prototype on a semi-supervised approach that employs deep autoencoders. This approach has been successfully validated on CMS data collected during the 2016 LHC run: we demonstrate its ability to detect anomalies with high accuracy and a low false positive rate when compared against the outcome of manual certification by experts. A key advantage of this approach over other machine learning techniques is the high interpretability of its results, which can be used to attribute the origin of problems in the data to a specific sub-detector or physics object.
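
The semi-supervised workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: a linear autoencoder (the top principal components, fitted with NumPy) stands in for the deep autoencoder, and the data are synthetic. The feature count, the mixing matrix, the injected fault in feature 5, and the 99% threshold quantile are all assumptions chosen for illustration. The sketch fits on good data only, flags samples whose reconstruction error exceeds a threshold, and uses the per-feature error to indicate which input is responsible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for per-luminosity-section features:
# 8 features driven by 2 latent factors plus small noise.
mixing = np.array([
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
])
n_latent, n_features = mixing.shape


def make_good(n):
    """Draw n synthetic 'good' samples (assumed data model)."""
    latent = rng.normal(size=(n, n_latent))
    noise = 0.05 * rng.normal(size=(n, n_features))
    return latent @ mixing + noise


train = make_good(2000)      # only good data is used for fitting
holdout = make_good(500)     # good data used to set the threshold
test_good = make_good(200)
test_bad = make_good(200)
test_bad[:, 5] += 3.0        # simulated fault: feature 5 is shifted

# Linear autoencoder: encoder and decoder share the top principal components.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:n_latent]   # shape (n_latent, n_features)


def reconstruct(x):
    code = (x - mean) @ components.T   # encode to the latent space
    return code @ components + mean    # decode back to feature space


def per_feature_error(x):
    return (x - reconstruct(x)) ** 2


def score(x):
    """Mean squared reconstruction error per sample."""
    return per_feature_error(x).mean(axis=1)


# Threshold chosen as the 99th percentile of scores on held-out good data.
threshold = np.quantile(score(holdout), 0.99)

flag_good = score(test_good) > threshold
flag_bad = score(test_bad) > threshold

# Interpretability: the feature with the largest average error in flagged data.
blamed = int(per_feature_error(test_bad).mean(axis=0).argmax())

print("false positive rate:", flag_good.mean())
print("detection rate:", flag_bad.mean())
print("feature with largest reconstruction error:", blamed)
```

In this sketch the threshold is set from good data alone, so no labeled anomalies are needed at training time. A nonlinear autoencoder would replace `reconstruct` while leaving the scoring and attribution steps unchanged.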

Additional Information

© 2019 The Authors, published by EDP Sciences. This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Published online 17 September 2019. We thank the CMS collaboration for providing the data set used in this study. We are thankful to the members of the CMS Physics Performance and Dataset project for useful discussions, suggestions, and support. We acknowledge the support of the CMS CERN group for providing the computing resources to train our models. This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement no. 772369).

Attached Files

Published - epjconf_chep2018_06008.pdf

Additional details

Created: August 19, 2023
Modified: October 19, 2023