Published 2012
Journal Article Open

Alert Messaging in the CMS Distributed Workflow System

Maxa, Zdenek

Abstract

WMAgent is the core component of the CMS workload management system. One feature of this job-management platform is a configurable messaging system for generating, distributing and processing alerts: short messages that describe alert-worthy information or a pathological condition. In addition to the framework's sub-components running within each WMAgent instance, a stand-alone application collects alerts from all WMAgent instances running across the CMS distributed computing environment. The framework's versatile design also allows it to receive alert messages from other CMS production applications, such as the PhEDEx data transfer manager. We present implementation details of the system, including its Python implementation using ZeroMQ and its CouchDB message storage, together with operational experience and plans for future development. Interoperation with monitoring platforms such as Dashboard or Lemon is also described.
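The generate, distribute and process flow described in the abstract can be illustrated with a minimal sketch. This is not the WMAgent code: it uses only the Python standard library, with an in-memory `queue.Queue` standing in for the ZeroMQ transport and a plain list standing in for CouchDB storage. The names `Alert`, `Sender`, `Collector`, `min_level` and all field names are hypothetical and chosen only for illustration.

```python
import json
import queue
import time
from dataclasses import dataclass, asdict


@dataclass
class Alert:
    """Hypothetical alert record; the field names are illustrative only."""
    source: str       # application or agent instance raising the alert
    component: str    # sub-component that detected the condition
    level: int        # severity, higher means more serious
    message: str      # short human-readable description
    timestamp: float  # creation time, seconds since the epoch


class Sender:
    """Generates alerts and puts them on a channel (stand-in for a ZeroMQ socket)."""

    def __init__(self, channel, source):
        self.channel = channel
        self.source = source

    def send(self, component, level, message):
        alert = Alert(self.source, component, level, message, time.time())
        # Alerts travel as JSON text, so any producer can emit them.
        self.channel.put(json.dumps(asdict(alert)))


class Collector:
    """Receives alerts, filters by severity, and stores them (stand-in for CouchDB)."""

    def __init__(self, channel, min_level=0):
        self.channel = channel
        self.min_level = min_level
        self.stored = []

    def drain(self):
        """Process every alert currently queued; return the total number stored."""
        while True:
            try:
                raw = self.channel.get_nowait()
            except queue.Empty:
                break
            doc = json.loads(raw)
            if doc["level"] >= self.min_level:
                self.stored.append(doc)
        return len(self.stored)
```

In this sketch, several `Sender` objects sharing one channel play the role of the WMAgent instances and other producers, while a single `Collector` plays the role of the stand-alone collecting application. A real deployment would replace the queue with network sockets and the list with a document database.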

Additional Information

© 2013 IOP Publishing Ltd. This work was supported by the US CMS Operations Program funded by the US Department of Energy.

Attached Files

Published - 1742-6596_396_3_032074.pdf

Additional details

Created: August 19, 2023
Modified: October 23, 2023