Published November 2008
Book Section - Chapter Open

Large Scale Job Management and Experience in Recent Data Challenges within the LHC CMS experiment

Abstract

From its conception, the job management system has been distributed to increase scalability and robustness. The system consists of several applications (called ProdAgents) that manage Monte Carlo, reconstruction, and skimming jobs on collections of sites within different Grid environments (OSG, NorduGrid, LCG) and through submission systems such as GlideIn and local batch. Production of simulated data in CMS mainly takes place on the resources of so-called Tier2s (small to medium size computing centers). Approximately 50% of the CMS Tier2 resources are allocated to running simulation jobs, while the so-called Tier1s (medium to large size computing centers with high-capacity tape storage systems) will be used mainly for skimming and reconstructing detector data. During the last one and a half years the job management system has been adapted so that it can be configured to convert Data Acquisition (DAQ) / High Level Trigger (HLT) output from the CMS detector to the CMS data format and to manage the real-time data stream from the experiment. Simultaneously, the system has been upgraded to support the increasing scale of CMS production and to adapt to the procedures used by its operators. In this paper we discuss the current (high-level) architecture of ProdAgent, the experience in using this system in computing challenges, feedback from these challenges, and future work, including migration to a set of core libraries to facilitate convergence between the different data management projects within CMS that deal with analysis, simulation, and initial reconstruction of real data. This migration is important, as it will decrease the code footprint used by these projects and increase the maintainability of the code base.

Additional Information

Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike Licence. This work is partly supported by US Department of Energy grant DOE DE-FG02-06ER86271 and US National Science Foundation grant NSF PHY-0533280. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Department of Energy or NSF.

Attached Files

Published - ACAT08_032.pdf (616.8 kB, md5:12c19f73f905eb6987f9f6f7e9c5af1e)

Additional details

Created: August 19, 2023
Modified: January 14, 2024