Large Scale Job Management and Experience in Recent Data Challenges within the LHC CMS experiment
Abstract
From its conception the job management system has been distributed to increase scalability and robustness. The system consists of several applications (called ProdAgents) that manage Monte Carlo, reconstruction, and skimming jobs on collections of sites within different Grid environments (OSG, NorduGrid, LCG) and submission systems such as GlideIn, local batch, etc. Production of simulated data in CMS mainly takes place on so-called Tier2 resources (small to medium size computing centers). Approximately 50% of the CMS Tier2 resources are allocated to running simulation jobs, while the so-called Tier1s (medium to large size computing centers with high capacity tape storage systems) will be mainly used for skimming and reconstructing detector data. During the last one and a half years the job management system has been adapted so that it can be configured to convert Data Acquisition (DAQ) / High Level Trigger (HLT) output from the CMS detector to the CMS data format and to manage the real time data stream from the experiment. Simultaneously the system has been upgraded to accommodate the increasing scale of CMS production and to adapt to the procedures used by its operators. In this paper we discuss the current (high level) architecture of ProdAgent, the experience in using this system in computing challenges, feedback from these challenges, and future work, including migration to a set of core libraries to facilitate convergence between the different data management projects within CMS that deal with analysis, simulation, and initial reconstruction of real data. This migration is important, as it will decrease the code footprint used by these projects and increase maintainability of the code base.
Additional Information
Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence. This work is partly supported by US Department of Energy grant DOE DE-FG02-06ER86271 and US National Science Foundation grant NSF PHY-0533280. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the Department of Energy or NSF.
Attached Files
Published - ACAT08_032.pdf
Files
Checksum | Size |
---|---|
md5:12c19f73f905eb6987f9f6f7e9c5af1e | 616.8 kB |
Additional details
- Eprint ID: 89499
- Resolver ID: CaltechAUTHORS:20180910-131321917
- Department of Energy (DOE): DE-FG02-06ER86271
- NSF: PHY-0533280
- Created: 2018-09-10 (created from EPrint's datestamp field)
- Updated: 2021-11-16 (created from EPrint's last_modified field)
- Series Name: Proceedings of Science
- Series Volume or Issue Number: 070