Published 2016 | public
Book Section - Chapter

Monitoring and control of large-scale distributed systems

Abstract

An important part of managing large-scale distributed computing systems is a monitoring service that can track, in real time, many site facilities, networks, and tasks in progress. The monitoring information gathered is essential for developing the required higher-level services (the components that provide decision support and some degree of automated decision-making) and for maintaining and optimizing workflow in large-scale distributed systems. Our strategy for meeting the demands of data-intensive applications was to move toward more synergetic relationships among the applications, the computing and storage facilities, and the network infrastructure. The associated orchestration and global optimization functions are performed by higher-level agent-based services that collaborate and cooperate in performing a wide range of distributed information-gathering and processing tasks.

Additional Information

© 2016 IOS Press.

Additional details

Created: August 20, 2023
Modified: January 13, 2024