Snapshot Processing in Streaming Environments
- Creators
- Zimmerman, Daniel M.
- Chandy, K. Mani
Abstract
Computational issues related to streaming data, and in particular the monitoring and rapid correlation of multiple sources of streaming data, are becoming increasingly important in contexts ranging from business processes to crisis detection. For example, a government system to detect bioterror attacks must correlate multiple streams of possibly low-confidence data from sensors and local and national public health information networks with cues from indicators such as news and government sources indicating geographical locations, tactics and timing of possible attacks. The results of this correlation trigger appropriate responses, such as flagging information for more in-depth analysis or sending alerts to public health officials. Monitoring and correlation applications of this type are ideal for deployment on distributed computing grids, because they have high transaction throughput, require low latency, and can be partitioned into sets of small communicating computations with regular communication patterns. An important consideration in these applications is the need to ensure that, at any given time, computations are carried out on an accurate - or at least close to accurate - picture of the environment being monitored. One way of doing this, which we call snapshot processing, is to treat collections of events that occur at approximately the same time as representing a global snapshot - a valid state - of the environment. Computation on the resulting series of snapshots is much like computation on a real-time video of the entire environment. We briefly describe our model for these stream processing computations and introduce the concept of snapshot processing.
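The abstract does not spell out the authors' model, so the following is only a minimal sketch of the general idea of grouping events that occur at approximately the same time into snapshots. It assumes fixed-width time windows and timestamps that are comparable across sources. The names `Event`, `build_snapshots`, and `window` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Event(NamedTuple):
    source: str       # hypothetical stream identifier, e.g. "sensor" or "news"
    timestamp: float  # seconds; assumed comparable across all sources
    value: float      # the reading carried by the event


def build_snapshots(events: Iterable[Event], window: float) -> List[Dict[str, float]]:
    """Group events into fixed-width time windows (an assumption of this sketch).

    Each window yields one snapshot mapping source -> latest value seen in
    that window. Snapshots are returned in time order; empty windows are skipped.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    buckets: Dict[int, Dict[str, float]] = defaultdict(dict)
    # Process in timestamp order so a later event from the same source
    # overwrites an earlier one within the same window.
    for ev in sorted(events, key=lambda e: e.timestamp):
        index = int(ev.timestamp // window)
        buckets[index][ev.source] = ev.value
    return [buckets[i] for i in sorted(buckets)]


if __name__ == "__main__":
    stream = [
        Event("sensor", 0.1, 1.0),
        Event("news", 0.4, 2.0),
        Event("sensor", 1.2, 3.0),
    ]
    for snapshot in build_snapshots(stream, window=1.0):
        print(snapshot)
```

A downstream computation would then run once per snapshot instead of once per event, which is the sense in which the series resembles frames of a video. Fixed windows are the simplest choice; a real system would also need a policy for late or missing events, which this sketch leaves out.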
Additional Information
© 2006 IEEE. The research described here has been supported in part by the National Science Foundation under grant CCR-0312778, ITR: Information Infrastructures for Crisis Management, and by the Lee Center for Advanced Networking at Caltech.
Attached Files
Published - Zimmerman2006p91802006_7Th_IeeeAcm_International_Conference_On_Grid_Computing.pdf
Files
Name | Size |
---|---|
Zimmerman2006p91802006_7Th_IeeeAcm_International_Conference_On_Grid_Computing.pdf (md5:ce0ec76a984175c36b07dbc33cc5a72e) | 60.3 kB |
Additional details
- Eprint ID
- 22404
- Resolver ID
- CaltechAUTHORS:20110222-093942206
- NSF
- CCR-0312778
- Information Infrastructures for Crisis Management (ITR)
- Caltech Lee Center for Advanced Networking
- Created
- 2011-02-22 (created from EPrint's datestamp field)
- Updated
- 2023-10-23 (created from EPrint's last_modified field)