Published April 2018 | public
Journal Article

Datum: Managing Data Purchasing and Data Placement in a Geo-Distributed Data Market

Abstract

This paper studies two design tasks faced by a geo-distributed cloud data market: which data to purchase (data purchasing) and where to place/replicate the data for delivery (data placement). We show that the joint problem of data purchasing and data placement within a cloud data market can be viewed as a facility location problem and is thus NP-hard. However, we give a provably optimal algorithm for the case of a data market made up of a single data center and then generalize the structure from the single data center setting in order to develop a near-optimal, polynomial-time algorithm for a geo-distributed data market. The resulting design, Datum, decomposes the joint purchasing and placement problem into two subproblems, one for data purchasing and one for data placement, using a transformation of the underlying bandwidth costs. We show, via a case study, that Datum is near optimal (within 1.6%) in practical settings.
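To make the facility location view concrete, the sketch below solves a tiny instance of the generic uncapacitated facility location problem by exhaustive search. It is only an illustration of the problem class named in the abstract, not Datum's algorithm or the paper's cost model. The function name, the cost arrays, and all numbers are hypothetical. Roughly, "opening a facility" stands in for purchasing or placing data at a site, and "serving a client" stands in for delivering it. The search enumerates every subset of facilities, so it only works for very small instances, which is consistent with the problem being NP-hard.

```python
from itertools import combinations


def solve_facility_location(open_cost, serve_cost):
    """Brute-force uncapacitated facility location (illustrative only).

    open_cost[i]     -- fixed cost of opening facility i (hypothetical)
    serve_cost[i][j] -- cost of serving client j from facility i (hypothetical)
    Returns (best_total_cost, tuple_of_open_facility_indices).
    """
    n_facilities = len(open_cost)
    n_clients = len(serve_cost[0])
    best = (float("inf"), ())
    # Enumerate every non-empty subset of facilities: 2^n - 1 candidates.
    for k in range(1, n_facilities + 1):
        for opened in combinations(range(n_facilities), k):
            fixed = sum(open_cost[i] for i in opened)
            # Each client is served by its cheapest open facility.
            service = sum(
                min(serve_cost[i][j] for i in opened) for j in range(n_clients)
            )
            best = min(best, (fixed + service, opened))
    return best


# Hypothetical instance: three candidate sites, three clients.
open_cost = [3, 3, 6]
serve_cost = [
    [1, 5, 3],
    [5, 1, 3],
    [6, 5, 1],
]
best_cost, best_open = solve_facility_location(open_cost, serve_cost)
print(best_cost, best_open)
```

In this instance, opening the third site is too expensive to justify the cheaper service it would provide to one client, so the search keeps only the first two. That trade-off between fixed and per-delivery costs is the kind of coupling that makes the joint problem hard.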

Additional Information

© 2018 IEEE. Manuscript received April 13, 2017; revised October 14, 2017 and February 2, 2018; accepted February 3, 2018; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor M. Mellia. This work was supported in part by the National Science Foundation under Grant 1254169, Grant 1518941, Grant 1331343, and Grant 1637598, in part by the National Science Foundation Graduate Fellowship, and in part by the Resnick Sustainability Institute Fellowship.

Additional details

Created: August 19, 2023
Modified: October 18, 2023