Distributed Kd-Trees for Ultra Large Scale Object Recognition

Creators: Aly, Mohamed; Munich, Mario; Perona, Pietro

Others: Hoey, Jesse; McKenna, Stephen; Trucco, Emanuele

Abstract

Distributed Kd-Trees is a method for building image retrieval systems that can handle hundreds of millions of images. It is based on dividing the Kd-Tree into a "root subtree" that resides on a root machine, and several "leaf subtrees", each residing on a leaf machine. The root machine handles incoming queries and farms out feature matching to an appropriate small subset of the leaf machines. Our implementation employs the MapReduce architecture to efficiently build and distribute the Kd-Tree for millions of images. It can run on thousands of machines, and provides orders of magnitude more throughput than the state-of-the-art, with better recognition performance. We show experiments with up to 100 million images running on 2048 machines, with run time of a fraction of a second for each query image.
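The root/leaf split described in the abstract can be illustrated with a small single-process sketch. This is not the authors' implementation: the function names (`build_root`, `route`, `leaf_search`, `query`) are hypothetical, each "leaf machine" is just an in-memory list searched by brute force rather than a real leaf subtree, the MapReduce build step is not modeled, and each query is routed to exactly one leaf machine with no backtracking, whereas the abstract speaks of a small subset of leaf machines.

```python
def build_root(points, depth, max_depth, machines):
    """Build the root subtree; cells at max_depth are handed to leaf machines.

    `machines` is a list that simulates the leaf machines: each entry holds
    the points assigned to one machine (assumption: brute-force storage).
    """
    dim = depth % len(points[0])                      # cycle through dimensions
    split = sorted(p[dim] for p in points)[len(points) // 2]  # median value
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    if depth == max_depth or not left:                # stop splitting here
        machines.append(points)
        return ("leaf", len(machines) - 1)
    return ("node", dim, split,
            build_root(left, depth + 1, max_depth, machines),
            build_root(right, depth + 1, max_depth, machines))


def route(root, q):
    """Root machine: walk the root subtree to pick one leaf machine index."""
    node = root
    while node[0] == "node":
        _, dim, split, left, right = node
        node = left if q[dim] < split else right
    return node[1]


def leaf_search(points, q):
    """Leaf machine: nearest stored point by squared Euclidean distance."""
    return min(points, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, q)))


def query(root, machines, q):
    """Route q at the root, then match features on the chosen leaf machine."""
    m = route(root, q)
    return m, leaf_search(machines[m], q)


if __name__ == "__main__":
    machines = []
    grid = [(x, y) for x in range(8) for y in range(8)]  # toy 2-D "features"
    root = build_root(grid, 0, 2, machines)
    print(len(machines), query(root, machines, (2.2, 5.9)))
```

In this toy setup the root subtree has depth 2, so the 64 grid points are spread over four simulated leaf machines. Because only one machine is consulted per query, a query near a cell boundary can miss its true nearest neighbor; consulting several candidate machines would reduce that risk at the cost of more work per query.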

Additional Information

© 2011. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms. This work was supported by ONR grant #N00173-09-C-4005 and was implemented during an internship at Google Inc. The implementation of distributed Kd-Trees is pending a US patent GP-2478-00-US [4]. We would like to thank Ulrich Buddemeier and Alessandro Bissacco for allowing us to use their implementation. We would also like to thank James Philbin, Hartwig Adam, and Hartmut Neven for their valuable help.

Attached Files

Published - paper40.pdf (462.7 kB, md5:a83cbfaaecd743bd3c6f96793a05c60f)
