A VLSI Architecture for Concurrent Data Structures

Creators: Dally, William J.

Abstract

Concurrent data structures simplify the development of concurrent programs by encapsulating commonly used mechanisms for synchronization and communication into data structures. This thesis develops a notation for describing concurrent data structures, presents examples of concurrent data structures, and describes an architecture to support concurrent data structures.

Concurrent Smalltalk (CST), a derivative of Smalltalk-80 with extensions for concurrency, is developed to describe concurrent data structures. CST allows the programmer to specify objects that are distributed over the nodes of a concurrent computer. These distributed objects have many constituent objects and thus can process many messages simultaneously. They are the foundation upon which concurrent data structures are built.

The balanced cube is a concurrent data structure for ordered sets. The set is distributed by a balanced recursive partition that maps to the subcubes of a binary n-cube using a Gray code. A search algorithm, VW search, based on the distance properties of the Gray code, searches a balanced cube in O(log N) time. Because it does not have the root bottleneck that limits all tree-based data structures to O(1) concurrency, the balanced cube achieves O(N/log N) concurrency.

Considering graphs as concurrent data structures, graph algorithms are presented for the shortest path problem, the max-flow problem, and graph partitioning. These algorithms introduce new synchronization techniques to achieve better performance than existing algorithms.

A message-passing, concurrent architecture is developed that exploits the characteristics of VLSI technology to support concurrent data structures. Interconnection topologies are compared on the basis of dimension. It is shown that minimum latency is achieved with a very low-dimensional network. A deadlock-free routing strategy is developed for this class of networks, and a prototype VLSI chip implementing this strategy is described.

A message-driven processor complements the network by responding to messages with a very low latency. The processor directly executes messages, eliminating a level of interpretation. To take advantage of the performance offered by specialization while at the same time retaining flexibility, processing elements can be specialized to operate on a single class of objects. These object experts accelerate the performance of all applications using this class.
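
The abstract's remark that the balanced cube maps an ordered set onto a binary n-cube "using a Gray code" can be illustrated with a small sketch. This is not code from the thesis: it assumes the standard binary-reflected Gray code (the thesis may use a different variant), and the names `gray`, `hamming`, `n`, and `nodes` are hypothetical. It only shows the distance property that makes such a mapping attractive: consecutive positions in the ordering land on nodes that differ in one address bit, that is, on directly connected nodes of the cube.

```python
def gray(i: int) -> int:
    """Binary-reflected Gray code of i (assumed variant, for illustration)."""
    return i ^ (i >> 1)


def hamming(a: int, b: int) -> int:
    """Number of differing address bits, i.e. hops between two n-cube nodes."""
    return bin(a ^ b).count("1")


n = 3  # cube dimension, chosen arbitrarily; the cube has 2**n nodes
nodes = [gray(i) for i in range(2 ** n)]  # node address for each position

# Neighbouring positions in the ordered set sit on adjacent cube nodes.
assert all(hamming(nodes[i], nodes[i + 1]) == 1 for i in range(len(nodes) - 1))
```

Under these assumptions, a step from one position to the next in the ordering costs a single hop in the network, whereas a plain binary numbering could require up to `n` hops for the same step.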

Files

Name	MD5	Size
5209-TR-86.pdf	c6247d1044784e5bbcb6b72a794a351a	10.2 MB
