Content area

Abstract

A classic problem in Computer Science is how to store information dynamically to allow for quick lookup. This searching problem arises quite often, for example, in dictionaries, telephone listings, symbol tables for compilers, and storing a company's business records.

Each package of information is stored in computer memory as a record. There is a special field in each record, called the key, that uniquely identifies it. The job of a searching algorithm is to take an input K and return the record (if any) that has K as its key.

Hashing is one of the most important solutions to the searching problem, because no matter how many records are stored, the average search times remain bounded. Suppose we want to store N records in a contiguous area of memory containing at least M locations. The common element of all hashing algorithms is a pre-defined hash function that assigns the N records to M memory slots in a uniform manner. Hashing algorithms differ from one another in how they resolve a collision that results when a record's assigned memory location is already occupied.

This thesis analyzes the coalesced hashing method, in which a portion of memory (called the address region) serves as the range of the hash function while the rest of memory (called the cellar) is devoted solely to storing records that collide when inserted. If the cellar should get full, subsequent colliders must be stored in empty slots in the address region and, thus, may cause later collisions. Varying the relative size of the cellar affects search performance.

The main result of this thesis expresses the average search times as a function of the number of records and the cellar size, solving the long-standing open problem described in {Knuth, The Art of Computer Programming, Vol. 3, 6.4-43}. We use these formulas to pick the cellar size that leads to optimum search performance and then show that this "tuned" method is competitive with several well-known hashing schemes. We also address other important practical considerations, such as developing good deletion algorithms and efficient implementations, and outline some interesting problems for future research.

Details

Title
ANALYSIS OF COALESCED HASHING
Author
VITTER, JEFFREY SCOTT
Year
1980
Publisher
ProQuest Dissertations & Theses
ISBN
979-8-204-39329-5
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
302983555
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.