Abstract

Database management systems (DBMS) with high throughput and 24-hour availability have been in great demand in the marketplace in recent years. A parallel DBMS can potentially meet the high throughput demands. The 24-hour availability requirement prevents the parallel DBMS from being taken off-line even for basic maintenance operations such as reorganization. However, on-line reorganization can also decrease throughput (by competing for resources) and availability (by holding the data for a long duration and making it unavailable for transactions). Therefore, on-line reorganization techniques should be designed to minimize the decrease in throughput and availability. Data placement reorganization is a type of reorganization unique to parallel DBMS. Data placement is critical for a parallel DBMS because it is a principal determinant of the throughput of the system. In this dissertation, we study a number of issues in on-line data placement reorganization.

First, we propose a new method to determine the degree of allocation of relations in a parallel database system. A change in the degree of allocation can trigger a data placement reorganization. We show that significant performance improvements can be achieved by performing the data placement reorganization even though it is expensive.

Second, we examine a core design issue, namely the choice of a data placement strategy. We consider a number of data placement strategies and conclude that they perform equally well in balancing the load; however, their reorganization costs differ significantly. The importance of this result lies in the fact that design decisions also contribute significantly to the cost of data placement reorganization.
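
The point that strategies can balance load equally while differing in reorganization cost can be illustrated with a minimal sketch. The two strategies below (modulo placement and contiguous range placement) and all function names are assumptions chosen for illustration; the abstract does not name the strategies the dissertation evaluates. The sketch counts how many tuples change nodes when the degree of allocation grows from 3 to 4.

```python
from collections import Counter

def modulo_placement(key, num_nodes):
    # Hypothetical strategy: assign a tuple to node (key mod num_nodes).
    return key % num_nodes

def range_placement(key, num_nodes, num_keys):
    # Hypothetical strategy: split the key space into equal contiguous ranges.
    return key * num_nodes // num_keys

def load_per_node(place, keys, num_nodes):
    # Number of tuples stored on each node under a placement function.
    counts = Counter(place(k, num_nodes) for k in keys)
    return [counts[n] for n in range(num_nodes)]

def tuples_moved(place, keys, old_nodes, new_nodes):
    # Tuples whose node changes when the degree of allocation changes.
    return sum(1 for k in keys if place(k, old_nodes) != place(k, new_nodes))

NUM_KEYS = 1200
keys = range(NUM_KEYS)
strategies = {
    "modulo": modulo_placement,
    "range": lambda k, n: range_placement(k, n, NUM_KEYS),
}

for name, place in strategies.items():
    # Print the per-node load at degree 4 and the tuples moved going from 3 to 4.
    print(name, load_per_node(place, keys, 4), tuples_moved(place, keys, 3, 4))
```

In this toy setting both strategies spread the 1200 keys evenly over four nodes, yet they relocate different numbers of tuples when the fourth node is added. It is only a model of the trade-off, not a reproduction of the dissertation's experiments.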

The next issue we examine is index modification during data placement reorganization. We identify two classes of techniques with which data can be moved and indexes modified. The class OAT (One-page-At-a-Time transfer) requires very little extra disk space but can take significantly longer. The class BULK (BULK transfer of entire data) needs large quantities of extra disk space but is very fast. We compare the best methods of both classes and find that BULK provides better performance for transactions during reorganization in most situations.

Details

Title
On-line tuning of data placement in parallel databases
Number of pages
127
Degree date
1995
School code
0078
Source
DAI-B 57/01, Dissertation Abstracts International
ISBN
979-8-208-95646-5
University/institution
Georgia Institute of Technology
University location
United States -- Georgia
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
9614079
ProQuest document ID
304203802
Document URL
https://www.proquest.com/dissertations-theses/on-line-tuning-data-placement-parallel-databases/docview/304203802/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic