Abstract

The dramatic increase in the number of transistors available on a single chip, together with the diminishing returns from enhancing a single processor, has ushered in new paradigms in computing. Among recent trends in processor architecture, chip multiprocessors (CMP) and simultaneous multithreading (SMT) are the two most promising approaches. A CMP instantiates multiple processor cores on a single die, while a single-core SMT processor allows instructions from multiple threads to be fetched and executed simultaneously in a shared pipeline. Both architectures boost chip throughput. However, several factors still prevent them from reaching their potential performance and degrade the reliability of the systems.

First, the rapidly widening gap between the speeds of off-chip memory and on-chip cores makes data access extremely expensive as measured in processor clock cycles, which leads to prolonged pipeline stalls and resource under-utilization. Second, limited instruction-level parallelism (ILP) not only results in poor single-thread performance, but also hampers chip throughput by clogging shared resources, which in turn hurts thread-level parallelism (TLP). Third, contention among threads competing for shared resources, such as shared caches and the issue windows in SMT, further degrades achievable system performance considerably. Fourth, the effects of technology scaling, such as shrinking transistor sizes, low operating voltages, and narrowing noise margins, are making on-chip components more susceptible to faults.

This research proposes on-chip components that enhance the performance and/or dependability of SMT and CMP systems. These components are shared among all on-chip threads and aim to relieve the performance bottlenecks above and improve the fault tolerance of the systems. We first examine the issue queue (IQ), which has a tight limit on its physical size and faces severe pressure from the memory wall effect in SMT/CMP systems, and we propose techniques for more efficient and fair usage of the IQ in these systems. We then propose a cooperative prefetcher, common to all threads/cores, that issues accurate and timely prefetches to achieve both faster data access and reasonable usage of off-chip bandwidth. In addition, our study of fault tolerance in CMP reveals that shared on-chip hardware components, when susceptible to faults, are critical to the entire system: faults in such common components can contaminate the whole system yet are hard to detect with existing schemes. Our proposed shared hardware logic can dynamically detect such errors with minimal impact on overall performance.

Details

Title
Conjoint component designs for high performance dependable single chip multithreading systems
Author
Wang, Hui
Year
2007
Publisher
ProQuest Dissertations & Theses
ISBN
978-0-549-35106-1
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
304763539
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.