Analysis and optimization for processing grid-scale XML datasets

Abstract

In scientific computing, two trends are clear: data sets are growing rapidly, and microprocessor performance is improving through increased parallelism rather than higher clock rates. Further, Extensible Markup Language (XML) is increasingly used to encode large data sets, and SOAP is increasingly used to provide Grid services. Neither standard was designed for these uses, and naïve implementations can incur performance penalties. As these trends continue, past assumptions about the value of seeking out parallel algorithms should be revisited.

Lexical analysis has traditionally been seen as an inherently serial process. This work seeks to challenge that viewpoint. We start by tracking the performance of state-of-the-art XML parsers and SOAP toolkits through benchmarks for scientific computing applications. We then examine how the caching mechanisms of current workstation- and server-class computer systems affect parser performance. Finally, we propose Piximal, an NFA-based parser that uses spare processors to reduce XML parse time, and we examine the limits of the Piximal approach to parallel XML parsing.

Details

Title

Analysis and optimization for processing grid-scale XML datasets

Author

Head, Michael Reuben

Year

2009

Publisher

ProQuest Dissertations & Theses

ISBN

978-1-109-56418-1

Source type

Dissertation or Thesis

Language of publication

English

ProQuest document ID

305108571

Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
