A study of a data parallel algorithm for XML DOM parsing

Abstract

XML parsing is a core operation performed on an XML document and can become a performance bottleneck in applications and systems that process large volumes of XML data. Parallelism is a natural way to boost parsing performance, and leveraging multicore processors can offer a cost-effective solution. We study a data parallel algorithm called ParDOM for XML DOM parsing, which builds an in-memory tree structure for an XML document. ParDOM has two phases. In the first phase, an XML document is partitioned and parsed in parallel. In the second phase, the partial DOM node tree structures are linked together (in parallel) to build a complete DOM node tree. ParDOM offers fine-grained parallelism by adopting a flexible chunking scheme, and it can be conveniently implemented using a data parallel programming model that supports map and sort operations. We show that ParDOM yields better scalability than PXP [24], a recently proposed parallel DOM parsing algorithm.
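The two-phase shape described in the abstract (partition and parse in parallel, then link the partial results) can be illustrated with a toy sketch. This is not the ParDOM algorithm from the thesis: the function names (`split_chunks`, `parse_chunk`, `link`, `build_tree`), the `Node` class, and the rule of cutting chunks at the next `<` are all assumptions made for illustration. The sketch ignores text content, comments, CDATA, and processing instructions, and it links the partial results sequentially after a sort, whereas the abstract states that ParDOM performs its linking in parallel.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Matches start, end, and empty-element tags; attributes are skipped, not parsed.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)[^<>]*?(/?)>")


class Node:
    """Minimal stand-in for a DOM element node (hypothetical, not a DOM API)."""

    def __init__(self, name):
        self.name = name
        self.children = []


def split_chunks(text, n):
    """Cut text into roughly n chunks, moving each cut forward to the next '<'.

    Assumption: '<' appears only at the start of a tag, so no tag is split.
    Returns a list of (offset, chunk_text) pairs.
    """
    size = max(1, len(text) // n)
    cuts = [0]
    while cuts[-1] < len(text):
        nxt = text.find("<", cuts[-1] + size)
        cuts.append(len(text) if nxt == -1 else nxt)
    return [(cuts[i], text[cuts[i]:cuts[i + 1]]) for i in range(len(cuts) - 1)]


def parse_chunk(chunk):
    """Phase 1 (map): turn one chunk into (position, kind, name) tag events."""
    offset, text = chunk
    events = []
    for m in TAG.finditer(text):
        closing, name, self_closing = m.groups()
        kind = "end" if closing else ("empty" if self_closing else "start")
        events.append((offset + m.start(), kind, name))
    return events


def link(event_lists):
    """Phase 2: sort all events by document position and link them into a tree.

    Sequential here for brevity; the thesis describes a parallel linking phase.
    """
    events = sorted(e for lst in event_lists for e in lst)
    root = Node("#document")
    stack = [root]
    for _, kind, name in events:
        if kind == "end":
            if len(stack) == 1 or stack[-1].name != name:
                raise ValueError(f"unexpected end tag </{name}>")
            stack.pop()
        else:
            node = Node(name)
            stack[-1].children.append(node)
            if kind == "start":
                stack.append(node)
    if len(stack) != 1:
        raise ValueError(f"unclosed element <{stack[-1].name}>")
    return root


def build_tree(text, workers=4):
    """Run both phases. Threads only illustrate the structure; CPython's GIL
    limits real speedup for this pure-Python work."""
    chunks = split_chunks(text, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        event_lists = list(pool.map(parse_chunk, chunks))
    return link(event_lists)


def to_tuple(node):
    """Convert a Node tree to nested tuples for easy comparison."""
    return (node.name, [to_tuple(c) for c in node.children])
```

Calling `to_tuple(build_tree('<a><b x="1"/><c><d></d></c></a>', workers=3))` yields a nested structure with `a` under the synthetic `#document` root, `b` and `c` as children of `a`, and `d` as the child of `c`. Because each event carries its absolute document position, the chunks can be processed in any order and recombined with a single sort, which is one way a map-and-sort programming model could fit this kind of problem.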

Details

Title

A study of a data parallel algorithm for XML DOM parsing

Author

Shah, Bhavik Bharatkumar

Year

2009

Publisher

ProQuest Dissertations & Theses

ISBN

978-1-109-61969-0

Source type

Dissertation or Thesis

Language of publication

English

ProQuest document ID

304944228

Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
