Abstract

Earth-based science research has advanced significantly in recent years, due in part to advances in sensor technology that have made sensors cheaper, more reliable, and hence plentiful. As the volume of real-time data grows exponentially, analyzing every new bit of data becomes computationally impractical, yet it remains critical to identify and process events of importance. Scientific workflows are a widely accepted programming model that lets scientists model their problems by concentrating on the science rather than on computer science details. Many scientific applications are amenable to representation as scientific workflows yet also need to incorporate external, continuous event streams. Today's scientific workflows generally assume static or non-real-time data and do not adequately capture continuous behavior.

This dissertation presents a hybrid programming model that combines scientific workflows with declarative, query-based event processing. The model enables data mining of high-volume data streams and facilitates setting up gateways with much-needed features such as triggered computing, alert systems, and real-time analysis. Contributions of this thesis include a programming abstraction that preserves the simplicity and user friendliness of scientific workflows while making event streams first-class citizens in the programming model through defined streaming semantics. These semantics preserve the high-level computational graph while allowing processing to pass between the workflow system and the declarative event processing system as necessary. The thesis further proposes algorithms to identify and partition control flow sub-graphs in an event processing graph, with the objective of matching the run-time characteristics of each sub-graph with the appropriate quality of service for execution. It argues that different phases of the computation have different run-time quality of service requirements, ranging from high-throughput event processing to computationally intensive HPC applications. Finally, the thesis shows how the proposed programming model addresses the different run-time aspects of event processing applications.

Details

1010268
Classification
Title
Programming abstraction for resource aware stream processing for scientific workflows
Number of pages
167
Degree date
2011
School code
0093
Source
DAI-B 73/04, Dissertation Abstracts International
ISBN
978-1-267-07840-7
Advisor
Committee member
Fox, Geoffrey C.; Gannon, Dennis; Plale, Beth; Sabry, Amr
University/institution
Indiana University
Department
Computer Sciences
University location
United States -- Indiana
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
3488071
ProQuest document ID
914706608
Document URL
https://www.proquest.com/dissertations-theses/programming-abstraction-resource-aware-stream/docview/914706608/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic