
Abstract

Video data is abundant: thousands of hours of video are uploaded to YouTube every minute, and millions of surveillance cameras collect video data every second. Mobile devices have also enabled mass video capture, resulting in more content than humans can feasibly consume. Consequently, significant research across various communities is dedicated to developing techniques for analyzing and understanding video data. Numerous applications benefit from advanced video processing techniques, including video surveillance, monitoring, news production, and autonomous driving.

Recent advances in video processing using deep learning primitives have led to breakthroughs in fundamental video analysis tasks such as frame classification, object detection, and tracking, unlocking a wide array of new applications. Despite these advances, developing applications that leverage real-world video streams remains challenging due to the massive data sizes and the complexity of managing large numbers of camera feeds.

In this thesis, we first propose an interactive declarative query processing framework capable of processing large video streams. In particular, we introduce a set of approximate filters to speed up queries that involve objects of specific types (e.g., cars and trucks) on video frames, together with spatial relationships among them (e.g., car left of truck). The filters quickly assess whether the query predicates hold on a frame: if they do, the frame proceeds to further analysis; otherwise, the frame is discarded, avoiding costly object detection operations. We propose two classes of filters, Image Classification (IC) and Object Detection (OD), that adapt principles from deep image classification and object detection. The filters utilize extensible deep neural architectures and are easy to deploy and use. In addition, we propose statistical query processing techniques to process aggregate queries involving objects with spatial constraints on video streams. Finally, we introduce an algorithm based on extreme value theory to detect unexpected objects on video streams.

Then, we extend the proposed framework to explore declarative queries for real-time video streams involving objects and their interactions. We seek to efficiently identify frames in which an object is interacting with another in a specific way. We propose the Progressive Filters (PF) algorithm, which deploys a sequence of inexpensive but less accurate filters to detect the presence of query-specified objects on frames. We demonstrate that PF derives a least-cost sequence of filters given the query objects’ current selectivities. Since selectivities may vary as the video evolves, we present a statistical test to determine when to trigger re-optimization of the filters. Finally, we present Interaction Sheave, a filtering approach that uses learned spatial information about objects and interactions to prune frames that are unlikely to involve the query-specified action between them, thus improving the frame processing rate.
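As an illustration of what a least-cost filter sequence can look like, the sketch below uses the classical ordering rule for independent filters: sort by cost divided by drop rate. This is not the PF algorithm from the thesis. It assumes that filters reject frames independently, that a frame is dropped as soon as any filter rejects it, and that per-filter costs and pass rates are already known. The names `Filter`, `expected_cost`, and `least_cost_order` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Filter:
    name: str
    cost: float       # assumed average cost per frame (e.g., milliseconds)
    pass_rate: float  # assumed selectivity: fraction of frames that pass


def expected_cost(order):
    """Expected per-frame cost when filters run in sequence and a frame
    stops at the first filter that rejects it (independence assumed)."""
    total, reach_probability = 0.0, 1.0
    for f in order:
        total += reach_probability * f.cost
        reach_probability *= f.pass_rate
    return total


def least_cost_order(filters):
    """Sort by cost / (1 - pass_rate), ascending. Under the independence
    assumption this minimizes expected_cost."""
    def rank(f):
        drop_rate = 1.0 - f.pass_rate
        return f.cost / drop_rate if drop_rate > 0 else math.inf
    return sorted(filters, key=rank)


filters = [
    Filter("truck", cost=1.0, pass_rate=0.9),
    Filter("person", cost=4.0, pass_rate=0.1),
    Filter("car", cost=2.0, pass_rate=0.5),
]
order = least_cost_order(filters)
```

When selectivities change as the video evolves, calling `least_cost_order` again with updated pass rates yields a new sequence; deciding when such a re-optimization is worthwhile is the role of the statistical test mentioned above.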

Finally, we consider the problem of detecting and recovering from data drift in video streams. We present lightweight algorithms that monitor a video stream and detect when the underlying data distribution has changed. Our proposal is based on conformal martingales, which efficiently build an understanding of the current video stream and detect changes in it. We present the Drift Inspector Algorithm (DI), which uses such martingales to detect changes in the video distribution. We then propose two algorithms, Model Selection Based on Output (MSBO) and Model Selection Based on Input (MSBI), that efficiently select new models to continue processing the video stream when the distribution has changed, or train new models when none of the available models is suitable for processing the incoming video frames.
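To make the idea of martingale-based drift detection concrete, here is a minimal sketch and not the DI algorithm itself. It assumes each frame has already been reduced to a scalar nonconformity score (higher means more unusual) and that a fixed calibration set of in-distribution scores is available. It computes conservative conformal p-values, applies the power betting function epsilon * p^(epsilon - 1), and accumulates the log bets with a CUSUM-style reset at zero, raising an alarm once the statistic reaches a threshold. The function names, `epsilon`, and `threshold` are illustrative choices.

```python
import math


def conformal_p_value(calibration_scores, score):
    """Conservative conformal p-value of `score` against a fixed
    calibration set; small values mean the score looks unusual."""
    at_least_as_extreme = sum(1 for c in calibration_scores if c >= score)
    return (at_least_as_extreme + 1) / (len(calibration_scores) + 1)


def detect_drift(calibration_scores, stream_scores, epsilon=0.1, threshold=100.0):
    """Return the 0-based index of the first frame at which drift is
    flagged, or None if the statistic never reaches the threshold."""
    log_threshold = math.log(threshold)
    log_statistic = 0.0
    for index, score in enumerate(stream_scores):
        p = conformal_p_value(calibration_scores, score)
        # Log of the power bet epsilon * p**(epsilon - 1); p is never 0.
        log_statistic += math.log(epsilon) + (epsilon - 1.0) * math.log(p)
        # CUSUM-style reset (an assumption of this sketch) so that a long
        # drift-free prefix does not delay detection.
        log_statistic = max(0.0, log_statistic)
        if log_statistic >= log_threshold:
            return index
    return None
```

The statistic grows only while p-values stay small for several consecutive frames, so isolated unusual frames are absorbed by the reset while a sustained shift triggers an alarm within a few frames.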

Via a thorough experimental evaluation, we showcase the efficacy and effectiveness of our proposed methods on real video streams.

Details

1010268
Business indexing term
Title
Streaming Video Queries
Author
Number of pages
141
Publication year
2025
Degree date
2025
School code
0779
Source
DAI-A 87/1(E), Dissertation Abstracts International
ISBN
9798290911038
Advisor
Committee member
Marbach, Peter; Shkurti, Florian; Dayan, Niv; Li, Chen
University/institution
University of Toronto (Canada)
Department
Computer Science
University location
Canada -- Ontario, CA
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
31842153
ProQuest document ID
3234789378
Document URL
https://www.proquest.com/dissertations-theses/streaming-video-queries/docview/3234789378/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic