Graph Unfolding and Sampling for Transitory Video Summarization via Gershgorin Disc Alignment

Abstract

User-generated videos (UGVs) uploaded from mobile phones to social media sites like YouTube and TikTok are short and non-repetitive. We summarize a transitory UGV into several keyframes in linear time via fast graph sampling based on Gershgorin disc alignment (GDA). Specifically, we first model a sequence of \(N\) frames in a UGV as an \(M\)-hop path graph \(\mathcal{G}^o\) for \(M \ll N\), where the similarity between two frames within \(M\) time instants is encoded as a positive edge weight computed from their features. For efficient sampling, we then "unfold" \(\mathcal{G}^o\) to a \(1\)-hop path graph \(\mathcal{G}\), specified by a generalized graph Laplacian matrix \(\mathcal{L}\), via one of two graph unfolding procedures with provable performance bounds. We show that maximizing the smallest eigenvalue \(\lambda_{\min}(\mathbf{B})\) of a coefficient matrix \(\mathbf{B} = \mathrm{diag}(\mathbf{h}) + \mu \mathcal{L}\), where \(\mathbf{h}\) is the binary keyframe selection vector, is equivalent to minimizing a worst-case signal reconstruction error. Instead, we maximize the Gershgorin circle theorem (GCT) lower bound \(\lambda^-_{\min}(\mathbf{B})\) by choosing \(\mathbf{h}\) via a new fast graph sampling algorithm that iteratively aligns the left-ends of the Gershgorin discs of all graph nodes (frames). Extensive experiments on multiple short video datasets show that our algorithm achieves video summarization performance comparable to or better than state-of-the-art methods, at substantially reduced complexity.
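
The relationship between \(\lambda_{\min}(\mathbf{B})\) and its GCT lower bound can be illustrated on a toy example. The sketch below is not the paper's GDA sampling algorithm. It only builds \(\mathbf{B} = \mathrm{diag}(\mathbf{h}) + \mu \mathcal{L}\) for a 3-frame 1-hop path graph and compares the smallest eigenvalue with the smallest Gershgorin disc left-end, before and after a diagonal similarity transform. Everything not stated in the abstract is an assumption made for illustration: a plain combinatorial Laplacian stands in for the generalized graph Laplacian, the edge weights, \(\mu\), and \(\mathbf{h}\) are arbitrary, the scaling vector `s` is hand-picked for this one case, and all function names are hypothetical.

```python
import math


def path_laplacian(weights):
    """Combinatorial Laplacian (list of rows) of a 1-hop path graph."""
    n = len(weights) + 1
    L = [[0.0] * n for _ in range(n)]
    for i, w in enumerate(weights):
        L[i][i] += w
        L[i + 1][i + 1] += w
        L[i][i + 1] -= w
        L[i + 1][i] -= w
    return L


def coefficient_matrix(h, L, mu):
    """B = diag(h) + mu * L."""
    n = len(h)
    return [[mu * L[i][j] + (h[i] if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def gershgorin_lower_bound(A):
    """Smallest disc left-end: min_i (A_ii - sum_{j != i} |A_ij|)."""
    return min(row[i] - sum(abs(a) for j, a in enumerate(row) if j != i)
               for i, row in enumerate(A))


def scale(A, s):
    """Diagonal similarity transform S A S^{-1} with S = diag(s)."""
    n = len(A)
    return [[s[i] * A[i][j] / s[j] for j in range(n)] for i in range(n)]


def smallest_eigenvalue(A, iters=2000):
    """lambda_min of a symmetric matrix via power iteration on c*I - A."""
    n = len(A)
    c = max(sum(abs(a) for a in row) for row in A)  # c >= lambda_max by GCT
    v = [1.0 + 0.1 * i for i in range(n)]
    for _ in range(iters):
        w = [c * v[i] - sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * Av[i] for i in range(n))  # Rayleigh quotient


# Toy setup (illustrative assumptions): 3 frames, unit edge weights, mu = 1,
# and only the middle frame selected as a keyframe.
h = [0.0, 1.0, 0.0]
B = coefficient_matrix(h, path_laplacian([1.0, 1.0]), mu=1.0)

lam_min = smallest_eigenvalue(B)
bound_raw = gershgorin_lower_bound(B)

# S B S^{-1} has the same eigenvalues as B but different discs. This scale is
# hand-picked for the toy case so that all three disc left-ends coincide.
s = [1.0, (1.0 + math.sqrt(3.0)) / 2.0, 1.0]
bound_aligned = gershgorin_lower_bound(scale(B, s))

print(f"lambda_min(B)      = {lam_min:.4f}")
print(f"GCT bound, raw     = {bound_raw:.4f}")
print(f"GCT bound, aligned = {bound_aligned:.4f}")
```

In this toy case the unscaled bound is 0, because every unsampled node of a combinatorial Laplacian has its disc left-end at zero, while the aligned bound coincides with \(\lambda_{\min}(\mathbf{B}) = 2 - \sqrt{3} \approx 0.268\). This suggests why aligning disc left-ends can make the GCT bound a useful proxy for the smallest eigenvalue.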

Details

Subject

User generated content;
Eigenvalues;
Lower bounds;
Similarity;
Algorithms;
Video data;
Alignment;
Matrices (mathematics);
Frames (data processing);
Signal reconstruction;
Social networks;
Sampling

Business indexing term

Subject:

Social networks

Identifier / keyword

Computer Vision and Pattern Recognition; Image and Video Processing; Signal Processing

URL

http://arxiv.org/abs/2408.01859

Title

Graph Unfolding and Sampling for Transitory Video Summarization via Gershgorin Disc Alignment

Author

Sahami, Sadid; Cheung, Gene; Lin, Chia-Wen

Publication title

arXiv.org; Ithaca

Publication year

2024

Publication date

Aug 3, 2024

Section

Computer Science; Electrical Engineering and Systems Science

Publisher

Cornell University Library, arXiv.org

Source

arXiv.org

Place of publication

Ithaca

Country of publication

United States

University/institution

Cornell University Library arXiv.org

Publication subject

Biology, Statistics, Physics, Mathematics, Engineering--Electrical Engineering, Computers--Computer Engineering, Business And Economics--Banking And Finance

e-ISSN

2331-8422

Source type

Working Paper

Language of publication

English

Document type

Working Paper

Publication history

Online publication date

2024-08-06

Milestone dates

2024-08-03 (Submission v1)

First posting date

06 Aug 2024

ProQuest document ID

3089694188

Document URL

https://www.proquest.com/working-papers/graph-unfolding-sampling-transitory-video/docview/3089694188/se-2?accountid=208611

Full text outside of ProQuest

http://arxiv.org/abs/2408.01859

© 2024. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Last updated

2024-08-07

Database

ProQuest One Academic
