Abstract

The rapid evolution of generative models has revolutionized image and video processing, paving the way for innovative approaches in creation and optimization. This thesis explores the intersection of generative modeling and neural compression, emphasizing their roles in addressing the growing demands of high-quality media processing. Building upon three key works, we synthesize insights to present a unified perspective. 

First, the foundational study on neural video compression through generative modeling demonstrates the utility of variational autoencoders and autoregressive flows for efficient rate-distortion tradeoffs. By introducing structured priors and temporal dependencies, this work establishes the viability of generative models in surpassing traditional codecs, marking a significant leap in neural video coding. Second, leveraging the strengths of diffusion probabilistic models for video generation, we address the challenge of forecasting high-dimensional video dynamics. This approach integrates deterministic and stochastic components, refining frame predictions with diffusion-based residual modeling to enhance perceptual quality and probabilistic accuracy. Third, the innovative application of conditional diffusion models in lossy image compression underscores a paradigm shift from deterministic decoders to diffusion-based architectures. By encoding semantic content as latent variables and reconstructing texture via diffusion, this framework achieves remarkable perceptual gains while maintaining competitive distortion metrics.

Collectively, these contributions illustrate the convergence of generative techniques in optimizing rate-distortion performance, enabling high-fidelity reconstructions, and fostering perceptually appealing outputs. This work situates itself within the broader research landscape by addressing the dual objectives of efficiency and realism in media compression and generation. Applications span diverse domains, including video conferencing, autonomous systems, and content delivery networks, underscoring the transformative potential of generative modeling in a data-driven era. Through the synthesis of creation and optimization, this thesis lays the groundwork for novel multimedia processing systems that are both generative and compression-ready.

Details

1010268
Business indexing term
Title
Generative Compression: Bridging Creation and Representation Learning in Image and Video Processing
Number of pages
180
Publication year
2025
Degree date
2025
School code
0030
Source
DAI-B 86/11(E), Dissertation Abstracts International
ISBN
9798314858479
Committee member
Levorato, Marco; Fowlkes, Charless
University/institution
University of California, Irvine
Department
Computer Science
University location
United States -- California
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
31842465
ProQuest document ID
3201305888
Document URL
https://www.proquest.com/dissertations-theses/generative-compression-bridging-creation/docview/3201305888/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic