
Abstract

Guaranteeing the reliability and integrity of audio data has become an essential topic with the rapid development of generative AI. Significant progress in machine learning and speech synthesis has increased the potential for audio tampering. In this paper, we focus on digital watermarking as a promising way to safeguard the authenticity of audio evidence. Because the original data carry probative importance and their integrity must be preserved, the algorithm requires reversibility, imperceptibility, and reliability. To meet these requirements, we propose a reversible digital watermarking approach that transforms the data from the time domain into the frequency domain and embeds feature data concentrated in the high-frequency intDCT coefficients. We explore appropriate hiding locations against spectrum-based attacks using newly proposed spectral expansion methodologies for embedding. The drawback of fixed expansion, however, is that the stego signal is prone to detection by spectral analysis. This paper therefore proposes two further expansion methodologies that embed the data into variable locations: random expansion and adaptive expansion with distortion estimation. Both effectively conceal the watermark's presence while maintaining high perceptual quality, with an average segSNR better than 21.363 dB and an average MOS value better than 4.085. Our experimental results demonstrate the efficacy of the proposed method in terms of both sound quality preservation and the log-likelihood value. The log-likelihood value, which indicates the absolute discontinuity of the spectrogram after embedding, is proposed to evaluate the effectiveness of the reversible spectral expansion watermarking algorithm. The EER results indicate that adaptive hiding performs best against attacks based on spectral analysis.

Details

Title
Reversible Spectral Speech Watermarking with Variable Embedding Locations Against Spectrum-Based Attacks
Author
Huang, Xuping 1; Ito, Akinori 2

1 Department of Communications Engineering, Graduate School of Engineering, Tohoku University, Sendai 980-8577, Japan; [email protected]; Interdisciplinary Faculty of Science and Engineering, Shimane University, Matsue 690-8504, Japan
2 Department of Communications Engineering, Graduate School of Engineering, Tohoku University, Sendai 980-8577, Japan; [email protected]
Publication title
Volume
15
Issue
1
First page
381
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
20763417
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
 
 
Online publication date
2025-01-03
Milestone dates
2024-11-27 (Received); 2024-12-30 (Accepted)
   First posting date
03 Jan 2025
ProQuest document ID
3153579679
Document URL
https://www.proquest.com/scholarly-journals/reversible-spectral-speech-watermarking-with/docview/3153579679/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-07-23
Database
ProQuest One Academic