Abstract

Time-domain signal models have been widely applied to single-channel music source separation tasks due to their ability to overcome the limitations of fixed spectral representations and phase information loss. However, the high acoustic similarity and synchronous temporal evolution between vocals and accompaniment make accurate separation challenging for existing time-domain models. These challenges are mainly reflected in two aspects: (1) the lack of a dynamic mechanism to evaluate the contribution of each source during feature fusion, and (2) difficulty in capturing fine-grained temporal details, often resulting in local artifacts in the output. To address these issues, we propose an attention-driven time-domain convolutional network for vocal and accompaniment source separation. Specifically, we design an embedding attention module to perform adaptive source weighting, enabling the network to emphasize components more relevant to the target mask during training. In addition, an efficient convolutional block attention module is developed to enhance local feature extraction. This module integrates an efficient channel attention mechanism based on one-dimensional convolution while preserving spatial attention, thereby improving the ability to learn discriminative features from the target audio. Comprehensive evaluations on public music datasets demonstrate the effectiveness of the proposed model and its significant improvements over existing approaches.
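The channel attention idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption: it follows the general efficient-channel-attention pattern (average each channel over time, run a small one-dimensional convolution across the channel axis, apply a sigmoid, and rescale the channels). The function name `channel_attention_1d`, the kernel values, and the zero padding are hypothetical choices for illustration and are not taken from the paper. In a trained network the kernel would be learned.

```python
import math


def channel_attention_1d(features, kernel):
    """Rescale channels with weights from a 1-D convolution over channel means.

    features: list of C channels, each a list of T samples (hypothetical layout).
    kernel:   odd-length list of convolution weights (fixed here, learned in practice).
    Returns (scaled_features, channel_weights).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = len(kernel) // 2

    # Squeeze: global average pooling over the time axis, one value per channel.
    descriptor = [sum(channel) / len(channel) for channel in features]

    # Zero-pad so the convolution returns one weight per channel.
    padded = [0.0] * pad + descriptor + [0.0] * pad

    weights = []
    for c in range(len(features)):
        # 1-D convolution across neighbouring channels (local cross-channel interaction).
        z = sum(k * padded[c + j] for j, k in enumerate(kernel))
        # Sigmoid maps the response to a weight in (0, 1).
        weights.append(1.0 / (1.0 + math.exp(-z)))

    # Excite: multiply every sample of a channel by that channel's weight.
    scaled = [[w * x for x in channel] for w, channel in zip(weights, features)]
    return scaled, weights


# Toy input: three channels, four time steps each.
toy = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], [-1.0, -1.0, -1.0, -1.0]]
scaled, weights = channel_attention_1d(toy, kernel=[0.0, 1.0, 0.0])
print(weights)
```

With the identity-like kernel used in the toy call, each weight depends only on its own channel mean, so channels with larger average activation receive larger weights. A kernel with non-zero neighbours would let adjacent channels influence each other, which is the point of using a convolution instead of a fully connected layer.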

Details

Title
Attention-Driven Time-Domain Convolutional Network for Source Separation of Vocal and Accompaniment
Author
Zhao, Zhili 1; Luo, Min 2; Qiao, Xiaoman 3; Shao, Changheng 1; Sun, Rencheng 1

1 College of Computer Science and Technology, Qingdao University, Qingdao 266071, China; [email protected] (Z.Z.); [email protected] (C.S.); [email protected] (R.S.)
2 Arts College, Qingdao University, Qingdao 266071, China
3 Information Technology Department, Qingdao Vocational and Technical College of Hotel Management, Qingdao 266100, China; [email protected]
Publication title
Volume
14
Issue
20
First page
3982
Number of pages
28
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
20799292
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-10-11
Milestone dates
2025-08-18 (Received); 2025-10-09 (Accepted)
First posting date
2025-10-11
ProQuest document ID
3265895189
Document URL
https://www.proquest.com/scholarly-journals/attention-driven-time-domain-convolutional/docview/3265895189/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-10-28
Database
ProQuest One Academic