
Abstract

Deep-learning-based models have recently made remarkable progress in symbolic music generation. However, existing methods often violate musical rules, largely because their control of harmonic structure is weak. To address these limitations, this paper proposes a novel framework, Entropy-Regularized Latent Diffusion for Harmony-Constrained Symbolic Music Generation (ERLD-HC), which combines a variational autoencoder (VAE) and a latent diffusion model with an entropy-regularized conditional random field (CRF). The model first encodes symbolic music into latent representations with the VAE and then, during the diffusion process, introduces the entropy-based CRF module into the cross-attention layer of the UNet to achieve harmonic conditioning. The proposed model mitigates two key limitations in symbolic music generation: the lack of theoretical correctness of purely algorithm-driven methods and the lack of flexibility of rule-based methods. In particular, the CRF module learns classical harmony rules through learnable feature functions, significantly improving the harmonic quality of the generated Musical Instrument Digital Interface (MIDI) output. Experiments on the Lakh MIDI dataset show that, compared with the VAE+Diffusion baseline, ERLD-HC reduces the harmony-rule violation rate by 2.35% under self-generated inputs and by 1.4% under controlled inputs. Meanwhile, the MIDI generated by the model maintains a high degree of melodic naturalness. Importantly, the harmonic guidance in ERLD-HC is derived from an internal CRF inference module, which enforces consistency with music-theoretic priors. Although this does not yet provide direct external chord conditioning, it introduces a form of learned harmonic controllability that balances flexibility and theoretical rigor.
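
The abstract does not spell out how the entropy term enters the CRF objective, so the following is only a minimal sketch of one common reading: a linear-chain CRF over chord labels whose negative log-likelihood is combined with the Shannon entropy of the CRF's sequence distribution. The function names, the `weight` parameter, the sign of the entropy term, and the random toy scores are all assumptions made for illustration; none of them is taken from the paper. The entropy is computed exactly with the forward-backward algorithm, using the identity H = log Z - E[score], and a brute-force enumeration is included as a check.

```python
import numpy as np
from itertools import product


def logsumexp(a, axis=None):
    """Numerically stable log(sum(exp(a))) along `axis`."""
    m = np.max(a, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)


def sequence_score(emit, trans, labels):
    """Unnormalized log-score of one label sequence (hypothetical chord labels)."""
    score = emit[0, labels[0]]
    for t in range(1, len(labels)):
        score += trans[labels[t - 1], labels[t]] + emit[t, labels[t]]
    return score


def crf_log_partition_and_entropy(emit, trans):
    """Return (log Z, entropy) of a linear-chain CRF.

    emit:  (T, K) per-step label scores
    trans: (K, K) scores for moving from label i to label j
    """
    T, K = emit.shape
    alpha = np.zeros((T, K))  # forward log-messages
    beta = np.zeros((T, K))   # backward log-messages (last row stays 0)
    alpha[0] = emit[0]
    for t in range(1, T):
        alpha[t] = emit[t] + logsumexp(alpha[t - 1][:, None] + trans, axis=0)
    for t in range(T - 2, -1, -1):
        beta[t] = logsumexp(trans + (emit[t + 1] + beta[t + 1])[None, :], axis=1)
    log_z = logsumexp(alpha[-1])

    # Expected score under the model, from unary and pairwise marginals.
    unary = np.exp(alpha + beta - log_z)
    expected = np.sum(unary * emit)
    for t in range(1, T):
        pair = np.exp(
            alpha[t - 1][:, None] + trans + (emit[t] + beta[t])[None, :] - log_z
        )
        expected += np.sum(pair * trans)
    return log_z, log_z - expected  # H = log Z - E[score]


def entropy_regularized_nll(emit, trans, labels, weight=0.1):
    """NLL of `labels` minus weight * entropy (sign and weight are assumptions)."""
    log_z, entropy = crf_log_partition_and_entropy(emit, trans)
    nll = log_z - sequence_score(emit, trans, labels)
    return nll - weight * entropy


def brute_force_entropy(emit, trans):
    """Entropy by enumerating all K**T sequences; only feasible for toy sizes."""
    T, K = emit.shape
    scores = np.array(
        [sequence_score(emit, trans, y) for y in product(range(K), repeat=T)]
    )
    log_p = scores - logsumexp(scores)
    return -np.sum(np.exp(log_p) * log_p)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emit = rng.normal(size=(4, 3))   # 4 time steps, 3 toy chord labels
    trans = rng.normal(size=(3, 3))
    _, h = crf_log_partition_and_entropy(emit, trans)
    print("entropy (forward-backward):", h)
    print("entropy (brute force):     ", brute_force_entropy(emit, trans))
    print("regularized loss:", entropy_regularized_nll(emit, trans, [0, 1, 2, 0]))
```

In this sketch a positive `weight` rewards higher-entropy (less overconfident) label distributions; a framework that instead wants sharper harmonic decisions would flip the sign. Which choice ERLD-HC makes cannot be determined from the abstract alone.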

Details

Title
ERLD-HC: Entropy-Regularized Latent Diffusion for Harmony-Constrained Symbolic Music Generation
Author
Yang, Li
Publication title
Entropy; Basel
Volume
27
Issue
9
First page
901
Number of pages
22
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
10994300
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-08-25
Milestone dates
2025-07-02 (Received); 2025-08-22 (Accepted)
First posting date
2025-08-25
ProQuest document ID
3254508763
Document URL
https://www.proquest.com/scholarly-journals/erld-hc-entropy-regularized-latent-diffusion/docview/3254508763/se-2?accountid=208611
Copyright
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-09-26
Database
ProQuest One Academic