Content area

Abstract

We compare phone labels and articulatory features as input for cross-lingual transfer learning in text-to-speech (TTS) for low-resource languages (LRLs). Experiments with FastSpeech 2 and the LRL West Frisian show that using articulatory features outperformed using phone labels in both intelligibility and naturalness. For LRLs without pronunciation dictionaries, we propose two novel approaches: a) using a massively multilingual model to convert grapheme-to-phone (G2P) in both training and synthesizing, and b) using a universal phone recognizer to create a makeshift dictionary. Results show that the G2P approach performs largely on par with using a ground-truth dictionary and the phone recognition approach, while performing generally worse, remains a viable option for LRLs less suitable for the G2P approach. Within each approach, using articulatory features as input outperforms using phone labels.

Details

1009240
Title
The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech
Publication title
arXiv.org; Ithaca
Publication year
2023
Publication date
Jun 1, 2023
Section
Computer Science; Electrical Engineering and Systems Science
Publisher
Cornell University Library, arXiv.org
Source
arXiv.org
Place of publication
Ithaca
Country of publication
United States
University/institution
Cornell University Library arXiv.org
e-ISSN
2331-8422
Source type
Working Paper
Language of publication
English
Document type
Working Paper
Publication history
 
 
Online publication date
2023-06-02
Milestone dates
2023-06-01 (Submission v1)
Publication history
 
 
   First posting date
02 Jun 2023
ProQuest document ID
2821741190
Document URL
https://www.proquest.com/working-papers/effects-input-type-pronunciation-dictionary-usage/docview/2821741190/se-2?accountid=208611
Full text outside of ProQuest
Copyright
© 2023. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2023-06-03
Database
ProQuest One Academic