© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Evaluation campaigns provide a common framework with which the progress of speech technologies can be effectively measured. The aim of this paper is to present a detailed overview of the IberSpeech-RTVE 2022 Challenges, which were organized as part of the IberSpeech 2022 conference under the ongoing series of Albayzin evaluation campaigns. In the 2022 edition, four challenges were launched: (1) speech-to-text transcription; (2) speaker diarization and identity assignment; (3) text and speech alignment; and (4) search on speech. Databases covering different domains (e.g., broadcast news, conference talks, parliament sessions) were released for those challenges. The submitted systems also cover a wide range of speech processing methods, including hidden Markov model-based approaches, end-to-end neural network-based methods, and hybrid approaches. This paper describes the databases, the tasks and the performance metrics used in the four challenges. It also provides the most relevant features of the submitted systems and briefly presents and discusses the obtained results. Despite employing state-of-the-art technology, the relatively poor performance attained in some of the challenges reveals that there is still room for improvement. This encourages us to carry on with the Albayzin evaluation campaigns in the coming years.

Details

Title
An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies
Author
Lleida, Eduardo 1; Rodriguez-Fuentes, Luis Javier 2; Tejedor, Javier 3; Ortega, Alfonso 1; Miguel, Antonio 1; Bazán, Virginia 4; Pérez, Carmen 4; de Prada, Alberto 4; Penagarikano, Mikel 2; Varona, Amparo 2; Bordel, Germán 2; Torre-Toledano, Doroteo 5; Álvarez, Aitor 6; Arzelus, Haritz 6

1 Vivolab, Aragon Institute for Engineering Research (I3A), University of Zaragoza, 50018 Zaragoza, Spain; [email protected] (A.O.); [email protected] (A.M.)
2 Department of Electricity and Electronics, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), Barrio Sarriena, 48940 Leioa, Spain; [email protected] (L.J.R.-F.); [email protected] (M.P.); [email protected] (A.V.); [email protected] (G.B.)
3 Institute of Technology, Universidad San Pablo-CEU, CEU Universities, Urbanización Montepríncipe, 28668 Boadilla del Monte, Spain; [email protected]
4 Corporación Radiotelevisión Española, 28223 Madrid, Spain; [email protected] (V.B.); [email protected] (C.P.); [email protected] (A.d.P.)
5 AUDIAS, Electronic and Communication Technology Department, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Av. Francisco Tomás y Valiente, 11, 28049 Madrid, Spain; [email protected]
6 Fundación Vicomtech, Basque Research and Technology Alliance (BRTA), Mikeletegi 57, 20009 Donostia-San Sebastián, Spain; [email protected] (A.Á.); [email protected] (H.A.)
First page
8577
Publication year
2023
Publication date
2023
Publisher
MDPI AG
e-ISSN
2076-3417
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2848989990