© 2022. This work is published under https://creativecommons.org/licenses/by-sa/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

This paper describes the latest iteration of our Estonian speech recognition system and the publicly available transcription editing service. The system is now based on an end-to-end wav2vec 2.0 model and achieves a word error rate of 6.9% on a test set of broadcast conversations. Besides recognition, it performs speaker diarization, speaker identification, Estonian language detection, and punctuation restoration. The service consists of a speech processing pipeline, a web server, and a web-based user interface that offers end users transcript editing and speaker annotation functionality. The core components of the service have been made open-source and have been deployed internally by multiple public and private institutions.

Details

Title
Estonian Speech Recognition and Transcription Editing Service
Author
Olev, Aivo; Alumäe, Tanel
Pages
409-421
Publication year
2022
Publication date
2022
Publisher
University of Latvia
ISSN
22558942
e-ISSN
22558950
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2721358803