Abstract

SyBig-r-Morph is a versatile pseudoword generator designed for Greek that can be readily adapted to other languages. It lets researchers produce phonotactically and morphologically well-formed pseudowords tailored to particular morphosyntactic categories, such as nouns or verbs, and thereby overcomes the shortcomings of current multilingual generators. The tool is especially useful for designing controlled linguistic experiments, including studies on stress assignment, lexical access, and morphophonological and lexical processing. By linking orthographic representation to phonological realization, a key step in the text-to-speech pipeline, SyBig-r-Morph supports psycholinguistic research, computational phonology, and speech synthesis applications that require linguistically authentic pseudoword stimuli.

Pseudowords are essential in (psycho)linguistic research, offering a way to study language without meaning interference. Various methods for creating pseudowords exist, but each has its limitations. Traditional approaches modify existing words, risking unintended recognition. Modern algorithmic methods use high-frequency n-grams or syllable deconstruction but often require specialized expertise. Currently, no automatic process for pseudoword generation is designed explicitly for Greek, which is our primary focus. Therefore, we developed SyBig-r-Morph, a novel application that constructs pseudowords using syllables as the main building block, replicating Greek phonotactic patterns. SyBig-r-Morph draws input from word lists and databases that include syllabification, word length, part of speech, and frequency information. It categorizes syllables by position to ensure phonotactic consistency with user-selected morphosyntactic categories and can optionally assign stress to generated words. Additionally, the tool uses multiple lexicons to eliminate phonologically invalid combinations. Its modular architecture allows easy adaptation to other languages. To further evaluate its output, we conducted a manual assessment using a tool that verifies phonotactic well-formedness based on phonological parameters derived from a corpus. Most SyBig-r-Morph words passed the stricter phonotactic criteria, confirming the tool’s sound design and linguistic adequacy.
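
The position-based syllable approach described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy lexicon, the romanized syllables, the function names, and the real-word filter are all assumptions made for illustration, and the sketch omits stress assignment, frequency information, and the lexicon-based phonotactic checks mentioned in the abstract.

```python
import random
from collections import defaultdict

# Hypothetical toy lexicon (romanized, hand-syllabified); not the tool's data.
LEXICON = [
    ("ka-la-mi", "noun"),
    ("po-ta-mi", "noun"),
    ("ka-ra-vi", "noun"),
    ("pe-ri-me-no", "verb"),
    ("ka-ta-la-ve-no", "verb"),
]


def build_position_pools(lexicon, pos):
    """Group the syllables of words with the given part of speech by position."""
    pools = defaultdict(set)
    for word, tag in lexicon:
        if tag != pos:
            continue
        syllables = word.split("-")
        for i, syllable in enumerate(syllables):
            if i == 0:
                slot = "initial"
            elif i == len(syllables) - 1:
                slot = "final"
            else:
                slot = "medial"
            pools[slot].add(syllable)
    # Sort for reproducible sampling with a seeded random generator.
    return {slot: sorted(items) for slot, items in pools.items()}


def generate_pseudoword(lexicon, pos, n_syllables, rng, max_tries=1000):
    """Return a hyphen-syllabified pseudoword, or None if none is found."""
    if n_syllables < 2:
        raise ValueError("this sketch only handles words of two or more syllables")
    pools = build_position_pools(lexicon, pos)
    initial = pools.get("initial", [])
    medial = pools.get("medial", [])
    final = pools.get("final", [])
    if not initial or not final or (n_syllables > 2 and not medial):
        return None
    # Assumption: candidates identical to a lexicon word are rejected.
    real_words = {word.replace("-", "") for word, _ in lexicon}
    for _ in range(max_tries):
        syllables = [rng.choice(initial)]
        syllables += [rng.choice(medial) for _ in range(n_syllables - 2)]
        syllables.append(rng.choice(final))
        if "".join(syllables) not in real_words:
            return "-".join(syllables)
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    print(generate_pseudoword(LEXICON, "noun", 3, rng))
```

Drawing each slot only from syllables attested in that position, within the chosen part of speech, is what keeps a generated item looking like a plausible noun or verb rather than an arbitrary syllable string.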

Details

1009240
Title
Syllable-, Bigram-, and Morphology-Driven Pseudoword Generation in Greek
Author
Kosmidis Kosmas 1; Apostolouda Vassiliki 2; Revithiadou Anthi 2

1 Department of Physics, Aristotle University of Thessaloniki, University Campus, 54124 Thessaloniki, Greece
2 School of Philology, Department of Linguistics, Aristotle University of Thessaloniki, University Campus, 54124 Thessaloniki, Greece; [email protected] (V.A.); [email protected] (A.R.)
Publication title
Volume
15
Issue
12
First page
6582
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
20763417
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-06-11
Milestone dates
2025-04-29 (Received); 2025-06-06 (Accepted)
First posting date
11 Jun 2025
ProQuest document ID
3223874103
Document URL
https://www.proquest.com/scholarly-journals/syllable-bigram-morphology-driven-pseudoword/docview/3223874103/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-11-07
Database
ProQuest One Academic