1 Introduction
Natural language is inherently ambiguous. Every language contains many words with two or more meanings. For example, palm is a noun in English, but depending on its context it can mean either the inner surface of the hand or a type of tree. Resolving this ambiguity by selecting the proper meaning based on the context is called Word Sense Disambiguation (WSD). Hence, there is wide interest in automating WSD within the machine translation process. Beyond machine translation, fields such as information retrieval, search engine technology, speech processing, and speech-to-text conversion may also benefit from this technology. For instance, Hayat in Persian means either life or yard; the two senses share the same pronunciation but have different spellings.
Word Sense Disambiguation systems based on supervised learning methods produce the best results for distinguishing word senses in public evaluations (Palmer et al. 2001; Snyder and Palmer 2004; Pradhan et al. 2007; Zhong and Ng 2010). Preprocessing and constructing effective, efficient feature sets increase their prediction rate and accuracy. Some previous works, discussed in the related work section, take into account the ordered sequences of words in the context. However, to the best of our knowledge, this is the first work that shows the effectiveness of iterative patterns as a means for WSD algorithms. Iterative patterns are ordered sequences of words in the context that are not necessarily consecutive. The grammatical structure of most languages is based on rules that are iterative; therefore, considering iterative patterns is essential for correctly identifying word senses.
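To make the notion concrete, the sketch below enumerates ordered but not necessarily consecutive word subsequences from a context window. This is a minimal illustrative reading of "iterative patterns"; the function name iterative_patterns and the brute-force enumeration are ours, and the paper's actual mining procedure may differ.

```python
from itertools import combinations

def iterative_patterns(tokens, length):
    """Yield every ordered, not-necessarily-consecutive
    subsequence of `length` words from a token list.
    Illustrative sketch only; a real miner would prune by
    support rather than enumerate exhaustively."""
    for idx in combinations(range(len(tokens)), length):
        # combinations() keeps index order, so word order is
        # preserved even when the chosen words are not adjacent.
        yield tuple(tokens[i] for i in idx)

sentence = "he rested in the shade of a palm tree".split()
# ('shade', 'palm', 'tree') is a valid pattern here even though
# the three words are not consecutive in the sentence.
print(('shade', 'palm', 'tree') in set(iterative_patterns(sentence, 3)))
```

Unlike a plain n-gram, such a pattern can link sense-bearing cue words (shade, tree) to the ambiguous word palm across intervening function words.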
According to the Markov assumption (Jurafsky and Martin 2008), the current word does not depend on the entire history of the words in the context but at most on the last few words; accordingly, sequences of n words (n-grams), which are subsets of the sentence, are investigated to help WSD. The size of 'n' is very important for generating effective models. Unfortunately, no consensus exists on the value of 'n' for all WSD problems. Choosing a large 'n' for n-grams increases the probability of capturing the correct sense in WSD problems, but large n-grams may not occur in the training data. In contrast, as the size of 'n'...
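The trade-off can be seen directly by extracting consecutive n-grams from a context; the helper name ngrams below is ours, a minimal sketch rather than the paper's feature extractor.

```python
def ngrams(tokens, n):
    """Return all consecutive n-word sequences in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "he rested in the shade of a palm tree".split()
print(ngrams(tokens, 2))  # 8 bigrams: frequent in training data, little context
print(ngrams(tokens, 5))  # 5 five-grams: rich context, but rarely seen verbatim
```

As n grows, each n-gram carries more disambiguating context but becomes exponentially less likely to reappear in the training corpus, which is the sparsity problem the paragraph above describes.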