Content area
Full text
(ProQuest: ... denotes non-US-ASCII text omitted.)
1
Introduction
Capturing word meaning is one of the challenges of natural language processing (NLP). Taxonomies and, in general, semantic networks of words (Miller 1995) are often used as formal models of meaning in intermediate NLP tasks, such as word sense disambiguation (Agirre and Rigau 1996), selectional preference induction (Resnik 1993), and textual entailment recognition (corley and Mihalcea 2005; Zanzotto et al. 2009), as well as in final applications, such as question-answering (Clark, Fellbaum and Hobbs 2008). In these networks, words are connected with other words by means of taxonomic and, in general, semantic relations. This is a way to capture part of the knowledge described in traditional dictionaries. For example, this informal definition of 'wheel':
a wheel is a circular frame turning about an axis . . . used for supporting vehicles. . .
contains a taxonomic relation, i.e. the wheel is a circular frame, and a sort of part-of relation, i.e. the wheel is used for supporting vehicles.
Yet, to be effectively used in applications, semantic networks have to be large or, at least, adapted to specific domains. Even large lexical knowledge repositories (e.g. WordNet, Miller 1995) are extremely poor when used in specific domains, such as medicine (Toumouth et al. 2006). Automatically creating, adapting, or extending existing knowledge repositories using domain texts is, then, a very important and active area. Building on the distributional hypothesis (Harris 1964) or on the notion of the lexico-syntactic patterns (originally used in Robison 1970), a large variety of methods have been proposed: ontology learning methods (Medche 2002; Navigli and Velardi 2004; Cimiano, Hotho and Staab 2005) in knowledge representation as well as knowledge harvesting methods in NLP (Hearst 1992; Pantel and Pennacchiotti 2006). This learning task is generally seen as a classification (Pekar and Staab 2002; Snow, Jurafsky and Ng 2006) or a clustering (Cimiano et al. 2005) problem.
Many models for learning generic semantic relations between words are binary classifiers (Pantel and Pennacchiotti 2006; Snow et al. 2006). In this case the task is deciding whether the two words are in a specific semantic relationship. Lexico-syntactic patterns are used as features to build vector spaces for word pairs where...





