Copyright International Journal of Advanced Computer Research Jul 2016

Abstract

Keyword extraction is an important task in text mining. In this paper, a novel, unsupervised, domain-independent and language-independent approach for automatic keyword extraction from single documents is proposed. We use the word intermediate distance vector and its mean value to extract keywords. We compare our approach against the standard deviation of intermediate distances approach, taken as the baseline, and find heavy overlap between the results of the two approaches. Our approach has the advantage of being faster, especially for long documents, because it removes the need to compute the standard deviation of the word intermediate distance vector. Two famous works, "Origin of Species" and "A Brief History of Time", are used to demonstrate the experimental results. Experiments show that the proposed approach works almost as well as the standard deviation approach, and the percentage overlap between the top 30 extracted keywords is more than 50%.
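
The quantities named in the abstract can be illustrated with a minimal Python sketch. It assumes that a word's intermediate distance vector is the list of gaps, counted in tokens, between its successive occurrences. The tokenization rule and the function names (intermediate_distances, distance_stats, overlap_percent) are hypothetical and chosen only for illustration. The abstract does not say how the mean is turned into a keyword ranking, so the sketch stops at computing the statistics and the overlap between two keyword lists.

```python
import re
from collections import defaultdict
from statistics import mean, pstdev


def intermediate_distances(text):
    """Map each repeated word to the gaps between its successive occurrences.

    Assumption: tokens are runs of ASCII letters, lowercased.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)
    # A word needs at least two occurrences to have any gap.
    return {
        word: [later - earlier for earlier, later in zip(pos, pos[1:])]
        for word, pos in positions.items()
        if len(pos) >= 2
    }


def distance_stats(text):
    """Return (mean gap, population standard deviation of gaps) per word."""
    return {
        word: (mean(gaps), pstdev(gaps))
        for word, gaps in intermediate_distances(text).items()
    }


def overlap_percent(first, second):
    """Percentage of keywords in `first` that also appear in `second`."""
    if not first:
        return 0.0
    return 100.0 * len(set(first) & set(second)) / len(first)


if __name__ == "__main__":
    sample = "a b a c c a"
    print(intermediate_distances(sample))
    print(distance_stats(sample))
```

Computing only the mean of each vector avoids the second pass over the gaps that the standard deviation requires, which is the cost saving the abstract attributes to the proposed approach.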

Details

Title: Keyword extraction from single documents using mean word intermediate distance
Author: Siddiqi, Sifatullah; Sharan, Aditi
Pages: 138-145
Section: Research Article
Publication year: 2016
Publication date: Jul 2016
Publisher: Accent Social and Welfare Society
ISSN: 22497277
e-ISSN: 22777970
Source type: Scholarly Journal
Language of publication: English
ProQuest document ID: 1811730732