Copyright © 2015 Xin Bi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

With the volume of XML data growing exponentially, centralized learning solutions cannot meet the requirements of mining applications that involve massive training samples. This paper proposes a solution for distributed learning over massive XML documents. It consists of a MapReduce-based component that converts XML documents into a representation model in parallel and a distributed learning component based on Extreme Learning Machine (ELM) for classification or clustering tasks. Within this framework, training samples are converted from raw XML datasets with better efficiency and information representation ability and are then passed to distributed learning algorithms in ELM feature space. Extensive experiments on massive XML document datasets verify the effectiveness and efficiency of the solution for both classification and clustering applications.
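
As a rough illustration of the conversion stage described in the abstract, the sketch below turns XML documents into feature counts in a map step and aggregates document frequencies in a reduce step. The abstract does not specify the representation model, so the feature definition used here (element path paired with a lowercase term) and the names `map_document` and `reduce_document_frequency` are assumptions made for illustration only. The steps run in a single process, whereas a real MapReduce job would distribute them across nodes.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from functools import reduce


def map_document(xml_text):
    """Map step (hypothetical): one XML document -> counts of (element path, term)."""
    root = ET.fromstring(xml_text)
    features = Counter()

    def walk(node, path):
        current = f"{path}/{node.tag}"
        # Count each whitespace-separated term found directly in this element.
        for term in (node.text or "").lower().split():
            features[(current, term)] += 1
        for child in node:
            walk(child, current)

    walk(root, "")
    return features


def reduce_document_frequency(acc, features):
    """Reduce step (hypothetical): count how many documents contain each feature."""
    acc.update(features.keys())
    return acc


# Two tiny documents standing in for a massive XML collection.
docs = [
    "<article><title>extreme learning</title><body>learning machine</body></article>",
    "<article><title>xml mining</title><body>distributed learning</body></article>",
]

mapped = [map_document(d) for d in docs]
doc_freq = reduce(reduce_document_frequency, mapped, Counter())
```

The per-document counts in `mapped` together with `doc_freq` are the kind of statistics a weighting scheme could combine into training vectors before they are handed to a learner such as ELM.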

Details

Title
Distributed Learning over Massive XML Documents in ELM Feature Space
Author
Bi, Xin; Zhao, Xiangguo; Wang, Guoren; Zhang, Zhen; Chen, Shuang
Publication year
2015
Publication date
2015
Publisher
John Wiley & Sons, Inc.
ISSN
1024-123X
e-ISSN
1563-5147
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
1686348520