© 2021. This work is published under https://creativecommons.org/licenses/by-nc/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Web page classification is critical for information retrieval and management tasks and plays an important role in natural language processing (NLP) problems in web engineering. Traditional machine learning algorithms extract selected features from web pages, whereas deep learning algorithms learn features as the network goes deeper. Pre-trained models such as BERT achieve remarkable performance for text classification and continue to show state-of-the-art results. Knowledge Graphs can provide rich, structured factual information for better language modelling and representation. In this study, we propose an ensemble Knowledge Based Deep Inception (KBDI) approach for web page classification that learns bidirectional contextual representations using pre-trained BERT incorporating Knowledge Graph embeddings, and fine-tunes on the target task by applying a Deep Inception network utilizing parallel multi-scale semantics. The proposed ensemble evaluates the efficacy of fusing domain-specific knowledge embeddings with the pre-trained BERT model. Experimental results show that the proposed BERT-fused KBDI model outperforms benchmark baselines and achieves better performance than other conventional approaches evaluated on web page classification datasets.
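The fusion described in the abstract can be illustrated with a minimal sketch: stand-in contextual embeddings (in place of BERT output) are concatenated with aligned knowledge-graph entity embeddings, then passed through parallel convolution branches with different kernel sizes (an inception-style block) whose pooled outputs feed a classifier. All dimensions, weights, and the plain-numpy convolution are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w):
    """Valid 1-D convolution over the sequence axis with ReLU.
    x: (seq_len, d_in), w: (k, d_in, d_out) -> (seq_len - k + 1, d_out)."""
    k = w.shape[0]
    out = np.stack([
        np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1]))
        for t in range(x.shape[0] - k + 1)
    ])
    return np.maximum(out, 0.0)

def inception_block(x, branch_weights):
    """Parallel multi-scale branches (e.g. kernel sizes 1, 3, 5),
    each max-pooled over time, then concatenated into one vector."""
    pooled = [conv1d(x, w).max(axis=0) for w in branch_weights]
    return np.concatenate(pooled)

# Stand-ins for pre-trained representations (hypothetical sizes).
seq_len, d_bert, d_kg, d_branch, n_classes = 16, 32, 8, 12, 4
bert_out = rng.normal(size=(seq_len, d_bert))  # contextual token embeddings
kg_out = rng.normal(size=(seq_len, d_kg))      # aligned KG entity embeddings

# Knowledge fusion: concatenate the two representations per token.
fused = np.concatenate([bert_out, kg_out], axis=-1)

# Parallel multi-scale branches over the fused sequence.
branch_weights = [rng.normal(size=(k, d_bert + d_kg, d_branch)) * 0.1
                  for k in (1, 3, 5)]
features = inception_block(fused, branch_weights)  # (3 * d_branch,)

# Linear classifier head with softmax over page categories.
W_cls = rng.normal(size=(features.shape[0], n_classes)) * 0.1
logits = features @ W_cls
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(fused.shape, features.shape, probs.argmax())
```

In a real pipeline the random stand-ins would be replaced by actual BERT outputs and learned KG embeddings, and the branch weights would be trained end-to-end during fine-tuning.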

Details

Title
Knowledge Based Deep Inception Model for Web Page Classification
Author
Gupta, Amit; Bhatia, Rajesh
Pages
2131-2168
Section
SPECIAL ISSUE: ADVANCED PRACTICES IN WEB ENGINEERING 2021
Publication year
2021
Publication date
2021
Publisher
River Publishers
ISSN
1540-9589
e-ISSN
1544-5976
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3055533182