Full Text

Turn on search term navigation

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

As an important infrastructure in the era of big data, the knowledge graph can integrate and manage data resources. Therefore, the construction of tourism knowledge graphs with wide coverage and of high quality in terms of information from the perspective of tourists’ needs is an effective solution to the problem of information clutter in the tourism field. This paper first analyzes the current state of domestic and international research on constructing tourism knowledge graphs and highlights the problems associated with constructing knowledge graphs, which are that they are time-consuming, laborious and have a single function. In order to make up for these shortcomings, this paper proposes a set of systematic methods to build a tourism knowledge graph. This method integrates the BiLSTM and BERT models and combines these with the attention mechanism. The steps of this methods are as follows: First, data preprocessing is carried out by word segmentation and removing stop words; second, after extracting the features and vectorization of the words, the cosine similarity method is used to classify the tourism text, with the text classification based on naive Bayes being compared through experiments; third, the popular tourism words are obtained through the popularity analysis model. This paper proposes two models to obtain popular words: One is a multi-dimensional tourism product popularity analysis model based on principal component analysis; the other is a popularity analysis model based on emotion analysis; fourth, this paper uses the BiLSTM-CRF model to identify entities and the cosine similarity method to predict the relationship between entities so as to extract high-quality tourism knowledge triplets. In order to improve the effect of entity recognition, this paper proposes entity recognition based on the BiLSTM-LPT and BiLSTM-Hanlp models. The experimental results show that the model can effectively improve the efficiency of entity recognition; finally, a high-quality tourism knowledge was imported into the Neo4j graphic database to build a tourism knowledge graph.

Details

Title
Exploring the Potential of BERT-BiLSTM-CRF and the Attention Mechanism in Building a Tourism Knowledge Graph
Author
Xu, Hongsheng 1   VIAFID ORCID Logo  ; Fan, Ganglong 1 ; Kuang, Guofang 1 ; Wang, Chuqiao 1 

 College of Electronic Commerce, Luoyang Normal University, Luoyang 471934, China; Henan Key Laboratory for Big Data Processing & Analytics of Electronic Commerce, Luoyang Normal University, Luoyang 471934, China 
First page
1010
Publication year
2023
Publication date
2023
Publisher
MDPI AG
e-ISSN
20799292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2779543902
Copyright
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.