Abstract

For Internet forum Points of Interest (PoI), existing analysis methods usually lack usability analysis under different conditions and ignore long-term variation, which leads to blind method selection. To address this problem, this paper proposes a PoI variation prediction framework based on similarity analysis between long and short windows. Based on the framework, the paper presents five PoI analysis algorithms in two categories: traditional sequence analysis methods, namely the autoregressive integrated moving average model (ARIMA) and the support vector regressor (SVR), and deep learning methods, namely the convolutional neural network (CNN), the long short-term memory network (LSTM), and the Transformer (TRM). Specifically, the observed data are first divided into long and short windows, and keywords are extracted as the PoI of each window. Then, the PoI similarities between long and short windows are calculated for training and prediction. Finally, a series of experiments is conducted on real Internet forum datasets. The results show that all five algorithms predict PoI variations well, which indicates the effectiveness of the proposed framework. When the long window is short, the traditional methods perform better, with SVR performing best. Conversely, when the long window is longer, the deep learning methods are superior, with LSTM performing best. The results provide useful references for selecting PoI variation analysis and prediction algorithms under different parameter configurations.
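
The windowing and similarity steps described in the abstract can be illustrated with a minimal sketch. The abstract does not state how keywords are extracted, which similarity measure is used, or how the short window is positioned relative to the long one, so everything below is an assumption made for illustration: keywords are the most frequent whitespace-separated tokens, similarity is the Jaccard index, the short window immediately follows the long window, and the function names (`top_keywords`, `jaccard`, `window_similarities`) are hypothetical. The resulting similarity series is the kind of sequence that a predictor such as ARIMA, SVR, or LSTM would then be trained on; that step is not shown.

```python
from collections import Counter


def top_keywords(posts, k=3):
    """Return the k most frequent tokens in a window as a set.

    Assumption: plain token frequency stands in for the paper's
    (unspecified) keyword extraction.
    """
    counts = Counter(tok for post in posts for tok in post.lower().split())
    return {tok for tok, _ in counts.most_common(k)}


def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def window_similarities(posts, long_len, short_len, k=3):
    """Slide over the posts and compare each long window's keywords
    with those of the short window that immediately follows it.

    Assumption: this window layout is illustrative, not taken from the paper.
    """
    sims = []
    for start in range(len(posts) - long_len - short_len + 1):
        long_win = posts[start:start + long_len]
        short_win = posts[start + long_len:start + long_len + short_len]
        sims.append(jaccard(top_keywords(long_win, k), top_keywords(short_win, k)))
    return sims
```

A high value in the returned series means the short window's topics match the longer-term topics, and a drop suggests that the forum's points of interest are shifting.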

Details

Title
Interest Points Analysis for Internet Forum Based on Long-Short Windows Similarity
Author
Ju, Xinghai; Lu, Jicang; Luo, Xiangyang; Zhou, Gang; Wang, Shiyu; Li, Shunhang; Yang, Yang
Pages
3247-3267
Section
ARTICLE
Publication year
2022
Publication date
2022
Publisher
Tech Science Press
ISSN
1546-2218
e-ISSN
1546-2226
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2646011095
Copyright
© 2022. This work is licensed under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.