This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
In recent years, with the rise of social media, forums have become dedicated communities for users who share the same interests. An increasing number of users post reviews in forums [1]. These reviews cover a wide variety of content, ranging from breaking news and discussions on various topics to posts about one’s personal life and the sharing of activities and interests [2]. As significant platforms for user discussion, some forums maintain a high level of user activity. In addition, feedback from forum users is usually an important source of information for potential consumers learning about product features. Enterprises also aim to discover product defects and real users’ requirements via reviews in forums.
Because initial exposure to erroneous information provokes a strong negative response, its influence is difficult to correct later. Once a network agrees on what happened, the collective memory becomes relatively resistant to competing information [3]. Thus, fake reviews in forums have become a serious problem for forum users and enterprises.
Many current studies identify fake reviews indirectly by recognizing forum spammers based on behavioral features or sentiment analysis methods [4–7]. However, forum spammers constantly update their technology or change their posting methods to avoid detection by fake review recognition systems, which makes many of these methods ineffective. Although forum spammers try to disguise themselves as ordinary users, their purposeful posting eventually exhibits behaviors that differ from those of ordinary users. Therefore, this paper changes the research target from understanding abnormal reviews and the suspicious relationships among forum spammers to discovering how they must behave (follow or be followed) to achieve their monetary goals.
First, we classify forum users as automated spammers, marketing spammers, and normal users according to their different behavior patterns. Automated spammers are forum users who are controlled by spam software. They disguise themselves as normal users who display an intention to purchase a product or express dissatisfaction with it. Normally, automated spammers mislead forum users by posting reviews with a biased emotional tendency. Marketing spammers are real users who are hired by a spam company. In contrast to automated spammers, marketing spammers disguise themselves as leading users in forums to promote products. They post deep, detailed, and positive reviews to overstate the quality of those products. In general, the more detailed the analysis, the more useful the information is to forum users [8–10]. Moreover, marketing spammers, as a new but contemptible marketing mode, are emerging in many forums [11].
Then, we propose a behavior-driven automated spammer recognition (ASR) model and a marketing spammer recognition (MSR) model to recognize forum spammers based on the above three types of forum users. The final experimental results illustrate that our behavior-driven recognition models are able to accurately detect forum spammers.
The paper is organized as follows: Section 2 reviews the related works. In Section 3, we define some variables to measure the behavior features of forum users. The proposed ASR and MSR models are introduced in Section 4. Subsequently, we describe the experimental dataset and discuss the main experimental results in Section 5. Finally, we conclude with a summary in Section 6.
2. Related Works
At present, research on recognizing spammers and fake reviews mainly focuses on social media such as Twitter. Some e-business websites, such as Amazon and Taobao, have also received growing research attention. In terms of recognizing forum spammers, a few studies have been conducted in recent years, mainly focusing on the recognition of fake forums and forum spam automator tools. Some recognition methods based on abnormal text content have also been proposed. Some researchers use abnormal URL characteristics in reviews and the link structure of the graph rooted at the posted URL to recognize posts from forum spammers [12, 13]. Additionally, content unrelated to the target posts in the forum has been used to recognize forum spammers [14]. Shin et al. [15] uncovered some features and operational mechanisms of a forum spam automator tool named XRumer, which provided some ideas for recognizing the forum spammers who use this tool. Another approach uses features such as the submission time of replies, thread activeness, position of replies, and spamicity of a forum user’s first post to construct a forum spammer recognition model [5]. The significant differences in action time and action frequency between forum spammers and normal users have also been used to construct a forum spammer recognition model [7]. The classifier in [6], which integrates semantic analysis, performed promisingly in a real-world case study; this was confirmed with both supervised and unsupervised learning techniques by comparing nonsemantic and semantic analyses. By analyzing the features of forum users, forum spammers, and forums, the authors of [16] found that every forum has many fake reviews, including some forums with good reputations.
However, our work found that the methods mentioned above no longer work well. For instance, most users can now easily distinguish crude fake websites with many advertisements, so the number of fake reviews containing URLs [12, 13] has become much lower. Additionally, we found that the recognition performance of the method in [14] is compromised if a large number of forum spammers have occupied the forums. In our study, the spamicity of the first post, an abnormal feature from [5], no longer works for recognizing forum spammers. At the same time, we found that marketing spammers exhibit an abnormal behavioral feature similar to the submission time of replies in [5], but we cannot find the same behavioral pattern among automated spammers. The method in [16], which recognizes spam pages based on spam content features, is still effective, but it cannot efficiently recognize forum spammers whose reviews are largely similar to those of normal users. In [6], the authors mentioned that once a mission is finished, a paid spam poster normally discards the user ID and never uses it again; potential paid spam posters are not willing to continue their activities for a long time.
In recent years, research on spammers in social media and e-business websites has been increasing. Liu et al. [17] proposed a two-stage cascading model, named ProZombie, which balanced effectiveness and accuracy well in recognizing spammers in Weibo. In [18], message content, user behavior, and social relationship information were fully used to recognize spammers in Weibo. Hayati et al. [19] proposed using a self-organizing map and neural networks to determine the features of spammers on the Internet. They classified spammers into four categories based on their behavioral patterns: content submitters, profile editors, content viewers, and mixed behavior. Radford et al. [20] constructed an unsupervised representation learning system, which reached an accuracy of 91.8% in sentiment analysis by using reviews in Amazon as training datasets. Furthermore, the authors in [12, 21] recognized fake reviews via differences in the emoticons, URLs, @ symbols, and photos in reviews from spammers and normal users. Dewang and Singh [22] proposed a spam detection framework combining the PageRank algorithm to detect the spam host of websites. In [6], the authors distinguished fake reviews by applying word segmentation to the text and calculating the emotional tendency. Jiang et al. [23] and Ratkiewicz et al. [24] found that spammers have a “synchronized” behavioral pattern for a particular target and that it is significantly different from that of normal users. A spam detection model called SkyNet, which uses user social networks and the photos posted in reviews, has been proposed by Sun and Loparo [25]. In [26], the final recognition accuracy for spammers was improved by 9.73% by integrating the social network and content information into a matrix decomposition-based learning model. The above recognition methods for spammers in social media and e-business websites are well developed. However, our work found that these methods cannot be directly used to recognize forum spammers because they are not well adapted to the special behavioral patterns of forum spammers.
Our work is inspired by the idea of using noncontent-based features. Furthermore, Asghar et al. [27] also illustrated the effectiveness of spam-related features in improving the performance of spam detection. Thus, we construct behavior-driven forum spammer recognition models by understanding how forum spammers must behave (follow or be followed) for monetary purposes. To the best of our knowledge, this work is among the first to construct forum spammer recognition models based on forum users’ different behavioral patterns. In addition, we achieved promising experimental results on real-world forum datasets.
3. Observed Features
Automated spammers and marketing spammers often cooperate with each other to mislead forum users via the different roles they play in forums. In addition, the differences in roles they play inevitably lead to differences in the behavioral patterns they exhibit in forums. To recognize these forum spammers, in this section, the features of abnormal behaviors that are likely to be linked with the forum spammers are proposed and some variables are defined to measure these features. Subsequently, these variables can be exploited in our recognition models.
3.1. Automated Spammer Features
In this section, we perform a statistical analysis to investigate the objective features that are useful in capturing the reply behavior of automated spammers, and we define a variable for each feature. The four features of automated spammers are described as follows.
3.1.1. Reply Manner
The work in [6] reported that spammers usually tend to post new comments because they do not have enough patience to read the comments and replies of others. The authors also proposed the response indicator (whether the comment is a new comment or a reply to another comment) to capture this abnormal behavior. However, automated spammers in forums never post any replies to the comments of others, and they only post new replies. To recognize this more extreme abnormal behavioral pattern in forums, we define the reply manner (RM) indicator.
As shown in Table 1, in the labelled dataset, 100% of automated spammers never reply to another comment (RM = 0), whereas only 1.68% of normal users behave this way. On the contrary, most normal users in forums not only post new replies but also post many replies to the comments of others.
Table 1
Reply indicators.
| RM | 0 (%) | 1 (%) |
| Automated spammers | 100 | 0 |
| Normal users | 1.68 | 98.32 |
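As an illustration only, the reply manner indicator could be computed from reply records as in the following Python sketch. The record layout (an author and an optional parent author) and the function name `reply_manner` are assumptions made for this example and are not taken from the original study; the coding RM = 0 for users who never reply to another user's comment follows Table 1.

```python
from typing import Iterable, Optional, Tuple

# Each reply is (user, parent_user); parent_user is None when the reply
# answers the original post directly rather than another user's comment.
# This record layout is a hypothetical one chosen for the example.
Reply = Tuple[str, Optional[str]]


def reply_manner(replies: Iterable[Reply], user: str) -> int:
    """Return 1 if `user` ever replied to another user's comment, else 0."""
    for author, parent_user in replies:
        if author == user and parent_user is not None:
            return 1
    return 0


# Hypothetical records, for illustration only.
sample = [
    ("bot_a", None),
    ("bot_a", None),
    ("alice", None),
    ("alice", "bob"),
]
print(reply_manner(sample, "bot_a"))  # 0
print(reply_manner(sample, "alice"))  # 1
```

A binary indicator of this kind is cheap to compute, but Table 1 also shows that 1.68% of normal users share the value 0, so it is better treated as one feature among several than as a standalone rule.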
3.1.2. Replies Number
Posting a large number of replies within a single minute also indicates abnormal behavior. As shown in Table 2, in the labelled dataset, some automated spammers (0.39%) post 30 or more replies in a single minute, which means that they post a reply every 2 seconds on average. To capture this abnormal behavioral pattern, we define the variable MRN, whose distribution is reported in Table 2.
Table 2
Percentage of the number of replies.
| MRN ≥ | 10 (%) | 20 (%) | 30 (%) |
| Automated spammers | 6.29 | 0.98 | 0.39 |
| Normal users | 1.63 | 0.16 | 0 |
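A minimal sketch of how a per-minute reply count could be obtained is given below. It assumes that MRN is the largest number of replies a user posts within one calendar minute and that each reply carries a timestamp; both the bucketing by calendar minute (rather than a sliding 60-second window) and the function name `max_replies_per_minute` are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Tuple


def max_replies_per_minute(replies: Iterable[Tuple[str, datetime]], user: str) -> int:
    """Largest number of replies `user` posted within one calendar minute."""
    per_minute = Counter(
        ts.replace(second=0, microsecond=0)  # truncate timestamp to the minute
        for author, ts in replies
        if author == user
    )
    return max(per_minute.values(), default=0)


# Hypothetical records, for illustration only.
sample = [
    ("bot_a", datetime(2016, 9, 1, 10, 0, 2)),
    ("bot_a", datetime(2016, 9, 1, 10, 0, 4)),
    ("bot_a", datetime(2016, 9, 1, 10, 0, 6)),
    ("bot_a", datetime(2016, 9, 1, 10, 5, 0)),
    ("alice", datetime(2016, 9, 1, 10, 0, 30)),
]
print(max_replies_per_minute(sample, "bot_a"))  # 3
```

A sliding window would catch bursts that straddle a minute boundary, at the cost of a slightly more involved implementation.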
3.1.3. Cooccurrence Frequency
To avoid being detected, automated spammers in the forum frequently draw different reply content from their databases to reply to different original posts. It has now become rare for a forum spammer to reply to an original post repeatedly with the same content. However, spam teams made up of different automated spammers have started to post fake replies to target posts continuously, which leads to cooccurrence behavior: many automated spammers appear together at the same time or within a short time period. As shown in Table 3, in our labelled dataset, 59.14% of the automated spammers post replies together with another forum user within one minute at least five times. In contrast, only 3.52% of normal users show the same behavioral pattern. Therefore, we define the cooccurrence frequency (CF).
Table 3
Percentage of the cooccurrence frequency.
| CF ≥ | 3 (%) | 4 (%) | 5 (%) | 6 (%) | 7 (%) |
| Automated spammers | 74.26 | 64.44 | 59.14 | 54.42 | 40.47 |
| Normal users | 8.23 | 5.25 | 3.52 | 2.72 | 2.20 |
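The cooccurrence idea can be sketched as follows. This is one possible reading, not the definition used in the study: it assumes that two users cooccur when they reply to the same original post within one minute of each other, counts each pair at most once per post, and takes a user's CF to be the highest count over all partners. The tuple layout and the name `cooccurrence_frequency` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import combinations


def cooccurrence_frequency(replies, window=timedelta(minutes=1)):
    """replies: iterable of (thread_id, user, timestamp). Returns {user: CF}."""
    by_thread = defaultdict(list)
    for thread_id, user, ts in replies:
        by_thread[thread_id].append((ts, user))

    pair_counts = defaultdict(int)
    for events in by_thread.values():
        events.sort()  # chronological order, so t2 >= t1 below
        seen_pairs = set()
        for (t1, u1), (t2, u2) in combinations(events, 2):
            if u1 != u2 and t2 - t1 <= window:
                seen_pairs.add(frozenset((u1, u2)))
        for pair in seen_pairs:
            pair_counts[pair] += 1  # each pair counted once per thread

    cf = defaultdict(int)
    for pair, count in pair_counts.items():
        for user in pair:
            cf[user] = max(cf[user], count)
    return dict(cf)


# Hypothetical records: two bots reply 20 seconds apart in five threads.
base = datetime(2016, 9, 1, 10, 0, 0)
sample = []
for thread in range(5):
    sample.append((thread, "bot_a", base + timedelta(hours=thread)))
    sample.append((thread, "bot_b", base + timedelta(hours=thread, seconds=20)))
sample.append((0, "alice", base + timedelta(minutes=30)))

cf = cooccurrence_frequency(sample)
print(cf.get("bot_a", 0), cf.get("alice", 0))  # 5 0
```

The pairwise comparison is quadratic in the number of replies per thread, which is acceptable for a sketch but would need a sliding-window pass for very long threads.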
3.1.4. Duplicate Replies (DR)
Automated spammers usually post duplicate replies under different original posts [28]. Our study finds that a few normal users also post some duplicate replies, such as “I support the original poster.” However, the higher the ratio of a user’s duplicate replies, the more likely the user is an automated spammer in the forum. To capture this abnormal behavior, we define the duplicate replies ratio (DRR).
As shown in Table 4, 55.40% of automated spammers have a duplicate replies ratio of at least 0.5, compared with only 3.65% of normal users.
Table 4
Percentage of the ratio of duplicate replies.
| DRR ≥ | 0.3 (%) | 0.4 (%) | 0.5 (%) | 0.6 (%) | 0.7 (%) |
| Automated spammers | 58.74 | 56.19 | 55.40 | 44.79 | 36.74 |
| Normal users | 15.93 | 8.70 | 3.65 | 1.05 | 0.03 |
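One way to compute a duplicate replies ratio is sketched below. It assumes that a reply is a duplicate when its text, after collapsing whitespace and lowercasing, matches at least one other reply by the same user, and that the ratio is the share of such replies among all of the user's replies. This normalization and the name `duplicate_replies_ratio` are assumptions for illustration.

```python
from collections import Counter
from typing import Sequence


def duplicate_replies_ratio(texts: Sequence[str]) -> float:
    """Share of one user's replies whose normalized text occurs more than once."""
    normalized = [" ".join(t.split()).lower() for t in texts]
    if not normalized:
        return 0.0
    counts = Counter(normalized)
    duplicated = sum(n for n in counts.values() if n > 1)
    return duplicated / len(normalized)


# Hypothetical replies from a single user, for illustration only.
sample = [
    "Great car!",
    "great   car!",
    "I support the original poster.",
    "Nice review",
]
print(duplicate_replies_ratio(sample))  # 0.5
```

Exact matching after normalization misses near-duplicates with small edits; a similarity threshold on the text would be a natural extension.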
3.2. Marketing Spammer Features
As discussed before, marketing spammers usually disguise themselves as the leading users in the forums. These spammers not only post replies but also publish many original posts as do normal users. In other words, they are real forum users but they do what the spammers always do. Therefore, it is difficult to recognize marketing spammers using a recognition model that is constructed based on the abnormal behavioral features of automated spammers. In this section, three abnormal behavior features are identified in terms of the posting behavior of marketing spammers.
3.2.1. Posting in Many Forums
Due to the increasingly strict registration process in forums, a forum account, especially a reputable one, is becoming a rare resource for marketing spammers. To maximize their commercial interests, marketing spammers normally use the same forum accounts in several forums. In other words, marketing spammers may publish fake original posts for different targeted products in several forums. As shown in Table 5, in the labelled dataset, the average number of forums in which marketing spammers publish original posts (68.45) is much higher than that of normal users (3.56). Therefore, the variable
Table 5
The number of forums in which marketing spammers publish original posts.
| Marketing spammer | The number of forums |
| MS1 | 45 |
| MS2 | 33 |
| MS3 | 57 |
| MS4 | 132 |
| MS5 | 73 |
| MS6 | 35 |
| MS7 | 52 |
| MS8 | 66 |
| MS9 | 75 |
| MS10 | 136 |
| MS11 | 49 |
| Average | 68.45 |
| Average (normal user) | 3.56 |
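The forum count per user and the average in Table 5 can be reproduced with a short Python sketch. The input layout, a list of (user, forum) pairs with one entry per original post, and the function name `forums_per_user` are assumptions; the eleven counts are those listed in Table 5.

```python
from collections import defaultdict
from statistics import mean


def forums_per_user(original_posts):
    """original_posts: iterable of (user, forum). Returns {user: distinct forum count}."""
    forums = defaultdict(set)
    for user, forum in original_posts:
        forums[user].add(forum)
    return {user: len(names) for user, names in forums.items()}


# Forum counts of MS1 to MS11 as listed in Table 5.
TABLE_5 = [45, 33, 57, 132, 73, 35, 52, 66, 75, 136, 49]
average = mean(TABLE_5)
ratio = average / 3.56  # 3.56 is the normal-user average from Table 5

print(round(average, 2))  # 68.45
print(round(ratio, 1))    # 19.2
```

On these figures, the labelled marketing spammers post in roughly 19 times as many forums as normal users on average.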
3.2.2. Posting Intensity Is High and Uneven
To strengthen the marketing effort, marketing spammers usually publish a series of original posts and actively interact with other forum users during the marketing period. In this period, marketing spammers promote the targeted product by spreading a large number of positive word-of-mouth recommendations. Moreover, they sometimes publish many negative word-of-mouth comments to slander their competitors. All of this serves their marketing purpose. Therefore, once the marketing period is finished, the activity of marketing spammers declines sharply or the users even disappear completely. Moreover, the time at which marketing spammers publish original posts is usually highly correlated with the targeted product’s marketing events. As shown in Figure 1, a new car named Tiggo7 went on sale in September 2016, and as the search number (yellow line) rose, the activity of marketing spammers also began to increase. The average number of postings of marketing spammers reached its maximum 3 months after the new car was put on the market. However, with the decline of the search numbers and the end of the marketing period, the average number of postings by marketing spammers declined sharply or even reached zero. In contrast, the average number of postings of normal users was always stable and low. That is, the posting and replying activities of marketing spammers show alternating or cyclical fluctuations. As such, two variables
[figure omitted; refer to PDF]
In addition, we notice that a few forum users are automobile evaluators who posted many original posts and replies in many forums. Their behavior patterns are similar to those of marketing spammers, so they may be considered marketing spammers by the MSR model. As a special user group in the automobile forum, these automobile evaluators are not considered in our experiments because there are no such users in other types of forums. Eventually, the ASR and MSR models recognized 41 forum spammers in all the Baojun610 forums. The experimental results show that our behavior-driven recognition models are effective and accurate.
More interestingly, we noticed that a forum user named “Baidu Knows” (in Chinese), indicated by the green circle in Figure 4, and the forum user named “Secret Passage” (in Chinese), indicated by the yellow circle in Figure 4, surprisingly posted original posts in 140 and 118 forums, respectively. As we can see in Figure 3, they completely stopped posting after many original posts. The number of original posts that they posted is significantly higher than the average number of original posts of other forum users. We then accessed their user profiles on the Bitauto website, as seen in Figures 5 and 6.
[figure omitted; refer to PDF]
As shown in Figure 5, the forum user named “Baidu Knows” (in Chinese) posted many original posts in forums on March 25, 2015. In the morning, he complained that his automobile, a VW Golf, could not be started. Then, in the afternoon, he watched a DCD in his automobile, an Infiniti QX70. His last original post was posted on August 04, 2017. His original posts and replies have since been deleted by forum officials, and the account has been closed. This supports the effectiveness of our MSR model and the precision of its recognition result.
As seen from Figure 6, the forum user named “Secret Passage” (in Chinese) is an officially verified forum user who has a high level of influence. He posted original posts in many forums in a single day, and this behavior is similar to that of the forum user named “Baidu Knows” (in Chinese). He not only praised his automobile, a Geely Vision that has been driven 60,000 km with few serious problems so far, but also complained about the idling problem of his Buick Regal automobile, which has been driven 20,000 km. In addition, he also wishes to sell his Senova D50 automobile. From his contradictory words, we can infer that he is a forum spammer.
Table 9
Comparison experiment with other models.
| Model | Precision | Recall | F1-score |
| Hu’s model [4] | 0.886 | 0.918 | 0.902 |
| Chen’s model [5] | 0.878 | 0.922 | 0.897 |
| Yu’s model [18] | 0.924 | 0.943 | 0.933 |
| The proposed architecture | 0.964 | 0.938 | 0.951 |
5.2.3. Experiment 3: Comparison with Other Methods
In this section, the proposed architecture is compared with three representative models [4, 5, 18]. Table 9 compares the precision, recall, and F1-score of each model on the Tiggo7 dataset. The proposed model achieves the highest precision (0.964) and F1-score (0.951), although Yu’s model [18] attains a slightly higher recall (0.943 versus 0.938). We believe that this advantage arises because we take more account of users’ behavior features, which suggests that the behavior feature-based method is better suited to this task than the previous methods.
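For reference, the F1-score is the harmonic mean of precision and recall. The following Python sketch recomputes it from the rounded precision and recall values in Table 9; the function name `f1_score` is chosen for this example, and the paper does not state how its own F1 values were averaged.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in Table 9.
TABLE_9 = {
    "Hu": (0.886, 0.918),
    "Chen": (0.878, 0.922),
    "Yu": (0.924, 0.943),
    "Proposed": (0.964, 0.938),
}

for name, (p, r) in TABLE_9.items():
    print(name, round(f1_score(p, r), 3))
```

Recomputed this way, the values for Hu’s model, Yu’s model, and the proposed architecture round to the reported 0.902, 0.933, and 0.951. For Chen’s model the rounded inputs give 0.899 rather than the reported 0.897, which may reflect rounding of the inputs or a different averaging scheme in the original experiments.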
5.2.4. Experiment 4: Analysis of Running Time
Finally, we measure the running time of the proposed model, including feature extraction and the two-level model, as shown in Table 10. Feature extraction takes up most of the time (12.85 of 16.16 min). This is because we need to calculate not only the personal behavior features of users but also the interactive behavior features between different users, which increases the computational burden. In addition, according to the feature extraction method described in Section 3, we can infer that the cost of feature extraction depends on the total number of forum users, the number of forum posts, and the length of forum posts.
Table 10
Running time of the proposed model.
| Total time (min) | Feature extraction (min) | Two-level model: ASR (min) | Two-level model: MSR (min) |
| 16.16 | 12.85 | 1.59 | 1.72 |
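The share of each stage in the total running time follows directly from Table 10, as the short Python check below shows. The dictionary keys are labels chosen for this example.

```python
# Stage timings in minutes, as reported in Table 10.
timings_min = {"feature_extraction": 12.85, "asr": 1.59, "msr": 1.72}

total = sum(timings_min.values())
shares = {stage: minutes / total for stage, minutes in timings_min.items()}

print(round(total, 2))  # 16.16
for stage, share in shares.items():
    print(stage, f"{share:.1%}")
```

Feature extraction accounts for about 79.5% of the total, while the ASR and MSR models together take about 20.5%.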
6. Conclusion
Fake reviews in forums are a persistent obstacle for enterprises that want to make effective use of the information in forums. Forum spammers constantly update their technology or change their posting methods to avoid detection by fake review recognition systems. Although forum spammers try to disguise themselves as ordinary users, their purposeful posting eventually exhibits behaviors that differ from those of ordinary users. Therefore, this paper changes the research target from understanding abnormal reviews and the suspicious relationships among forum spammers to discovering how they must behave (follow or be followed) to achieve their monetary goals. Based on different behavior features, forum spammers can be classified into automated forum spammers and marketing forum spammers. The support vector machine-based ASR model and the k-means clustering-based MSR model are developed, and their applications are demonstrated by using car forum reviews written in Chinese. The final experimental results illustrate the effectiveness of our behavior-driven recognition models.
Acknowledgments
This research was supported by the National Natural Science Foundation of China (no. 72101075 and 72101078), the Fundamental Research Funds for the Central Universities (nos. JZ2020HGQA0168 and JZ2021HGQA0204), and the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (no. 71521001).
[1] X. T. Vu, P. Morizet-Mahoudeaux, A User-Centered Approach for Integrating Social Data into Groups of Interest, 2015.
[2] B. Zhao, Z. Zhang, W. Qian, A. Zhou, "Identification of collective viewpoints on microblogs," Data & Knowledge Engineering, vol. 87 no. 9, pp. 374-393, DOI: 10.1016/j.datak.2013.05.003, 2013.
[3] L. Spinney, "How Facebook, fake news and friends are warping your memory," Nature, vol. 543, pp. 168-170, DOI: 10.1038/543168a, 2017.
[4] X. Hu, T. Jiliang, G. Huiji, L. Huan, "Social spammer detection with sentiment information," Proceedings of the IEEE International Conference on Data Mining, IEEE, pp. 180-189, DOI: 10.1109/icdm.2014.141.
[5] Y. R. Chen, H. H. Chen, "Opinion spam detection in web forum: a real case study," Proceedings of the International Conference, pp. 173-183, DOI: 10.1145/2736277.2741085.
[6] C. Chen, K. Wu, V. Srinivasan, X. Zhang, "Battling the internet water army: detection of hidden paid posters," 2011. http://arxiv.org/abs/1111.4297
[7] P. Hayati, K. Chai, V. Potdar, A. Talevski, "Behaviour-based web spambot detection by utilising action time and action frequency," pp. 351-360, DOI: 10.1007/978-3-642-12165-4_28.
[8] J. P. Singh, S. Irani, N. P. Rana, Y. K. Dwivedi, S. Saumya, P. Kumar Roy, "Predicting the “helpfulness” of online consumer reviews," Journal of Business Research, vol. 70 no. 70, pp. 346-355, DOI: 10.1016/j.jbusres.2016.08.008, 2017.
[9] Y.-M. Li, H.-M. Chen, J.-H. Liou, L.-F. Lin, "Creating social intelligence for product portfolio design," Decision Support Systems, vol. 66, pp. 123-134, DOI: 10.1016/j.dss.2014.06.013, 2014.
[10] S. M. Mudambi, D. Schuff, "What makes a helpful online review? a study of customer reviews on amazon.com," Social Science Electronic Publishing, vol. 34 no. 1, pp. 185-200, 2012.
[11] Y. Chen, J. Xie, "Online consumer review: word-of-mouth as a new element of marketing communication mix," Management Science, vol. 54 no. 3, pp. 477-491, DOI: 10.1287/mnsc.1070.0810, 2008.
[12] M. Ghannoum, "Prevalence and mitigation of forum spamming," vol. 34 no. 17, pp. 2309-2317, DOI: 10.1109/INFCOM.2011.5935048.
[13] Y. Shin, S. Myers, M. Gupta, P. Radivojac, "A link graph-based approach to identify forum spam," Security and Communication Networks, vol. 8 no. 2, pp. 176-188, DOI: 10.1002/sec.970, 2015.
[14] Y. J. Lee, J.-M. Shim, H.-G. Cho, G. Woo, "Detecting and visualizing the dispute structure of the replying comments in the internet forum sites," pp. 456-463, DOI: 10.1109/cyberc.2010.90.
[15] Y. Shin, M. Gupta, S. Myers, "The nuts and bolts of a forum spam automator."
[16] Y. Niu, W. Yi-Min, C. Hao, M. Ming, H. Francis, "A quantitative study of forum spamming using context-based analysis."
[17] H. Liu, Z. Yuchao, L. Hao, Wu Junjie, W. Zhiang, Z. Xu, "How many zombies around you," Proceedings of the 2013 International Conference on Data Mining, pp. 1133-1138, DOI: 10.1109/icdm.2013.166.
[18] D. Yu, N. Chen, F. Jiang, B. Fu, A. Qin, "Constrained NMF-based semi-supervised learning for social media spammer detection," Knowledge-Based Systems, vol. 125, pp. 64-73, DOI: 10.1016/j.knosys.2017.03.025, 2017.
[19] P. Hayati, V. Potdar, K. Chai, A. Talevski, "Characterization of web spambots using self organizing maps," Computer Systems Science and Engineering, vol. 26 no. 2, 2011.
[20] A. Radford, R. Jozefowicz, I. Sutskever, "Learning to generate reviews and discovering sentiment," 2017. http://arxiv.org/abs/1704.01444
[21] L. Akoglu, M. Mcglohon, C. Faloutsos, "Oddball: spotting anomalies in weighted graphs," pp. 410-421, DOI: 10.1007/978-3-642-13672-6_40.
[22] R. K. Dewang, A. K. Singh, "State-of-art approaches for review spammer detection: a survey," Journal of Intelligent Information Systems, vol. 50 no. 2, pp. 231-264, DOI: 10.1007/s10844-017-0454-7, 2018.
[23] M. Jiang, C. Peng, B. Alex, F. Christos, Y. Shiqiang, "Inferring strange behavior from connectivity pattern in social networks," Proceedings of the 2014 Pacific-Asia Conference on Knowledge Discovery and Data Mining.
[24] J. Ratkiewicz, M. Conover, B. G. Alves, A. Flammini, F. Menczer, "Detecting and tracking political abuse in social media."
[25] Y. Sun, K. Loparo, "Opinion spam detection based on heterogeneous information network," DOI: 10.1109/ictai.2019.00277.
[26] S. Ghosh, V. Bimal, K. Farshad, S. Naveen Kumar, K. Gautam, B. Fabricio, G. Niloy, G. Krishna Phani, "Understanding and combating link farming in the twitter social network," Proceedings of the 21st International Conference on World Wide Web, ACM, pp. 61-70, DOI: 10.1145/2187836.2187846.
[27] M. Z. Asghar, A. Ullah, S. Ahmad, A. Khan, "Opinion spam detection framework using hybrid classification scheme," Soft computing, vol. 24 no. 5, pp. 3475-3498, DOI: 10.1007/s00500-019-04107-y, 2020.
[28] E. P. Lim, V.-A. Nguyen, N. Jindal, B. Liu, H. Wirawan Lauw, "Detecting product review spammers using rating behaviors," Proceedings of the ACM International Conference on Information and Knowledge Management, ACM, pp. 939-948, DOI: 10.1145/1871437.1871557.
[29] JATO. http://www.jato.com/global-car-sales-5-6-2016-due-soaring/
[30] iResearch, "The monthly report about internet advertising of chinese automotive industry," 2016. in Chinese
[31] A. Mukherjee, B. Liu, N. Glance, "Spotting fake reviewer groups in consumer reviews," Proceedings of the International Conference on World Wide Web, ACM, pp. 191-200, DOI: 10.1145/2187836.2187863.
Copyright © 2021 Han Su et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0/
Abstract
Forum comments are valuable information for enterprises seeking to discover public preferences and market trends. However, extensive marketing and malicious attack behaviors in forums are a persistent obstacle for enterprises that want to make effective use of this information, and forum spammers constantly update their technology to prevent detection. Therefore, how to accurately recognize forum spammers has become an important issue. To recognize forum spammers accurately, this paper changes the research target from understanding abnormal reviews and the suspicious relationships among forum spammers to discovering how they must behave (follow or be followed) to achieve their monetary goals. First, we classify forum spammers into automated forum spammers and marketing forum spammers based on different behavioral features. Then, we propose a support vector machine-based automated spammer recognition (ASR) model and a k-means clustering-based marketing spammer recognition (MSR) model. The experimental results on a real-world labelled dataset illustrate the effectiveness of our methods in distinguishing spammers from common users. To the best of our knowledge, this work is among the first to construct behavior-driven recognition models according to the different behavioral patterns of forum spammers.
Ni, Xin 2; Zhao, Fang 3
1 School of Management, Hefei University of Technology, Hefei 230009, China
2 Department of Design, Information System and Inventive Processes, INSA de Strasbourg, Strasbourg, France
3 School of Management, Hefei University of Technology, Hefei 230009, China; Department of Information Systems and Analytics, National University of Singapore, 13 Computing Drive, Singapore





