Anomalous Behavior Analysis in Social Networks and Consumer Review Websites

Date

2018-12

Abstract

The web and social media influence nearly every aspect of today's world, generating a tremendous amount of data that calls for new insights into contemporary society. Widespread use of and dependence on these media have led a small yet powerful group of users to exploit them to sway public sentiment for selfish gain. To check the infiltration of these anomalous users, we face two challenges: (1) studying opinions to learn their behaviors, and (2) detecting opinion spam to reduce their effects. We study the behaviors of anomalous users in reviews collected from Yelp and Amazon and in social data from Twitter. Using Yelp reviews, we explore the temporal behaviors of spammers. Social spammers penetrate networks easily and are difficult to filter because they adapt to changing filtering algorithms. Using a Twitter dataset, we study their success rate, fraudulence, and content-posting activities. We find that successful spammers have a stronger friendship base and post a mix of spam and non-spam content. We exploit the behaviors learned from Yelp and Twitter to develop spam detection algorithms. Our novel temporal features are instrumental in detecting spam in consumer reviews, performing better than existing state-of-the-art approaches. For spam detection on Twitter, we combine content-based features with a graph-based approach that embodies social relationships. Biased random walks and language models significantly improve the classification. We further characterize the review system of Amazon, a leading online marketplace. We use verified purchases as a popularity index to evaluate models of popularity prediction. We find that it is indeed possible to analyze behaviors and develop methods that perform well for anomaly detection on the web, even in these challenging situations.

Keywords

Spam Detection, Anomaly Detection, Machine Learning, Deep Learning

Citation

Portions of this document appear in: KC, Santosh, and Arjun Mukherjee. "On the temporal dynamics of opinion spamming: Case studies on Yelp." In Proceedings of the 25th International Conference on World Wide Web, pp. 369-379. International World Wide Web Conferences Steering Committee, 2016. And in: Santosh, K. C., Suman Kalyan Maity, and Arjun Mukherjee. "Enwalk: Learning network features for spam detection in Twitter." In International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation, pp. 90-101. Springer, Cham, 2017. And in: Santosh, K. C., Sohan De Sarkar, and Arjun Mukherjee. "Product Popularity Modeling Via Time Series Embedding." In 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), pp. 650-653. IEEE, 2018.