DOI: 10.1145/3447548.3467390 (KDD Conference Proceedings, research article, Public Access)

PETGEN: Personalized Text Generation Attack on Deep Sequence Embedding-based Classification Models

Published: 14 August 2021

Abstract

What should a malicious user write next to fool a detection model? Identifying malicious users is critical to ensuring the safety and integrity of internet platforms. Several deep learning-based detection models have been created. However, malicious users can evade deep detection models by manipulating their behavior, rendering these models of little use. The vulnerability of such deep detection models against adversarial attacks is unknown. Here we create a novel adversarial attack model against deep user-sequence embedding-based classification models, which use the sequence of a user's posts to generate user embeddings and detect malicious users. In the attack, the adversary generates a new post to fool the classifier. We propose a novel end-to-end Personalized Text Generation Attack model, called PETGEN, that simultaneously reduces the efficacy of the detection model and generates posts with several key desirable properties. Specifically, PETGEN generates posts that are personalized to the user's writing style, have knowledge of a given target context, are aware of the user's historical posts on that context, and encapsulate the user's recent topical interests. We conduct extensive experiments on two real-world datasets (Yelp and Wikipedia, both with ground-truth labels of malicious users) to show that PETGEN significantly reduces the performance of popular deep user-sequence embedding-based classification models. PETGEN outperforms five attack baselines in terms of text quality and attack efficacy in both white-box and black-box classifier settings. Overall, this work paves the path towards the next generation of adversary-aware sequence classification models.
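The attack setting described above (a classifier scores a user's post sequence; the adversary crafts the next post to lower that score) can be illustrated with a deliberately tiny white-box toy. Everything in this sketch is invented for illustration: the bag-of-words weights, the vocabulary, and the greedy word search. It is not PETGEN itself, whose generator is a trained personalized text-generation model; it only shows the shape of the optimization problem.

```python
import math

# Hypothetical "malicious user" scorer: a logistic bag-of-words model.
# All weights below are made up for this illustration.
WEIGHTS = {"free": 1.2, "click": 1.5, "winner": 1.8,
           "great": -0.8, "food": -0.6, "service": -0.7}
BIAS = 0.3

def malicious_prob(posts):
    """Probability that the author of this post sequence is malicious."""
    score = BIAS
    for post in posts:
        for word in post.lower().split():
            score += WEIGHTS.get(word, 0.0)
    return 1.0 / (1.0 + math.exp(-score))

def greedy_attack_post(history, candidate_words, length=3):
    """White-box attack: greedily pick words for a new post that most
    reduce the malicious probability of the extended sequence."""
    post = []
    for _ in range(length):
        best = min(candidate_words,
                   key=lambda w: malicious_prob(history + [" ".join(post + [w])]))
        post.append(best)
    return " ".join(post)

history = ["click for free winner prize"]
attack = greedy_attack_post(history, ["great", "food", "service", "free"])
before = malicious_prob(history)            # high: spam-like history
after = malicious_prob(history + [attack])  # lower after the crafted post
assert after < before
```

With these toy weights the greedy search simply stacks the most benign-looking word; the point of PETGEN is that real attacks must additionally keep the generated post fluent, on-topic, and consistent with the user's own writing history, which a bag-of-words search ignores.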

Supplementary Material

MP4 File (KDD21-rst2893.mp4)
Presentation video.



    Published In

    KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
    August 2021
    4259 pages
    ISBN:9781450383325
    DOI:10.1145/3447548

Publisher

Association for Computing Machinery, New York, NY, United States



    Author Tags

    1. adversarial text generation
    2. attack
    3. deep learning
    4. sequence classification
    5. user classification

    Qualifiers

    • Research-article

    Conference

    KDD '21

    Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)


    Article Metrics

• Downloads (last 12 months): 186
• Downloads (last 6 weeks): 27

Reflects downloads up to 08 Mar 2025.


Cited By

• (2024) A Survey on the Role of Crowds in Combating Online Misinformation: Annotators, Evaluators, and Creators. ACM Transactions on Knowledge Discovery from Data 19(1), 1-30. DOI: 10.1145/3694980 (29 Nov 2024)
• (2024) Fake News in Sheep's Clothing: Robust Fake News Detection Against LLM-Empowered Style Attacks. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 3367-3378. DOI: 10.1145/3637528.3671977 (25 Aug 2024)
• (2024) Adversarial Text Rewriting for Text-aware Recommender Systems. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 1804-1814. DOI: 10.1145/3627673.3679592 (21 Oct 2024)
• (2024) Corrective or Backfire: Characterizing and Predicting User Response to Social Correction. Proceedings of the 16th ACM Web Science Conference, 149-158. DOI: 10.1145/3614419.3644004 (21 May 2024)
• (2024) Hierarchical Query Classification in E-commerce Search. Companion Proceedings of the ACM Web Conference 2024, 338-345. DOI: 10.1145/3589335.3648332 (13 May 2024)
• (2023) Advances in AI for Safety, Equity, and Well-being on Web and Social Media. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 15444. DOI: 10.1609/aaai.v37i13.26811 (7 Feb 2023)
• (2023) Shilling Black-box Review-based Recommender Systems through Fake Review Generation. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 286-297. DOI: 10.1145/3580305.3599502 (6 Aug 2023)
• (2023) Attacking Fake News Detectors via Manipulating News Social Engagement. Proceedings of the ACM Web Conference 2023, 3978-3986. DOI: 10.1145/3543507.3583868 (30 Apr 2023)
• (2023) Reinforcement Learning-based Counter-Misinformation Response Generation: A Case Study of COVID-19 Vaccine Misinformation. Proceedings of the ACM Web Conference 2023, 2698-2709. DOI: 10.1145/3543507.3583388 (30 Apr 2023)
• (2022) Rank List Sensitivity of Recommender Systems to Interaction Perturbations. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 1584-1594. DOI: 10.1145/3511808.3557425 (17 Oct 2022)
