MaSS: Model-agnostic, Semantic and Stealthy Data Poisoning Attack on Knowledge Graph Embedding

Authors:

Fuli Feng

WWW '23: Proceedings of the ACM Web Conference 2023

Pages 2000 - 2010

https://doi.org/10.1145/3543507.3583203

Published: 30 April 2023

Abstract

Open-source knowledge graphs are attracting increasing attention. Nevertheless, this openness also raises the concern of data poisoning attacks, in which an attacker submits malicious facts to bias the predictions of knowledge graph embedding (KGE) models. Existing studies on such attacks adopt a clear-box setting and neglect the semantic information of the generated facts, which makes them ineffective in real-world scenarios. In this work, we consider a more rigorous setting and propose a model-agnostic, semantic, and stealthy data poisoning attack on KGE models from a practical perspective. The core idea is to inject indicative paths that lead the infected model to predict certain malicious facts. With the aid of the proposed opaque-box path injection theory, we show theoretically that the attack success rate under the opaque-box setting is determined by the plausibility of the triplets on the indicative path. Based on this, we develop a novel and efficient algorithm to search for paths that maximize the attack goal, satisfy certain semantic constraints, and preserve stealthiness, i.e., the normal functionality of the target KGE model is not affected even though it predicts wrong facts for certain queries. Through extensive evaluation on benchmark datasets with six typical knowledge graph embedding models as victims, we validate the attack's effectiveness in terms of attack success rate (ASR) under the opaque-box setting, as well as its stealthiness. For example, on FB15k-237, our attack achieves a ASR on DeepPath, with an average ASR over when attacking various KGE models under the opaque-box setting.
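
The abstract's central claim, that success depends on the plausibility of the triplets on an indicative path, can be illustrated with a toy scoring routine. The sketch below is not the paper's algorithm: it assumes a TransE-style surrogate score (negative L2 distance of h + r - t) and assumes a path is scored by its least plausible triplet. The function names, the aggregation rule, and the toy embeddings are all hypothetical and exist only for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)
Vector = Sequence[float]


def transe_plausibility(
    triple: Triple,
    entity_vecs: Dict[str, Vector],
    relation_vecs: Dict[str, Vector],
) -> float:
    """TransE-style surrogate score: -||h + r - t||_2 (higher is more plausible).

    Assumption: the abstract does not name the scoring function; TransE is
    used here only as a familiar stand-in.
    """
    h, r, t = triple
    hv, rv, tv = entity_vecs[h], relation_vecs[r], entity_vecs[t]
    return -math.sqrt(sum((a + b - c) ** 2 for a, b, c in zip(hv, rv, tv)))


def path_plausibility(
    path: List[Triple],
    entity_vecs: Dict[str, Vector],
    relation_vecs: Dict[str, Vector],
) -> float:
    """Score a path by its weakest triple (an assumed aggregation rule)."""
    return min(transe_plausibility(tr, entity_vecs, relation_vecs) for tr in path)


def rank_candidate_paths(
    paths: List[List[Triple]],
    entity_vecs: Dict[str, Vector],
    relation_vecs: Dict[str, Vector],
) -> List[List[Triple]]:
    """Return candidate paths sorted from most to least plausible."""
    return sorted(
        paths,
        key=lambda p: path_plausibility(p, entity_vecs, relation_vecs),
        reverse=True,
    )


# Toy 2-dimensional embeddings (made up for this example).
ENTITY_VECS = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (1.0, 1.0), "d": (5.0, 5.0)}
RELATION_VECS = {"r1": (1.0, 0.0), "r2": (0.0, 1.0)}

# Two candidate paths from "a" to "c": one consistent with the embeddings, one not.
PATH_GOOD = [("a", "r1", "b"), ("b", "r2", "c")]
PATH_BAD = [("a", "r1", "d"), ("d", "r2", "c")]

RANKED = rank_candidate_paths([PATH_BAD, PATH_GOOD], ENTITY_VECS, RELATION_VECS)
```

In this toy setup the path through `b` ranks first because both of its triplets fit the embeddings exactly, while the path through `d` is penalized by its implausible hops. A practical search would additionally need the semantic and stealthiness constraints the abstract mentions, which this sketch leaves out.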

Index Terms

Computing methodologies → Artificial intelligence → Knowledge representation and reasoning

Published In

WWW '23: Proceedings of the ACM Web Conference 2023

April 2023

4293 pages

ISBN:9781450394161

DOI:10.1145/3543507

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '23

Sponsor:

SIGWEB

WWW '23: The ACM Web Conference 2023

April 30 - May 4, 2023

Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
