ABSTRACT
Open-source knowledge graphs are attracting increasing attention. Nevertheless, the openness also raises the concern of data poisoning attacks, that is, the attacker could submit malicious facts to bias the prediction of knowledge graph embedding (KGE) models. Existing studies on such attacks adopt a clear-box setting and neglect the semantic information of the generated facts, making them fail to attack in real-world scenarios. In this work, we consider a more rigorous setting and propose a model-agnostic, semantic, and stealthy data poisoning attack on KGE models from a practical perspective. The main design of our work is to inject indicative paths to make the infected model predict certain malicious facts. With the aid of the proposed opaque-box path injection theory, we theoretically reveal that the attack success rate under the opaque-box setting is determined by the plausibility of triplets on the indicative path. Based on this, we develop a novel and efficient algorithm to search paths that maximize the attack goal, satisfy certain semantic constraints, and preserve certain stealthiness, i.e., the normal functionality of the target KGE will not be influenced although it predicts wrong facts given certain queries. Through extensive evaluation of benchmark datasets and 6 typical knowledge graph embedding models as the victims, we validate the effectiveness in terms of attack success rate (ASR) under opaque-box setting and stealthiness. For example, on FB15k-237, our attack achieves a ASR on DeepPath, with an average ASR over when attacking various KGE models under the opaque-box setting.
- M. Ashburner, C. Ball, J. Blake, D. Botstein, H. Butler, J. Cherry, A. Davis, K. Dolinski, S. Dwight, J. Eppig, M. Harris, D. Hill, L. Issel-Tarver, A. Kasarskis, Suzanna Lewis, J. C. Matese, J. Richardson, M. Ringwald, G. Rubin, and G. Sherlock. 2000. Gene Ontology: tool for the unification of biology. Nature Genetics 25 (2000), 25–29.Google ScholarCross Ref
- Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary G. Ives. 2007. DBpedia: A Nucleus for a Web of Open Data. In ISWC.Google Scholar
- Prithu Banerjee, Lingyang Chu, Yong Zhang, Laks V. S. Lakshmanan, and Lanjun Wang. 2021. Stealthy Targeted Data Poisoning Attack on Knowledge Graphs. In ICDE.Google Scholar
- Peru Bhardwaj, John D. Kelleher, Luca Costabello, and Declan O’Sullivan. 2021. Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021, Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (Eds.). Association for Computational Linguistics, 8225–8239. https://doi.org/10.18653/v1/2021.emnlp-main.648Google ScholarCross Ref
- Peru Bhardwaj, John D. Kelleher, Luca Costabello, and Declan O’Sullivan. 2021. Poisoning Knowledge Graph Embeddings via Relation Inference Patterns. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, Chengqing Zong, Fei Xia, Wenjie Li, and Roberto Navigli (Eds.). Association for Computational Linguistics, 1875–1888. https://doi.org/10.18653/v1/2021.acl-long.147Google ScholarCross Ref
- Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. Advances in neural information processing systems 26 (2013).Google Scholar
- Zongsheng Cao, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, and Qingming Huang. 2021. Dual Quaternion Knowledge Graph Embeddings. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 6894–6902.Google ScholarCross Ref
- Hanjun Dai, Hui Li, Tian Tian, Xin Huang, Lin Wang, Jun Zhu, and Le Song. 2018. Adversarial Attack on Graph Structured Data. In ICML(Proceedings of Machine Learning Research, Vol. 80), Jennifer G. Dy and Andreas Krause (Eds.). PMLR, 1123–1132. http://proceedings.mlr.press/v80/dai18b.htmlGoogle Scholar
- Kemal Davaslioglu and Yalin E Sagduyu. 2019. Trojan attacks on wireless signal classification with adversarial machine learning. In 2019 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN). IEEE, 1–6.Google ScholarDigital Library
- Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel. 2018. Convolutional 2d knowledge graph embeddings. In AAAI.Google Scholar
- Wenqi Fan, Tyler Derr, Xiangyu Zhao, Yao Ma, Hui Liu, Jianping Wang, Jiliang Tang, and Qing Li. 2021. Attacking Black-box Recommendations via Copying Cross-domain User Profiles. In ICDE.Google Scholar
- R. Fasoulis, Konstantinos Bougiatiotis, Fotis Aisopos, Anastasios Nentidis, and Georgios Paliouras. 2020. Error detection in Knowledge Graphs: Path Ranking, Embeddings or both¿CoRR abs/2002.08762 (2020). arXiv:2002.08762https://arxiv.org/abs/2002.08762Google Scholar
- Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and Harnessing Adversarial Examples. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http://arxiv.org/abs/1412.6572Google Scholar
- William L. Hamilton, Payal Bajaj, Marinka Zitnik, Dan Jurafsky, and Jure Leskovec. 2018. Embedding Logical Queries on Knowledge Graphs. In Advances in Neural Information Processing Systems, Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, and Roman Garnett (Eds.). 2030–2041.Google Scholar
- Yan Hong, Chenyang Bu, and Tingting Jiang. 2020. Rule-enhanced Noisy Knowledge Graph Embedding via Low-quality Error Detection. In 2020 IEEE International Conference on Knowledge Graph, ICKG 2020, Online, August 9-11, 2020, Enhong Chen and Grigoris Antoniou (Eds.). IEEE, 544–551. https://doi.org/10.1109/ICBK50248.2020.00082Google ScholarCross Ref
- Xiao Huang, Jingyuan Zhang, Dingcheng Li, and Ping Li. 2019. Knowledge Graph Embedding Based Question Answering. In WSDM, J. Shane Culpepper, Alistair Moffat, Paul N. Bennett, and Kristina Lerman (Eds.). ACM, 105–113. https://doi.org/10.1145/3289600.3290956Google ScholarDigital Library
- Xiao Huang, Jingyuan Zhang, Dingcheng Li, and Ping Li. 2019. Knowledge graph embedding based question answering. In WSDM.Google Scholar
- Denis Krompaß, Stephan Baier, and Volker Tresp. 2015. Type-constrained representation learning in knowledge graphs. In International semantic web conference. Springer, 640–655.Google Scholar
- Ni Lao, Tom M. Mitchell, and William W. Cohen. 2011. Random Walk Inference and Learning in A Large Scale Knowledge Base. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, 27-31 July 2011, John McIntyre Conference Centre, Edinburgh, UK, A meeting of SIGDAT, a Special Interest Group of the ACL. ACL, 529–539. https://aclanthology.org/D11-1049/Google Scholar
- Ni Lao, Jun Zhu, Xinwang Liu, Yandong Liu, and William W. Cohen. 2010. Efficient Relational Learning with Hidden Variable Detection. In Advances in Neural Information Processing Systems, John D. Lafferty, Christopher K. I. Williams, John Shawe-Taylor, Richard S. Zemel, and Aron Culotta (Eds.). 1234–1242.Google Scholar
- Yankai Lin, Zhiyuan Liu, Huan-Bo Luan, Maosong Sun, Siwei Rao, and Song Liu. 2015. Modeling Relation Paths for Representation Learning of Knowledge Bases. In EMNLP, Lluís Màrquez, Chris Callison-Burch, Jian Su, Daniele Pighin, and Yuval Marton (Eds.).Google Scholar
- Jiaqi Ma, Junwei Deng, and Qiaozhu Mei. 2022. Adversarial Attack on Graph Neural Networks as An Influence Maximization Problem. In WSDM ’22: The Fifteenth ACM International Conference on Web Search and Data Mining, Virtual Event / Tempe, AZ, USA, February 21 - 25, 2022, K. Selcuk Candan, Huan Liu, Leman Akoglu, Xin Luna Dong, and Jiliang Tang (Eds.). ACM, 675–685. https://doi.org/10.1145/3488560.3498497Google ScholarDigital Library
- Jiaqi Ma, Shuangrui Ding, and Qiaozhu Mei. 2020. Towards More Practical Adversarial Attacks on Graph Neural Networks. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual, Hugo Larochelle, Marc’Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (Eds.).Google Scholar
- Dai Quoc Nguyen, Tu Dinh Nguyen, Dat Quoc Nguyen, and Dinh Phung. 2017. A novel embedding model for knowledge base completion based on convolutional neural network. arXiv preprint arXiv:1712.02121 (2017).Google Scholar
- Heiko Paulheim and Aldo Gangemi. 2015. Serving DBpedia with DOLCE–more than just adding a cherry on top. In International semantic web conference. Springer, 180–196.Google Scholar
- Pouya Pezeshkpour, Yifan Tian, and Sameer Singh. 2019. Investigating robustness and interpretability of link prediction via adversarial modifications. arXiv:1905.00563 (2019).Google Scholar
- Meng Qu, Junkun Chen, Louis-Pascal A. C. Xhonneux, Yoshua Bengio, and Jian Tang. 2021. RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net. https://openreview.net/forum¿id=tGZu6DlbreVGoogle Scholar
- Tara Safavi and Danai Koutra. 2020. Codex: A comprehensive knowledge graph completion benchmark. arXiv preprint arXiv:2009.07810 (2020).Google Scholar
- Amit Singhal. 2012. Google Knowledge Graph. (2012).Google Scholar
- Alexander Turner, Dimitris Tsipras, and Aleksander Madry. 2019. Label-Consistent Backdoor Attacks. CoRR abs/1912.02771 (2019). arXiv:1912.02771http://arxiv.org/abs/1912.02771Google Scholar
- Min Wang, Yanzhen Zou, Yingkui Cao, and Bing Xie. 2019. Searching software knowledge graph with question. In International Conference on Software and Systems Reuse. Springer, 115–131.Google ScholarCross Ref
- Quan Wang, Zhendong Mao, Bin Wang, and Li Guo. 2017. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering 29, 12 (2017), 2724–2743.Google ScholarCross Ref
- Rui Wang, Bicheng Li, Shengwei Hu, Wenqian Du, and Min Zhang. 2019. Knowledge graph embedding via graph attenuated attention networks. IEEE Access 8 (2019), 5212–5224.Google ScholarCross Ref
- Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge graph embedding by translating on hyperplanes.AAAI.Google Scholar
- Zhaohan Xi, Ren Pang, Shouling Ji, and Ting Wang. 2021. Graph backdoor. In 30th { USENIX} Security Symposium ({ USENIX} Security 21).Google Scholar
- Wenhan Xiong, Thien Hoang, and William Yang Wang. 2017. Deeppath: A reinforcement learning method for knowledge graph reasoning. arXiv:1707.06690 (2017).Google Scholar
- Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. 2015. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http://arxiv.org/abs/1412.6575Google Scholar
- Hengtong Zhang, Changxin Tian, Yaliang Li, Lu Su, Nan Yang, Wayne Xin Zhao, and Jing Gao. 2021. Data Poisoning Attack against Recommender System Using Incomplete and Perturbed Data. In KDD ’21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, Singapore, August 14-18, 2021, Feida Zhu, Beng Chin Ooi, and Chunyan Miao (Eds.). ACM, 2154–2164. https://doi.org/10.1145/3447548.3467233Google ScholarDigital Library
- Hengtong Zhang, Tianhang Zheng, Jing Gao, Chenglin Miao, Lu Su, Yaliang Li, and Kui Ren. 2019. Data poisoning attack against knowledge graph embedding. arXiv:1904.12052 (2019).Google Scholar
- Xinyang Zhang, Zheng Zhang, and Tianying Wang. 2021. Trojaning Language Models for Fun and Profit. EuroS&P (2021).Google Scholar
- Yongfeng Zhang and Xu Chen. 2018. Explainable recommendation: A survey and new perspectives. arXiv preprint arXiv:1804.11192.Google Scholar
- Shihao Zhao, Xingjun Ma, Xiang Zheng, James Bailey, Jingjing Chen, and Yu-Gang Jiang. 2020. Clean-Label Backdoor Attacks on Video Recognition Models. In CVPR.Google Scholar
- Yang Zhao, Jiajun Zhang, Yu Zhou, and Chengqing Zong. 2020. Knowledge Graphs Enhanced Neural Machine Translation. In IJCAI.Google Scholar
Index Terms
- MaSS: Model-agnostic, Semantic and Stealthy Data Poisoning Attack on Knowledge Graph Embedding
Recommendations
Poisoning Attack on Federated Knowledge Graph Embedding
WWW '24: Proceedings of the ACM on Web Conference 2024Federated Knowledge Graph Embedding (FKGE) is an emerging collaborative learning technique for deriving expressive representations (i.e., embeddings) from client-maintained distributed knowledge graphs (KGs). However, poisoning attacks in FKGE, which ...
Robust Graph Embedding Recommendation Against Data Poisoning Attack
Big Data Intelligence and ComputingAbstractWith the development of recommendation system technology, more and more Internet services are applied to recommendation systems.
In recommendation systems, matrix factoring is the most widely used technique. However, matrix factoring algorithms are ...
Evaluation Framework for Poisoning Attacks on Knowledge Graph Embeddings
Natural Language Processing and Chinese ComputingAbstractIn the area of knowledge graph embedding data poisoning, attackers are beginning to consider the importance of poisoning sample exposure risk while increasing the toxicity of the poisoning sample. On the other hand, we have found that some ...
Comments