Graph Learning for Interactive Threat Detection in Heterogeneous Smart Home Rule Data

Authors: Guangjing Wang, Nikolay Ivanov, ThanhVu Nguyen, Qiben Yan

Proceedings of the ACM on Management of Data, Volume 1, Issue 1

Article No.: 102, Pages 1 - 27

https://doi.org/10.1145/3588956

Published: 30 May 2023

Abstract

Interactions among automation configuration rules can cause undesired and insecure behavior in smart homes, known as interactive threats. Most existing solutions use program analysis to identify interactive threats among automation rules, an approach that is not suitable for closed-source platforms. Meanwhile, security policy-based solutions suffer from low detection accuracy because the pre-defined security policies of a single platform can hardly cover the diverse interactive threat types across heterogeneous platforms. In this paper, we propose Glint, the first graph learning-based system for interactive threat detection in smart homes. We design a multi-scale graph representation learning model, called ITGNN, for learning both homogeneous and heterogeneous interaction graph patterns. To facilitate graph learning, we build large interaction graph training datasets through multi-domain data fusion from five different platforms. Moreover, Glint detects drifting samples with contrastive learning and improves its generalization ability with transfer learning across heterogeneous platforms. Our evaluation shows that Glint achieves 95.5% accuracy in detecting interactive threats across the five platforms. In addition, we examine a set of user-designed blueprints on the Home Assistant platform and reveal four new types of real-world interactive threats, called "action block", "action ablation", "trigger intake", and "condition duplicate", which are cross-platform interactive threats captured by Glint.

Supplemental Material

MP4 File: Presentation video for SIGMOD 2023 (175.87 MB)

Information

Published In

Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 1

May 2023, 2807 pages

EISSN: 2836-6573

DOI: 10.1145/3603164

Editor: Divyakant Agrawal, UC Santa Barbara, United States

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Science Foundation

Cited By

Fang C, Chen Z, Song S, Huang X, Wang C, Wang J. (2024). On Reducing Space Amplification with Multi-Column Compaction in Apache IoTDB. Proceedings of the VLDB Endowment 17:11 (2974-2986). DOI: 10.14778/3681954.3681977. Online publication date: 30-Aug-2024.
https://dl.acm.org/doi/10.14778/3681954.3681977
Hu H, Qiu J, Wang H, Liang B, Zou S. (2024). DIDS: Double Indices and Double Summarizations for Fast Similarity Search. Proceedings of the VLDB Endowment 17:9 (2198-2211). DOI: 10.14778/3665844.3665851. Online publication date: 6-Aug-2024.
https://dl.acm.org/doi/10.14778/3665844.3665851
Xiong H, Zhang H, Wang Z, He Z, Wang P, Wang X. (2024). CIVET: Exploring Compact Index for Variable-Length Subsequence Matching on Time Series. Proceedings of the VLDB Endowment 17:9 (2123-2135). DOI: 10.14778/3665844.3665845. Online publication date: 6-Aug-2024.
https://dl.acm.org/doi/10.14778/3665844.3665845
Maroulis S, Stamatopoulos V, Papastefanatos G, Terrovitis M. (2024). Visualization-Aware Time Series Min-Max Caching with Error Bound Guarantees. Proceedings of the VLDB Endowment 17:8 (2091-2103). DOI: 10.14778/3659437.3659460. Online publication date: 31-May-2024.
https://dl.acm.org/doi/10.14778/3659437.3659460
Li Z, Ding B, Yao L, Li Y, Xiao X, Zhou J. (2024). Performance-Based Pricing for Federated Learning via Auction. Proceedings of the VLDB Endowment 17:6 (1269-1282). DOI: 10.14778/3648160.3648169. Online publication date: 3-May-2024.
https://dl.acm.org/doi/10.14778/3648160.3648169
Breve B, Cimino G, Deufemia V. (2024). Hybrid Prompt Learning for Generating Justifications of Security Risks in Automation Rules. ACM Transactions on Intelligent Systems and Technology. DOI: 10.1145/3675401. Online publication date: 29-Jun-2024.
https://dl.acm.org/doi/10.1145/3675401
Meruje Ferreira L, Coelho F, Pereira J. (2024). Databases in Edge and Fog Environments: A Survey. ACM Computing Surveys 56:11 (1-40). DOI: 10.1145/3666001. Online publication date: 8-Jul-2024.
https://dl.acm.org/doi/10.1145/3666001
Gao J, Long C. (2024). RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search. Proceedings of the ACM on Management of Data 2:3 (1-27). DOI: 10.1145/3654970. Online publication date: 30-May-2024.
https://dl.acm.org/doi/10.1145/3654970
Heddes M, Nunes I, Givargis T, Nicolau A. (2024). Convolution and Cross-Correlation of Count Sketches Enables Fast Cardinality Estimation of Multi-Join Queries. Proceedings of the ACM on Management of Data 2:3 (1-26). DOI: 10.1145/3654932. Online publication date: 30-May-2024.
https://dl.acm.org/doi/10.1145/3654932
Rui L, Huang X, Song S, Kang Y, Wang C, Wang J. (2024). Time Series Representation for Visualization in Apache IoTDB. Proceedings of the ACM on Management of Data 2:1 (1-26). DOI: 10.1145/3639290. Online publication date: 26-Mar-2024.
https://dl.acm.org/doi/10.1145/3639290