research-article

Efficient IoT Traffic Inference: From Multi-view Classification to Progressive Monitoring

Published: 16 December 2023

Abstract

Machine learning-based techniques have proven to be effective in Internet-of-Things (IoT) network behavioral inference. Existing works developed data-driven models based on features from network packets and/or flows, but mainly in a static and ad-hoc manner, without adequately quantifying their gains versus costs. In this article, we develop a generic architecture that comprises two distinct inference modules in tandem, which begins with IoT network behavior classification followed by continuous monitoring. In contrast to prior relevant works, our generic architecture flexibly accounts for various traffic features, modeling algorithms, and inference strategies. We argue quantitative metrics are required to systematically compare and efficiently select various traffic features for IoT traffic inference.

This article makes three contributions: (1) For IoT behavior classification, we identify four metrics, namely, cost, accuracy, availability, and frequency, that allow us to characterize and quantify the efficacy of seven sets of packet-based and flow-based traffic features, each resulting in a specialized model. By experimenting with traffic traces of 25 IoT devices collected from our testbed, we demonstrate that specialized-view models can be superior to a single combined-view model trained on a plurality of features by accuracy and cost. We also develop an optimization problem that selects the best set of specialized models for a multi-view classification. (2) For monitoring the expected IoT behaviors, we develop a progressive system consisting of one-class clustering models (per IoT class) at three levels of granularity. We develop an outlier detection technique on top of the convex hull algorithm to form custom-shape boundaries for the one-class models. We show how progression helps with computing costs and the explainability of detecting anomalies. (3) We evaluate the efficacy of our optimally selected classifiers versus the superset of specialized classifiers by applying them to our IoT traffic traces. We demonstrate how the optimal set can reduce the processing cost by a factor of six with insignificant impacts on the classification accuracy. Also, we apply our monitoring models to a public IoT dataset of benign and attack traces and show they yield an average true-positive rate of 94% and a false-positive rate of 5%. Finally, we publicly release our data (training and testing instances of classification and monitoring tasks) and code for convex hull-based one-class models.
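To make the convex hull-based one-class idea in contribution (2) concrete, here is an added illustration; it is not the authors' released code. It assumes two hypothetical features per traffic instance (packets per minute, mean packet size in bytes), constructed sample values, and hull peeling as the outlier-trimming step, since the abstract does not say how the authors trim outliers.

```python
# Added illustration: a 2-D one-class model bounded by a convex hull.
# Features (packets/min, mean packet size) and all values are constructed.

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); > 0 means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()  # drop vertices that make a clockwise or straight turn
            chain.append(p)
        return chain[:-1]
    return half(pts) + half(reversed(pts))

def inside(hull, p, eps=1e-9):
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    if len(hull) < 3:
        return False  # degenerate boundary: nothing is accepted
    return all(cross(hull[i], hull[(i + 1) % len(hull)], p) >= -eps
               for i in range(len(hull)))

def fit_one_class(train, peel=1):
    """Fit a boundary on benign instances of one device class only.
    Assumption: outliers are trimmed by peeling `peel` outer hull layers
    before the final hull is taken."""
    pts = list(set(train))
    for _ in range(peel):
        outer = set(convex_hull(pts))
        remaining = [p for p in pts if p not in outer]
        if len(remaining) < 3:
            break  # too few points left to form a boundary
        pts = remaining
    return convex_hull(pts)

def monitor(models, device_class, instance):
    """Label one instance of an already classified device."""
    return "expected" if inside(models[device_class], instance) else "anomaly"

# Constructed benign training set for a hypothetical "camera" class:
# a tight 5x5 cluster plus one stray measurement at (40, 900).
TRAIN_CAMERA = [(x, y) for x in range(10, 15) for y in range(200, 250, 10)]
TRAIN_CAMERA.append((40, 900))

MODELS = {"camera": fit_one_class(TRAIN_CAMERA)}            # trimmed boundary
LOOSE = {"camera": fit_one_class(TRAIN_CAMERA, peel=0)}     # untrimmed boundary

if __name__ == "__main__":
    for inst in [(12, 220), (25, 500), (40, 900)]:
        print(inst, monitor(MODELS, "camera", inst), monitor(LOOSE, "camera", inst))
```

Without trimming, the single stray training point stretches the boundary so that (25, 500) is accepted; the trade-off of peeling is that benign instances on the outermost layer fall outside the fitted boundary.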


Published in: ACM Transactions on Internet of Things, Volume 5, Issue 1, February 2024, 181 pages. EISSN: 2577-6207. DOI: 10.1145/3613526. Editor: Gian Pietro Picco.
Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 16 December 2023
• Online AM: 24 September 2023
• Accepted: 7 September 2023
• Revised: 17 August 2023
• Received: 14 December 2022
