Authors:
Gabriel Morales; Farhan Romit; Adam Bienek-Parrish; Patrick Jenkins and Rocky Slavin
Affiliation:
Department of Computer Science, University of Texas at San Antonio, San Antonio, Texas, U.S.A.
Keyword(s):
Internet-of-Things, LLM, Traffic Flow, Network Analysis, Networking Standards, Classification.
Abstract:
Technological advancement has been driven in part by the convenience it adds to our daily lives. The resulting automation and quick access to information have given rise to the Internet-of-Things (IoT), in which otherwise ordinary items such as kitchen appliances, smartphones, and even electrical meters are interconnected and can access the Internet. Because IoT devices can be accessed from anywhere and have user-set behaviors, they frequently transmit data over various networking standards, and that data can be captured by a malicious actor. While network data is often encrypted, the patterns it forms can be used by such an adversary to infer user behavior, device behavior, or the identity of the device itself. In this work, we evaluate various traditional machine learning models for device classification using network traffic features generated from link-level flows to overcome both encryption and differences in protocols/standards. We also demonstrate the viability of the GPT-3.5 large language model (LLM) to perform the same task. Our experiments show the viability of flow-based classification across 802.11 Wi-Fi, Zigbee, and Bluetooth Low Energy devices. Furthermore, with a considerably smaller dataset, the LLM was able to identify devices with an overall accuracy of 79% through the use of prompt-tuning, and with an overall accuracy of 63.73% on a larger, more common dataset using fine-tuning. Compared to traditional models, the LLM closely matches the performance of the lowest-performing models and in some cases even achieves higher accuracy than the best-performing models.