
Helix: DGA Domain Embeddings for Tracking and Exploring Botnets

Authors:

Asaf Shabtai

CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management

Pages 2741 - 2748

https://doi.org/10.1145/3340531.3416022

Published: 19 October 2020

Abstract

Botnets have used domain generation algorithms (DGAs) for over a decade to covertly and robustly locate the domain names of their command and control (C&C) servers. Recent advances in DGA detection have motivated botnet owners to rapidly alter the C&C domain and to use adversarial techniques to evade detection. As a result, it has become increasingly difficult to track botnets in DNS traffic. In this paper, we present Helix, a method for tracking and exploring botnets. Helix uses a spatio-temporal deep neural network autoencoder to convert domains into numerical vectors (embeddings) that capture the DGA and seed used to create the domain. This is made possible by leveraging both convolutional (spatial) and recurrent (temporal) layers, and by using techniques such as attention mechanisms and highways. Furthermore, by using an autoencoder architecture, the network can be trained in an unsupervised manner (no labeling of data), which makes the system practical for real-world deployments.

In our evaluation, we found that Helix can track botnet campaigns, distinguish between DGA families and seeds, and identify domains generated using the latest adversarial machine learning techniques. Helix is currently being used to track botnets at one of the world's largest Internet Service Providers (ISPs), and we include some of the ISP's analysis work using our method.
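As a rough illustration of the data flow the abstract describes (domain string in, numerical vector out, vectors compared to explore botnets), the following sketch shows one plausible way to turn a domain into a fixed-length sequence of character ids for a sequence model, and how two resulting embeddings could be compared. It is not the Helix network: the alphabet, the `MAX_LEN` value, and the function names `encode_domain` and `cosine_similarity` are assumptions made for illustration, and the trained encoder that would sit between the two steps is omitted.

```python
import math
import string
from typing import List, Sequence

# Assumed alphabet: characters commonly allowed in hostnames.
# Id 0 is reserved for padding (and, in this sketch, for unknown characters).
ALPHABET = string.ascii_lowercase + string.digits + "-."
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
MAX_LEN = 32  # hypothetical fixed input length, not taken from the paper


def encode_domain(domain: str, max_len: int = MAX_LEN) -> List[int]:
    """Map a domain to a fixed-length list of character ids (0 = padding)."""
    ids = [CHAR_TO_ID.get(ch, 0) for ch in domain.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two embedding vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

In a full pipeline, the output of `encode_domain` would be fed to an encoder network, and the encoder's output vectors would be compared (for example with `cosine_similarity`) to group domains that appear to come from the same DGA and seed.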

Supplementary Material

MP4 File (3340531.3416022.mp4)

Botnets use domain generation algorithms (DGAs) to establish a robust connection. Recent advances in DGA detection have motivated botnet owners to use adversarial techniques to evade detection. As a result, it has become increasingly challenging to track botnets in DNS traffic.

We present Helix, a method for tracking and exploring botnets. Helix uses a deep neural network autoencoder to convert domains into numerical vectors that capture the DGA and seed used to create the domain. Furthermore, by using an autoencoder architecture, the network can be trained in an unsupervised manner, which makes the system practical for real-world deployments. In our evaluation, we found that Helix can track botnet campaigns, distinguish between DGA families and seeds, and identify domains generated using the latest adversarial machine learning techniques.

