Detection of advanced persistent threats using hashing and graph-based learning on streaming data

Megherbi, Walid; Kiouche, Abd Errahmane; Haddad, Mohammed; Seba, Hamida

doi:10.1007/s10489-024-05475-1

Published: 01 May 2024

Volume 54, pages 5879–5890 (2024)

Applied Intelligence

Walid Megherbi¹, Abd Errahmane Kiouche¹ (ORCID: orcid.org/0000-0003-2247-4859), Mohammed Haddad¹ & Hamida Seba¹


Abstract

Many activities in cybersecurity can be represented as graph streams, such as streams of call graphs. In this paper, we introduce a method to detect Advanced Persistent Threats (APTs) from their onset. A distinguishing feature of our approach is its ability to capture both structural and temporal aspects, which are crucial for differentiating between benign and malicious activities. To overcome the challenges of processing streaming data, we use hashing techniques to obtain a compact data representation. Combined with a dynamic machine learning framework, this representation enables fast, incremental detection with minimal memory usage. Empirical evaluations show the efficacy of our approach, which allows a real-time response by pinpointing APTs at the initial stages of their activity.
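The general idea of hashing a graph stream into a compact, incrementally updated representation can be illustrated with a small sketch. This is not the authors' implementation: it swaps in plain signed feature hashing, and the names `StreamingGraphSketch`, `bucket_and_sign`, and `SKETCH_SIZE`, the token format, and the use of consecutive edge labels as a temporal feature are all assumptions made for illustration only.

```python
import hashlib
import math

SKETCH_SIZE = 64  # hypothetical sketch width, chosen only for illustration


def bucket_and_sign(token, size=SKETCH_SIZE):
    """Map a string token to a (bucket, sign) pair using a stable hash."""
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
    value = int.from_bytes(digest, "big")
    sign = 1 if value & 1 else -1
    return (value >> 1) % size, sign


class StreamingGraphSketch:
    """Fixed-size vector updated one edge at a time (assumed design)."""

    def __init__(self, size=SKETCH_SIZE):
        self.vector = [0] * size
        self.prev_edge_label = None  # remembered to form a temporal feature

    def _update(self, token):
        bucket, sign = bucket_and_sign(token, len(self.vector))
        self.vector[bucket] += sign

    def add_edge(self, src_label, edge_label, dst_label):
        # Structural feature: the labeled edge itself.
        self._update(f"S|{src_label}|{edge_label}|{dst_label}")
        # Temporal feature: the pair (previous edge label, current edge label).
        if self.prev_edge_label is not None:
            self._update(f"T|{self.prev_edge_label}|{edge_label}")
        self.prev_edge_label = edge_label


def cosine(a, b):
    """Cosine similarity between two sketches; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Usage: feed a (made-up) stream of labeled edges into a sketch.
sketch = StreamingGraphSketch()
for edge in [("proc", "read", "file"), ("proc", "write", "socket")]:
    sketch.add_edge(*edge)
```

Memory stays constant at `SKETCH_SIZE` counters however long the stream runs, and each edge costs a fixed number of hash evaluations. In a setup like this, the sketch vectors would then be handed to an anomaly detector that is updated incrementally.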


Data Availability

The datasets used during the current study are publicly available: https://drive.google.com/drive/folders/1Kp3JQsZz2X61efHU4mTEWHdF-ZSun8ad

Notes

https://gitlab.liris.cnrs.fr/gladis/hadapt


Acknowledgements

This work is supported by the French National Research Agency (ANR) under grant ANR-20-CE39-0008.

Author information

Authors and Affiliations

LIRIS UMR 5205, Université de Lyon, Université Lyon 1, Villeurbanne, F-69622, France
Walid Megherbi, Abd Errahmane Kiouche, Mohammed Haddad & Hamida Seba

Contributions

Walid MEGHERBI, Abd Errahmane KIOUCHE, Mohammed HADDAD, and Hamida SEBA all made significant contributions to the research, design, and interpretation of the study.

Corresponding author

Correspondence to Abd Errahmane Kiouche.

Ethics declarations

Ethical and Informed Consent for data used

This study does not make use of any personal data, and therefore does not require anyone’s informed consent.

Conflicts of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Megherbi, W., Kiouche, A.E., Haddad, M. et al. Detection of advanced persistent threats using hashing and graph-based learning on streaming data. Appl Intell 54, 5879–5890 (2024). https://doi.org/10.1007/s10489-024-05475-1

Accepted: 18 April 2024
Published: 01 May 2024
Issue Date: April 2024
DOI: https://doi.org/10.1007/s10489-024-05475-1
