
FAIXID: A Framework for Enhancing AI Explainability of Intrusion Detection Results Using Data Cleaning Techniques

Published in: Journal of Network and Systems Management

Abstract

Organizations rely heavily on cyber defense technologies, including intrusion detection and prevention systems, to monitor and protect networks and devices from malicious activities. However, the large volumes of false alerts produced by such technologies make it difficult for cybersecurity analysts to isolate credible alerts from false positives for further investigation. In this article, we propose FAIXID, a framework that leverages Explainable Artificial Intelligence (XAI) and data cleaning methods to improve the explainability and understandability of intrusion detection alerts, which in turn helps cyber analysts make more informed decisions by quickly eliminating false positives. FAIXID comprises five functional modules: (1) the pre-modeling explainability module, which improves the quality of network traffic data through data cleaning; (2) the modeling module, which provides explanations of the AI models to help analysts make sense of the model internals; (3) the post-modeling explainability module, which provides additional explanations to enhance the understandability of the results produced by the AI models; (4) the attribution module, which selects the appropriate explanations for analysts according to their needs; and (5) the evaluation module, which evaluates the explanations and collects feedback from analysts. FAIXID has been implemented and evaluated in experiments with real-world datasets. The evaluation results demonstrate that combining data cleaning and AI explainability techniques yields quality explanations tailored to analysts' expertise and backgrounds.
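To make the division of labor among these five modules concrete, the sketch below wires them together in Python around a scikit-learn classifier, with SHAP standing in as one possible post-hoc attribution method. It is a minimal illustration of the pipeline shape only: the cleaning rules, the model choice, and the top-3 feature cutoff for less experienced analysts are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of FAIXID's five functional modules (see abstract).
# The module boundaries follow the paper's description; every concrete
# choice below (cleaning rules, random forest, SHAP, the top-3 cutoff)
# is an assumption made for illustration, not the authors' implementation.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier


def pre_modeling_explainability(df: pd.DataFrame) -> pd.DataFrame:
    """Module 1: improve the quality of raw network traffic data."""
    df = df.drop_duplicates()                       # drop duplicate flow records
    df = df.dropna(axis=1, how="all")               # drop columns with no values
    return df.fillna(df.median(numeric_only=True))  # impute remaining gaps


def modeling(X: pd.DataFrame, y: pd.Series) -> RandomForestClassifier:
    """Module 2: train a detector whose internals can be explained."""
    return RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)


def post_modeling_explainability(model, X: pd.DataFrame):
    """Module 3: per-alert feature attributions (here via SHAP)."""
    return shap.TreeExplainer(model).shap_values(X)


def attribution(alert_shap: np.ndarray, feature_names: list, expertise: str):
    """Module 4: tailor one alert's explanation to the analyst's background."""
    ranked = np.argsort(-np.abs(alert_shap))
    k = len(ranked) if expertise == "expert" else 3  # novices see top 3 features
    return [(feature_names[i], float(alert_shap[i])) for i in ranked[:k]]


def evaluation(explanation, analyst_feedback: dict) -> dict:
    """Module 5: pair each explanation with analyst feedback for refinement."""
    return {"explanation": explanation, **analyst_feedback}
```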


Notes

  1. https://www.defense.gov

  2. A type of fraud that occurs online in pay-per-click advertising.


Acknowledgements

Awny Alnusair was supported by the IU Kokomo summer faculty fellowship program.

Author information


Corresponding author

Correspondence to Awny Alnusair.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Liu, H., Zhong, C., Alnusair, A. et al. FAIXID: A Framework for Enhancing AI Explainability of Intrusion Detection Results Using Data Cleaning Techniques. J Netw Syst Manage 29, 40 (2021). https://doi.org/10.1007/s10922-021-09606-8

