Alarm Log Data Augmentation Algorithm Based on a GAN Model and Apriori

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Because alarm detection and diagnosis tasks are complex, alarm log data is often scarce. Alarm log data also carries strong rule associations, so existing data augmentation algorithms perform poorly on it. To address this problem, this paper introduces APRGAN, a new algorithm for augmenting alarm log data that combines a generative adversarial network (GAN) with the Apriori algorithm. APRGAN generates alarm log data under the guidance of rules mined by a rule miner. Moreover, we propose a new dynamic updating mechanism to alleviate the mode collapse problem of the GAN. In addition to updating the real reference dataset used to train the discriminator in the GAN, we dynamically update the parameters and the rule set of the Apriori algorithm according to the data generated in each epoch. Extensive experiments on two public datasets show that APRGAN outperforms other data augmentation algorithms on alarm log data, as measured by BLEU, ROUGE, and METEOR.
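To make the rule-mining step concrete, the following is a minimal sketch of Apriori-style association rule mining over alarm log sequences. It is not the authors' implementation: the event IDs (`E1`, `E2`, `E3`), the function names, and the support and confidence thresholds are all assumptions chosen for illustration, and the GAN and the dynamic updating mechanism are omitted entirely.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support fraction}."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    # Level 1 candidates: every distinct item seen in the logs.
    candidates = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep only the candidates that meet the support threshold.
        level = {}
        for cand in candidates:
            support = sum(1 for t in sets if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-itemsets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            cand for cand in joined
            if all(frozenset(sub) in level for sub in combinations(cand, k - 1))
        }
    return frequent


def mine_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so this lookup cannot fail.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical alarm sequences; each inner list is one log window.
logs = [
    ["E1", "E2", "E3"],
    ["E1", "E2"],
    ["E1", "E3"],
    ["E2", "E3"],
    ["E1", "E2", "E3"],
]

if __name__ == "__main__":
    freq = frequent_itemsets(logs, min_support=0.5)
    for antecedent, consequent, confidence in mine_rules(freq, min_confidence=0.7):
        print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

In a rule-guided generator, rules like these could be used to score or filter generated sequences. How APRGAN actually feeds the mined rules into the GAN is described in the full paper, not in this abstract.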

Author information

Corresponding author

Correspondence to Zhi-Peng Gao (高志鹏).

Ethics declarations

Conflict of Interest The authors declare that they have no conflict of interest.

Additional information

This work was supported by the National Key Research and Development Program of China under Grant No. 2019YFB-2103202.

Yang Yang received her Ph.D. degree in computer science from Beijing University of Posts and Telecommunications (BUPT), Beijing, in 2011. She is currently an associate professor at the State Key Laboratory of Networking and Switching Technology of BUPT. Her research interests include network management based on big data and artificial intelligence, and related fields.

Yu-Ting Li received his B.S. degree in computer science and technology from Beijing University of Posts and Telecommunications (BUPT), Beijing, in 2020. He is now pursuing his M.S. degree at the State Key Laboratory of Networking and Switching Technology of BUPT, Beijing. His research interests cover fault diagnosis and data augmentation.

Yong-Hua Huo is a senior engineer of the Communication Networks Laboratory of the 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang. Her research interests are network fault management and network anomaly detection.

Zhi-Peng Gao received his Ph.D. degree in computer science from Beijing University of Posts and Telecommunications (BUPT), Beijing, in 2007. He is currently a professor at the State Key Laboratory of Networking and Switching Technology of BUPT, Beijing. His research interests include blockchain, big data analysis, edge computing, edge intelligence, and related fields.

Lan-Lan Rui is an associate professor at the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications (BUPT), Beijing. She received her Ph.D. degree in computer application technology from BUPT, Beijing, in 2010. Her research interests include edge computing, content-based measurement and analysis, quality of service (QoS), smart service provisioning in mobile social networks, and intelligent theory and technology of network services.

About this article

Cite this article

Yang, Y., Li, YT., Huo, YH. et al. Alarm Log Data Augmentation Algorithm Based on a GAN Model and Apriori. J. Comput. Sci. Technol. 39, 951–966 (2024). https://doi.org/10.1007/s11390-024-2408-1

  • DOI: https://doi.org/10.1007/s11390-024-2408-1
