
Detecting and Coloring Anomalies in Real Cellular Network Using Principle Component Analysis

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10879)

Abstract

Anomaly detection in a communication network is a powerful tool for predicting faults, detecting network sabotage attempts, and learning user profiles for marketing purposes and quality-of-service improvements. In this article, we convert the unsupervised data mining learning problem into a supervised classification problem. We propose three methods for creating an associative anomaly within a given commercial traffic database and demonstrate how, using the Principal Component Analysis (PCA) algorithm, we can detect anomalous network behavior and distinguish a regular data stream from one that deviates from routine, at the IP network layer. Although PCA has been used for anomaly detection in the past, there are very few examples where such tasks were performed on real traffic data collected and shared by a commercial company.

The article presents three innovations. The first is the use of an up-to-date database produced by the users of an international communications company. The dataset for the data mining algorithm was retrieved from a data center that monitors and collects low-level network transportation log streams from all over the world. The second innovation is the ability to label several types of anomalies in untagged datasets by organizing and prearranging the database. The third innovation is the ability not only to detect an anomaly but also to color its type, i.e., to identify, classify, and label some forms of abnormality.
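The general idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the two-factor correlation structure, the injected shift on one feature, the 99th-percentile threshold, and the helper names `fit_pca`, `residuals`, and `make_traffic` are all assumptions made for illustration. The sketch fits PCA on unlabeled "normal" records, injects anomalies whose labels are therefore known, and scores detection the way one would score a classifier. The last step is only a crude stand-in for coloring: it reports which feature carries most of the residual energy.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return the feature means and the top principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def residuals(X, mean, components):
    """Part of each centered record that lies outside the principal subspace."""
    centered = X - mean
    return centered - centered @ components.T @ components


def make_traffic(rng, n, mixing, noise=0.1):
    """Synthetic stand-in for traffic records (hypothetical, not the paper's data)."""
    latent = rng.normal(size=(n, mixing.shape[0]))
    return latent @ mixing + noise * rng.normal(size=(n, mixing.shape[1]))


rng = np.random.default_rng(0)

# Assumed structure: features 0-2 move together, and so do features 3-5.
mixing = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])

train = make_traffic(rng, 500, mixing)
test_normal = make_traffic(rng, 200, mixing)

# Inject a labeled anomaly: shift feature 3 so it no longer tracks features 4-5.
test_anomalous = make_traffic(rng, 20, mixing)
test_anomalous[:, 3] += 5.0

X_test = np.vstack([test_normal, test_anomalous])
y_test = np.concatenate([np.zeros(200, dtype=bool), np.ones(20, dtype=bool)])

mean, components = fit_pca(train, n_components=2)

# Threshold: 99th percentile of the residual energy seen on the training data.
train_scores = np.sum(residuals(train, mean, components) ** 2, axis=1)
threshold = np.percentile(train_scores, 99)

test_resid = residuals(X_test, mean, components)
flagged = np.sum(test_resid ** 2, axis=1) > threshold

# Because the anomalies were injected, detection can be scored like a classifier.
detection_rate = flagged[y_test].mean()
false_alarm_rate = flagged[~y_test].mean()

# Crude "coloring": the feature carrying the most residual energy per injected anomaly.
dominant_feature = np.argmax(test_resid[y_test] ** 2, axis=1)

print(f"detection rate:   {detection_rate:.2f}")
print(f"false alarm rate: {false_alarm_rate:.2f}")
print("dominant residual feature counts:", np.bincount(dominant_feature, minlength=6))
```

Scoring records by their energy outside the principal subspace is one common way to use PCA for anomaly detection. Whether the paper uses this residual criterion or a different statistic cannot be determined from the abstract alone.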

This work was supported by the Israel Innovation Authority (formerly the Office of the Chief Scientist and MATIMOP).

Author information

Corresponding author

Correspondence to Yoram Segal.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Segal, Y., Vilenchik, D., Hadar, O. (2018). Detecting and Coloring Anomalies in Real Cellular Network Using Principle Component Analysis. In: Dinur, I., Dolev, S., Lodha, S. (eds) Cyber Security Cryptography and Machine Learning. CSCML 2018. Lecture Notes in Computer Science, vol 10879. Springer, Cham. https://doi.org/10.1007/978-3-319-94147-9_6

  • DOI: https://doi.org/10.1007/978-3-319-94147-9_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94146-2

  • Online ISBN: 978-3-319-94147-9

  • eBook Packages: Computer Science, Computer Science (R0)
