Learning and Semantics

Cyber Defense and Situational Awareness

Part of the book series: Advances in Information Security (ADIS, volume 62)

Abstract

This chapter elaborates on a topic of the previous chapter, inference, by focusing on a class of algorithms important for processing cyber information: machine learning. It also continues the thread of ontology and semantics by exploring the tradeoff between the effectiveness of an algorithm and the semantic clarity of its products. It is often difficult to extract meaningful contextual information from a machine learning algorithm, because the algorithms that provide high accuracy also tend to use representations that are less comprehensible to humans. Conversely, algorithms that use a more human-accessible vocabulary can be less accurate: they produce more false alerts (false positives), which confuse analysts. A related tradeoff is between the internal semantics of the algorithm and the external semantics of its output. We illustrate this tradeoff with two case studies. Developers of cyber situational awareness (CSA) systems must be aware of such tradeoffs and seek ways to mitigate them.

Author information

Corresponding author

Correspondence to Richard Harang.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Harang, R. (2014). Learning and Semantics. In: Kott, A., Wang, C., Erbacher, R. (eds) Cyber Defense and Situational Awareness. Advances in Information Security, vol 62. Springer, Cham. https://doi.org/10.1007/978-3-319-11391-3_10

  • DOI: https://doi.org/10.1007/978-3-319-11391-3_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11390-6

  • Online ISBN: 978-3-319-11391-3

  • eBook Packages: Computer Science, Computer Science (R0)
