A Markov Random Field Approach to Automated Protocol Signature Inference

  • Conference paper
Security and Privacy in Communication Networks (SecureComm 2015)

Abstract

Protocol signature specifications play an important role in networking and security services such as Quality of Service (QoS), vulnerability discovery, and malware detection. In this paper, we propose ProParser, a network-trace-based protocol signature inference system that exploits the embedded contextual correlations of n-grams in protocol messages. In ProParser, we first apply a Markov field aspect model to discover the contextual relations and spatial structure among n-grams extracted from protocol traces. Next, we apply a keyword-based clustering algorithm to group messages into highly cohesive clusters. Finally, we use heuristic ranking rules to generate the signature specifications for the corresponding protocol. We evaluate ProParser on real-world network traces that include both textual and binary protocols. We also compare ProParser with the state-of-the-art tool ProWord and find that our approach is more accurate and effective in practice.
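The pipeline outlined in the abstract (n-gram extraction, keyword discovery, keyword-based clustering) can be illustrated with a deliberately simplified sketch. It is not the ProParser implementation: the Markov field aspect model is replaced here by a plain document-frequency threshold, the heuristic ranking step is omitted, and the function names (`ngrams`, `candidate_keywords`, `cluster_by_keywords`) and the `min_support` parameter are hypothetical choices made only for illustration.

```python
from collections import Counter, defaultdict


def ngrams(message: bytes, n: int) -> list:
    """Return all contiguous byte n-grams of a message, in order."""
    return [message[i:i + n] for i in range(len(message) - n + 1)]


def candidate_keywords(messages: list, n: int, min_support: float) -> set:
    """Keep n-grams that occur in at least a `min_support` fraction of messages.

    Assumption: a frequency threshold stands in for the Markov field aspect
    model that the paper uses to find contextually related n-grams.
    """
    doc_freq = Counter()
    for msg in messages:
        # Count each n-gram at most once per message (document frequency).
        doc_freq.update(set(ngrams(msg, n)))
    threshold = min_support * len(messages)
    return {gram for gram, count in doc_freq.items() if count >= threshold}


def cluster_by_keywords(messages: list, keywords: set, n: int) -> dict:
    """Group messages that contain exactly the same set of keywords."""
    clusters = defaultdict(list)
    for msg in messages:
        key = frozenset(g for g in ngrams(msg, n) if g in keywords)
        clusters[key].append(msg)
    return dict(clusters)


if __name__ == "__main__":
    trace = [b"GET /a", b"GET /b", b"POST /c", b"POST /d"]
    kws = candidate_keywords(trace, n=3, min_support=0.5)
    for key, group in cluster_by_keywords(trace, kws, n=3).items():
        print(sorted(key), group)
```

On this toy trace the two request types end up in separate groups because they share different frequent 3-grams. A real system would need a principled model of n-gram context and a ranking step to turn each group into a signature, as the abstract describes.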

This research was supported by the National Natural Science Foundation of China under grant numbers 61402472, 61572496, 61202067 and 61303261, and the National High Technology Research and Development Program of China under grant numbers 2013AA014703 and 2012AA012803.

Author information

Corresponding author

Correspondence to Yipeng Wang.

Copyright information

© 2015 Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Zhang, Y., Xu, T., Wang, Y., Sun, J., Zhang, X. (2015). A Markov Random Field Approach to Automated Protocol Signature Inference. In: Thuraisingham, B., Wang, X., Yegneswaran, V. (eds) Security and Privacy in Communication Networks. SecureComm 2015. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 164. Springer, Cham. https://doi.org/10.1007/978-3-319-28865-9_25

  • DOI: https://doi.org/10.1007/978-3-319-28865-9_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28864-2

  • Online ISBN: 978-3-319-28865-9

  • eBook Packages: Computer Science, Computer Science (R0)
