Protocol Specification Inference Based on Keywords Identification

  • Conference paper
Advanced Data Mining and Applications (ADMA 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8347)

Abstract

Protocol reverse engineering is becoming increasingly important for analyzing unknown protocols. Unfortunately, many existing techniques are limited: they either require prior information about the protocol or are time-consuming. To address these issues, we propose a framework based on protocol finite state machine (FSM) construction that can infer protocol specifications without any prior information about the protocol. To improve the framework's efficiency, we identify keywords before constructing the FSMs. Our framework constructs two FSMs: the L-FSM (language FSM), which describes the protocol language, and the S-FSM (state FSM), which shows the state transitions of protocol sessions. We evaluate the framework on both a binary protocol and a text protocol, using ARP and SMTP as the target protocols. Precision and recall are the evaluation criteria in our experiments. For ARP, precision and recall both reach 100%. For SMTP, precision is 100% and recall is almost 98%.
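To make the idea of a state FSM and its evaluation concrete, here is a minimal sketch in Python. It is not the construction used in the paper, whose details the abstract does not give. It assumes that keywords have already been identified, that each session is a list of keywords, and that a state is simply the last keyword seen. The names `build_state_fsm`, `accepts`, `precision_recall`, and the `START` label are hypothetical, as are the toy SMTP sessions and the definitions of precision and recall over accepted sessions.

```python
from collections import defaultdict

START = "<start>"  # hypothetical label for the initial state


def build_state_fsm(sessions):
    """Map each state to the set of keywords observed immediately after it.

    Assumption of this sketch: a state is identified by the last keyword seen.
    """
    transitions = defaultdict(set)
    for session in sessions:
        state = START
        for keyword in session:
            transitions[state].add(keyword)
            state = keyword
    return dict(transitions)


def accepts(fsm, session):
    """Return True if every step of the session follows a known transition."""
    state = START
    for keyword in session:
        if keyword not in fsm.get(state, set()):
            return False
        state = keyword
    return True


def precision_recall(fsm, valid_sessions, invalid_sessions):
    """Assumed definitions: precision = accepted valid / all accepted,
    recall = accepted valid / all valid."""
    tp = sum(accepts(fsm, s) for s in valid_sessions)
    fp = sum(accepts(fsm, s) for s in invalid_sessions)
    fn = len(valid_sessions) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy keyword sequences built from SMTP command names (illustrative data only).
training = [
    ["HELO", "MAIL", "RCPT", "DATA", "QUIT"],
    ["EHLO", "MAIL", "RCPT", "RCPT", "DATA", "QUIT"],
]
fsm = build_state_fsm(training)

valid = [
    ["HELO", "MAIL", "RCPT", "RCPT", "DATA", "QUIT"],
    ["EHLO", "RSET", "QUIT"],  # valid SMTP, but RSET never appears in training
]
invalid = [["DATA", "HELO"]]

precision, recall = precision_recall(fsm, valid, invalid)
print(precision, recall)
```

In this toy run the inferred machine rejects the invalid session but also rejects the valid session containing `RSET`, so recall falls below precision. The sketch does not check for accepting states and does not merge states, both of which a fuller construction would need.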

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, Y., Zhang, N., Wu, Ym., Su, Bb. (2013). Protocol Specification Inference Based on Keywords Identification. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8347. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53917-6_40

  • DOI: https://doi.org/10.1007/978-3-642-53917-6_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-53916-9

  • Online ISBN: 978-3-642-53917-6

  • eBook Packages: Computer Science, Computer Science (R0)
