Protocol Specification Inference Based on Keywords Identification

Wang, Yong; Zhang, Nan; Wu, Yan-mei; Su, Bin-bin

doi:10.1007/978-3-642-53917-6_40


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8347)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications


Abstract

Protocol reverse engineering is becoming increasingly important for analyzing unknown protocols. Unfortunately, many existing techniques are limited by the scarcity of a priori information or by being time-consuming. To address these issues, we propose a framework based on protocol finite state machine (FSM) construction that can infer protocol specifications without any a priori information about the protocols. To improve the framework’s efficiency, we identify keywords before constructing the finite state machines. Our framework constructs two FSMs: the L-FSM (language FSM) and the S-FSM (state FSM). The L-FSM describes the protocol language, and the S-FSM shows the state transitions of protocol sessions. We evaluate the framework on both a binary protocol and a text protocol, using ARP and SMTP as the target protocols. Precision and recall are the evaluation criteria in our experiments. For ARP, precision and recall both reach 100%. For SMTP, precision is 100% and recall is almost 98%.
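
The abstract does not give the construction algorithm, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes that keywords have already been identified and that each session is a list of keywords; the function names (build_state_fsm, accepts, precision_recall), the START and END markers, and the choice to name each state after the last keyword seen are all hypothetical. It also assumes precision and recall are computed over sets of inferred and true items, which the abstract does not specify.

```python
from collections import defaultdict


def build_state_fsm(sessions):
    """Build a transition map from sessions given as keyword sequences.

    Assumption of this sketch: a state is named after the last keyword
    seen, with hypothetical START and END markers around each session.
    """
    transitions = defaultdict(set)
    for session in sessions:
        state = "START"
        for keyword in session:
            transitions[state].add(keyword)
            state = keyword
        transitions[state].add("END")
    return {state: sorted(nxt) for state, nxt in transitions.items()}


def accepts(fsm, session):
    """Return True if every step of the session is a known transition."""
    state = "START"
    for keyword in session:
        if keyword not in fsm.get(state, []):
            return False
        state = keyword
    return "END" in fsm.get(state, [])


def precision_recall(inferred, truth):
    """Set-based precision and recall of inferred items against truth."""
    inferred, truth = set(inferred), set(truth)
    true_positives = len(inferred & truth)
    precision = true_positives / len(inferred) if inferred else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    # Two toy sessions using real SMTP command names as keywords.
    sessions = [
        ["HELO", "MAIL", "RCPT", "DATA", "QUIT"],
        ["EHLO", "MAIL", "RCPT", "RCPT", "DATA", "QUIT"],
    ]
    fsm = build_state_fsm(sessions)
    print(accepts(fsm, ["HELO", "MAIL", "RCPT", "RCPT", "DATA", "QUIT"]))
    print(precision_recall({"HELO", "MAIL", "RCPT"},
                           {"HELO", "MAIL", "RCPT", "DATA"}))
```

In this toy setting, inferring only correct items while missing one of them gives full precision with recall below 100%, the same pattern the abstract reports for SMTP.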



Author information

Authors and Affiliations

School of Computer Science and Engineering, University of Electronic Science and Technology of China, 611731, Chengdu, China
Yong Wang, Nan Zhang, Yan-mei Wu & Bin-bin Su


Editor information

Editors and Affiliations

US Air Force Office of Scientific Research, 106-0032, Tokyo, Japan
Hiroshi Motoda
School of Computer Science and Technology, Zhejiang University, 310027, Hangzhou, China
Zhaohui Wu
Faculty of Engineering and Information Technology, University of Technology, Chippendale, 2008, Sydney, NSW, Australia
Longbing Cao
Department of Computing Science, University of Alberta, Edmonton, T6G 2E8, Canada
Osmar Zaiane
College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Min Yao
School of Computer Science, Fudan University, 200433, Shanghai, China
Wei Wang


About this paper

Cite this paper

Wang, Y., Zhang, N., Wu, Ym., Su, Bb. (2013). Protocol Specification Inference Based on Keywords Identification. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8347. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53917-6_40


DOI: https://doi.org/10.1007/978-3-642-53917-6_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53916-9
Online ISBN: 978-3-642-53917-6
eBook Packages: Computer Science, Computer Science (R0)
