Abstract
Protocol reverse engineering is becoming increasingly important for analyzing unknown protocols. Unfortunately, many existing techniques are limited when little prior information is available, or are too time-consuming. To address these issues, we propose a framework based on protocol finite state machine (FSM) construction, which can infer protocol specifications without any prior information about the protocols. To improve the framework's efficiency, we identify keywords before constructing the finite state machines. Our framework constructs two FSMs: the L-FSM (language FSM) and the S-FSM (state FSM). The L-FSM describes the protocol language, and the S-FSM shows the state transitions of protocol sessions. We evaluate our framework on both a binary protocol and a text protocol, using ARP and SMTP as the target protocols. Precision and recall are the evaluation criteria in our experiments. For ARP, precision and recall both reach 100%. For SMTP, precision is 100% and recall is almost 98%.
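As a rough illustration of the two ideas named in the abstract, a session-level state machine built from already-identified keywords and a precision/recall evaluation, the following Python sketch may help. It is not the authors' algorithm: the first-order transition model, the function names (`build_session_fsm`, `accepts`, `precision_recall`), the `<START>`/`<END>` markers, and the two sample SMTP sessions are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical boundary markers for a session (not from the paper).
START, END = "<START>", "<END>"


def build_session_fsm(sessions):
    """Map each state to the set of states observed directly after it.

    Each session is a list of keywords; a state is simply the last keyword
    seen. This first-order model is an assumption, not the paper's S-FSM.
    """
    transitions = defaultdict(set)
    for keywords in sessions:
        path = [START, *keywords, END]
        for current, following in zip(path, path[1:]):
            transitions[current].add(following)
    return dict(transitions)


def accepts(fsm, keywords):
    """Return True if every step of the session is a known transition."""
    path = [START, *keywords, END]
    return all(b in fsm.get(a, set()) for a, b in zip(path, path[1:]))


def precision_recall(predicted, actual):
    """Standard set-based precision and recall."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Invented sample sessions using real SMTP command names.
smtp_sessions = [
    ["HELO", "MAIL", "RCPT", "DATA", "QUIT"],
    ["EHLO", "MAIL", "RCPT", "RCPT", "DATA", "QUIT"],
]
fsm = build_session_fsm(smtp_sessions)

print(accepts(fsm, ["HELO", "MAIL", "RCPT", "RCPT", "DATA", "QUIT"]))
print(accepts(fsm, ["MAIL", "HELO"]))
print(precision_recall({"HELO", "MAIL", "FOO"}, {"HELO", "MAIL", "RCPT", "DATA"}))
```

Because the model only remembers the previous keyword, it generalizes beyond the observed sessions (the first check combines `HELO` with a repeated `RCPT`, which no single training session contains). A real inference pipeline would need a principled state-merging strategy to control that generalization.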
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, Y., Zhang, N., Wu, Ym., Su, Bb. (2013). Protocol Specification Inference Based on Keywords Identification. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8347. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53917-6_40
DOI: https://doi.org/10.1007/978-3-642-53917-6_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53916-9
Online ISBN: 978-3-642-53917-6
eBook Packages: Computer Science, Computer Science (R0)