Abstract
Domain Name System (DNS) logs are regarded as a valuable source of information for decisions on government policy and business strategy, because querying DNS is the first step of all Internet activities. Given the size of DNS logs, Hadoop is a prominent candidate for analyzing them, but the geographical dispersal of the logs prevents Hadoop from being adopted in the ordinary way: Hadoop assumes that all source data resides on a single Hadoop Distributed File System (HDFS), whereas DNS logs are stored on DNS servers dispersed all over the world. To resolve this issue, this paper proposes a new method named “Localized Analysis & Merge (LAM)”. The proposed method enables Hadoop to analyze DNS logs on the dispersed DNS servers, and it reduced the overall processing time dramatically. The LAM method also showed that DNS logs can be used to extract valuable information such as malware detection results and access frequency by country.
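The abstract names the LAM method but does not spell out its steps. As a rough illustration of the general idea of analyzing each log where it is stored and then merging only the small summaries, the following minimal Python sketch counts queries per domain at each site and combines the partial counts. The log line format (`timestamp client_ip domain`), the site names, and the functions `analyze_local` and `merge` are all hypothetical and are not taken from the paper, which uses Hadoop rather than plain Python.

```python
from collections import Counter


def analyze_local(log_lines):
    """Summarize one site's log as per-domain query counts.

    Assumes a hypothetical whitespace-separated line format:
    'timestamp client_ip domain'. Lines with fewer than three
    fields are skipped.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # ignore malformed lines
        counts[parts[2].lower()] += 1
    return counts


def merge(partials):
    """Combine per-site summaries into one global count."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # adds counts key by key
    return total


# Hypothetical logs held at two geographically separate sites.
site_logs = {
    "site_a": [
        "2016-01-01T00:00:00 10.0.0.1 example.com",
        "2016-01-01T00:00:01 10.0.0.2 Example.com",
        "malformed",
    ],
    "site_b": [
        "2016-01-01T00:00:02 10.1.0.1 example.com",
        "2016-01-01T00:00:03 10.1.0.1 example.org",
    ],
}

# Each site is analyzed independently; only the summaries are merged.
partials = [analyze_local(lines) for lines in site_logs.values()]
merged = merge(partials)
print(merged.most_common())
```

In this sketch only the per-site `Counter` objects would need to cross the network, which are typically far smaller than the raw logs. Whether the paper's method exchanges the same kind of summary is an assumption.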
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jung, E. (2017). A Data-Driven Decision Making with Big Data Analysis on DNS Log. In: Kim, K., Joukov, N. (eds) Information Science and Applications 2017. ICISA 2017. Lecture Notes in Electrical Engineering, vol 424. Springer, Singapore. https://doi.org/10.1007/978-981-10-4154-9_49
DOI: https://doi.org/10.1007/978-981-10-4154-9_49
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-4153-2
Online ISBN: 978-981-10-4154-9
eBook Packages: Engineering, Engineering (R0)