Abstract
Modern decision support systems must analyze complex heterogeneous data. Such data combine social and technical information in a variety of formats, often describe their domain incompletely, and may contain parts that belong to different domains. This paper defines such information as heterogeneous semi-structured objects. The authors propose an approach to formalizing and comparing such data sets based on an object model and vectorization. The novelty of the work lies in the object similarity measure, which allows objects of any type to be matched with one another under conditions of incomplete information. The paper describes the data formalization method, the matching method, and the advantages and disadvantages of the proposed solutions. As an example, the authors consider the application of the method to data analysis in solving information security problems. The paper also presents the architecture of a decision support system based on the obtained results.
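The abstract does not state the similarity measure itself, so the following is only a minimal sketch of the general idea of matching objects with differing and partly missing attributes. Everything in it is an assumption made for illustration, not the authors' method: the dictionary representation, the per-type comparison rules, the coverage penalty, and the names attribute_similarity and object_similarity are all hypothetical.

```python
from typing import Any, Dict

# Hypothetical representation: an object is a mapping from attribute name
# to value. Attributes that the source did not provide are simply absent.
SemiStructuredObject = Dict[str, Any]


def attribute_similarity(a: Any, b: Any) -> float:
    """Similarity of two attribute values in [0, 1] (illustrative rules only)."""
    # bool is a subclass of int in Python, so handle it before numbers.
    if isinstance(a, bool) or isinstance(b, bool):
        return 1.0 if a == b else 0.0
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        denom = abs(a) + abs(b)
        # |a - b| never exceeds |a| + |b|, so the result stays in [0, 1].
        return 1.0 if denom == 0 else 1.0 - abs(a - b) / denom
    if isinstance(a, (set, frozenset)) and isinstance(b, (set, frozenset)):
        union = a | b
        # Jaccard index for set-valued attributes.
        return 1.0 if not union else len(a & b) / len(union)
    # Fallback: case-insensitive comparison of string forms.
    return 1.0 if str(a).lower() == str(b).lower() else 0.0


def object_similarity(x: SemiStructuredObject, y: SemiStructuredObject) -> float:
    """Mean similarity over shared attributes, scaled by attribute coverage."""
    shared = x.keys() & y.keys()
    if not shared:
        return 0.0  # nothing comparable between the two objects
    all_keys = x.keys() | y.keys()
    mean_sim = sum(attribute_similarity(x[k], y[k]) for k in shared) / len(shared)
    # Assumed penalty: the fewer attributes both objects define,
    # the less the match is trusted.
    coverage = len(shared) / len(all_keys)
    return mean_sim * coverage


# Two made-up descriptions of the same host from different sources.
scan_record = {"type": "host", "os": "linux", "open_ports": {22, 80, 443}}
inventory_record = {"type": "host", "os": "Linux", "open_ports": {22, 80}, "owner": "ops"}
score = object_similarity(scan_record, inventory_record)
```

In this sketch missing attributes do not prevent a comparison; they only lower the score through the coverage factor. Whether the paper treats incompleteness this way is not stated in the abstract.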
The project is financially supported by the Ministry of Education and Science of the Russian Federation under the Federal Program “Research and Development in Priority Areas of Scientific and Technological Sphere in Russia for 2014–2020” (Contract No. 14.578.21.0231, September 26, 2017; unique agreement identifier RFMEFI57817X0231).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Poltavtseva, M., Zegzhda, P. (2019). Heterogeneous Semi-structured Objects Analysis. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Systems and Applications. IntelliSys 2018. Advances in Intelligent Systems and Computing, vol 868. Springer, Cham. https://doi.org/10.1007/978-3-030-01054-6_88
DOI: https://doi.org/10.1007/978-3-030-01054-6_88
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01053-9
Online ISBN: 978-3-030-01054-6
eBook Packages: Intelligent Technologies and Robotics (R0)