Analysis and mathematical modeling of big data processing

Peer-to-Peer Networking and Applications

Abstract

Big data processing is an urgent and unresolved challenge that stems from the intensive development of information technology. Recent techniques rapidly lose their effectiveness as data volumes grow. In this article, we set out our view of the basic approaches and models for solving problems that involve processing large volumes of data. We introduce a two-stage decomposition of the problem of assessing management options. The first stage of our approach is a semantic analysis of textual information; the second stage finds association rules in a database, processes them with methods of mathematical statistics, and converts data and objectives into vector form. We propose processing collected news events with a semantic model that describes their key features and the interconnections between them in a specified subject area. Classification-based association rules then make it possible to assess the likelihood of a particular event given a set chain of events. The approach can be applied to the analysis of online news in a specified market segment.
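The idea of assessing the likelihood of an event from a chain of earlier events can be illustrated with the standard support and confidence measures used in association rule mining. The sketch below is not the authors' model: it is a minimal illustration that assumes each news item has already been reduced to a set of event labels, treats the chain as an unordered set, and uses hypothetical names (`support`, `confidence`, `news`, and the event labels themselves).

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every label in `items`."""
    wanted = frozenset(items)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    a = frozenset(antecedent)
    base = support(transactions, a)
    if base == 0.0:
        # The chain never occurs, so the rule cannot be evaluated.
        return 0.0
    return support(transactions, a | frozenset(consequent)) / base


# Hypothetical event labels extracted from four news items (assumed data).
news = [
    frozenset({"rate_hike", "currency_drop", "export_growth"}),
    frozenset({"rate_hike", "currency_drop"}),
    frozenset({"rate_hike", "export_growth"}),
    frozenset({"currency_drop", "export_growth"}),
]

chain = {"rate_hike", "currency_drop"}   # observed chain of events
target = {"export_growth"}               # event whose likelihood is assessed

print(support(news, chain))              # share of items containing the chain
print(confidence(news, chain, target))   # estimated likelihood of the target
```

In this toy data the chain appears in two of the four items and is followed by the target event in one of them, so the rule's confidence is one half. A real pipeline would also need to account for event order and for rules with too little support to be reliable.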

Availability of data and material

Data will be available on request.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Bakhtgerey Sinchev.

Ethics declarations

Conflict of interests

The authors declare that they have no conflict of interest.

Additional information

This article is part of the Topical Collection: Special Issue on Security of Mobile, Peer-to-peer and Pervasive Services in the Cloud

Guest Editors: B. B. Gupta, Dharma P. Agrawal, Nadia Nedjah, Gregorio Martinez Perez, and Deepak Gupta

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Imanbayev, K., Sinchev, B., Sibanbayeva, S. et al. Analysis and mathematical modeling of big data processing. Peer-to-Peer Netw. Appl. 14, 2626–2634 (2021). https://doi.org/10.1007/s12083-020-00978-3

  • DOI: https://doi.org/10.1007/s12083-020-00978-3
