Abstract
With the rapid growth in the number of social media users and the ever-increasing exchange of data between them, the era of big data has arrived. Integrating big data yields substantial benefits, which has made it an active research topic. However, big data drawn from multiple sources is heterogeneous, and this multi-source heterogeneity constrains integration. Moreover, the growing volume of social media data affects the efficiency of data integration. This study develops a novel framework for a data integration system that can manage the heterogeneity of massive social media data. The framework comprises four layers: a data source layer, an application layer, a resource layer, and a visualization layer. It establishes correlations between data stored in distributed data sources. We used RESTful APIs to give end users reliable and effective web-based access to the data through unique queries. The framework was evaluated on the firsthand impressions of test users, who answered a standardized set of questions after testing it with real-world inputs.
Data Availability
The materials used in this study are available at https://github.com/AlShomar/AlShomar-Big-Data-Integration-Framework.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Al-Shomar, A.M., Al-Qurish, M. & Aljedaani, W. A novel framework for remote management of social media big data analytics. Soc. Netw. Anal. Min. 12, 172 (2022). https://doi.org/10.1007/s13278-022-00996-4
DOI: https://doi.org/10.1007/s13278-022-00996-4