An enumerated analysis of NoSQL data models using statistical tools

  • S.I.: Intelligence for Systems and Software Engineering
Innovations in Systems and Software Engineering

Abstract

The digital exploration of data in the modern technological world has paved the way for a new technology: big data. Big data technology handles the massive volume and variety of data generated at high speed through online and offline transactions in different sectors. The NoSQL data model is often found more suitable for big data because it does not suffer from the limitations of traditional relational database management system (RDBMS) models. This paper analyzes big data performance experimentally: a set of queries is executed over a public data set of 5 million records on three platforms, SQL Server 2012 (an RDBMS) and two NoSQL systems, Cassandra and MongoDB. The experimental results are then verified with two well-known methods, VIseKriterijumska Optimizacija I Kompromisno Resenje (VIKOR) and analysis of variance (ANOVA), to compare the performances from a practical perspective.
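
As a rough illustration of the two verification methods named in the abstract, the following Python sketch computes VIKOR scores for a small cost-type decision matrix and the F statistic of a one-way ANOVA. It is not the authors' implementation: the function names, the equal weights, the compromise parameter v = 0.5, the generic platform labels, and every number in the usage lines are assumptions made up for illustration, not measurements from the paper.

```python
def vikor(matrix, weights, v=0.5):
    """Return (S, R, Q) for a decision matrix whose criteria are all costs.

    matrix[i][j] is the value of alternative i on criterion j (lower is better,
    e.g. a query execution time). Lower Q means a better compromise ranking.
    """
    n = len(weights)
    best = [min(row[j] for row in matrix) for j in range(n)]
    worst = [max(row[j] for row in matrix) for j in range(n)]

    def gap(value, j):
        # Normalized distance from the best value on criterion j.
        span = worst[j] - best[j]
        return 0.0 if span == 0 else (value - best[j]) / span

    def scale(x, lo, hi):
        return 0.0 if hi == lo else (x - lo) / (hi - lo)

    # S: weighted sum of gaps (group utility); R: largest weighted gap (regret).
    S = [sum(weights[j] * gap(row[j], j) for j in range(n)) for row in matrix]
    R = [max(weights[j] * gap(row[j], j) for j in range(n)) for row in matrix]
    Q = [v * scale(s, min(S), max(S)) + (1 - v) * scale(r, min(R), max(R))
         for s, r in zip(S, R)]
    return S, R, Q


def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA over lists of observations."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical timings (seconds) for two queries on three unnamed platforms.
labels = ["platform_a", "platform_b", "platform_c"]
timings = [[10.0, 20.0], [5.0, 10.0], [8.0, 30.0]]
S, R, Q = vikor(timings, weights=[0.5, 0.5])
ranking = [name for _, name in sorted(zip(Q, labels))]
print(ranking)

# Hypothetical repeated measurements, one list per platform.
print(one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]))
```

In this sketch a lower Q marks the better compromise alternative, and a large F value suggests that the mean timings of the groups differ by more than the within-group variation would explain. Turning F into a p-value requires the F distribution, which is left out here to keep the example free of dependencies.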

Data availability

The data set used in this article is publicly available on Kaggle at https://www.kaggle.com/elemento/nyc-yellow-taxi-trip-data (last visited on 03.08.2022 at 09:00 am IST). The data are available for research.

Notes

  1. https://www.kaggle.com/elemento/nyc-yellow-taxi-trip-data Last visited on 03.08.2022 at 09:00 am (IST).

  2. https://github.com/ashisgitup/VIKOR_ANOVA.git.

Author information

Corresponding author

Correspondence to Nabendu Chaki.

Ethics declarations

Conflict of interest

Both authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Samanta, A.K., Chaki, N. An enumerated analysis of NoSQL data models using statistical tools. Innovations Syst Softw Eng 19, 5–14 (2023). https://doi.org/10.1007/s11334-022-00517-8

  • DOI: https://doi.org/10.1007/s11334-022-00517-8
