Abstract
The Internet of Things (IoT) is an emerging paradigm that offers remarkable opportunities for data mining and analysis. IoT envisions a world in which every device that can be connected to the internet, including smartphones, vehicles, public service facilities, and home appliances, acts as a data source. Even today, many electronic devices, including watches, emergency alarms, parking doors, and household appliances, can be linked to IoT systems and controlled remotely. Big data analysis and data mining methods can improve the performance of IoT systems and address their challenges in data storage, processing, and analysis. Combining IoT with big data makes it possible to accumulate very large volumes of data and transform them into valuable knowledge using data mining techniques. Against this background, this paper provides a systematic survey of the literature on the use of big data analytics and data mining methods in IoT. The review aims to identify the lines of research that deserve more attention in future work. To this end, 60 articles published between 2010 and 2021 on IoT-based big data and IoT-based data mining were reviewed. These articles fall into four general categories by focus: architecture/platform, framework, applications, and security. The paper summarizes the methods used in IoT-based big data analysis and IoT-based data mining in these four categories to highlight promising avenues for future research.
Availability of data and material
All data and materials generated or analyzed during this study were included in this published article.
Code availability
There is no code for this study.
Funding
No funding to declare.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Zhong, Y., Chen, L., Dan, C. et al. A systematic survey of data mining and big data analysis in internet of things. J Supercomput 78, 18405–18453 (2022). https://doi.org/10.1007/s11227-022-04594-1
DOI: https://doi.org/10.1007/s11227-022-04594-1