Abstract
The Internet of Things (IoT) is a network of physical objects that communicate with one another through advanced communication technologies without human intervention. Effective data classification in IoT can uncover new and hidden knowledge and thereby benefit the medical field. In this paper, big data analytics for an IoT-based healthcare system is developed using the Random Forest Classifier (RFC) and the MapReduce process. E-health data collected from patients suffering from different diseases are considered for analysis. The optimal attributes are selected from the database using the Improved Dragonfly Algorithm (IDA) to improve classification. Finally, the RFC is used to classify the e-health data on the basis of the optimal features. The implementation results show that the maximum precision of the proposed technique is 94.2%. To verify the effectiveness of the proposed method, different performance measures are analyzed and compared with those of existing methods.
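The precision measure reported in the abstract is not defined in this section. Assuming it follows the usual definition (true positives divided by all predicted positives), a minimal sketch of the computation is given below. The function name and the counts in the usage note are illustrative assumptions and are not taken from the paper's experiments.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of predicted positives that are actually positive.

    Assumption: the standard definition TP / (TP + FP); the paper's own
    metric definition is not shown in this section.
    """
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        raise ValueError("precision is undefined when nothing is predicted positive")
    return true_positives / predicted_positive
```

For example, with hypothetical counts of 45 true positives and 5 false positives, `precision(45, 5)` gives 0.9.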
Abbreviations
- \(S{p_i}\): Separation of the \(i\)th individual
- \(P\): Current position
- \({P_k}\): Position of the \(k\)th individual
- \(N\): Total number of neighboring individuals in the search space
- \({A_{li}}\): Alignment of the \(i\)th neighboring individual
- \({V_k}\): Velocity of the \(k\)th individual
- \({P^ - }\): Position of the enemy
- \({P^+}\): Position of the food source
- \(sw\): Separation weight
- \(aw\): Alignment weight
- \(cw\): Cohesion weight
- \(Att\): Attraction, food factor
- \(Dis\): Distraction, enemy factor
- \(w\_CR\): Inertia weight-crossover rate
- \(t\): Iteration count
- \({f_{\text{max} }}\): Largest fitness value
- \({f_p}\): Larger fitness value of the two individuals to be crossed
- \({f_{avg}}\): Average fitness
- \({f_{}}\): Fitness of the individual to be mutated
- \({R_1},{R_2}\): Random values
- \(V1,V2\): Random vectors that indicate the probability
- \(F\): Margin function
- \(I(\,)\): Indicator function
- \({\arg _k}I({h_k}(V1)\): \({h_k}\) is the \(k\)th tree of the RF
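Several of the symbols above describe the swarm behaviours of the dragonfly algorithm. As a rough illustration, the sketch below applies one position update in the standard (unimproved) dragonfly formulation. The improved variant used in the paper, including the \(w\_CR\) adaptation, is not reproduced here. The function name `dragonfly_step` and the weights `fw` and `ew` for the food and enemy terms are hypothetical names introduced only for this example.

```python
from typing import List, Sequence, Tuple


def dragonfly_step(
    p: Sequence[float],                      # P: current position
    v: Sequence[float],                      # current step (velocity) of this individual
    neighbors_p: Sequence[Sequence[float]],  # P_k: positions of the N neighbors
    neighbors_v: Sequence[Sequence[float]],  # V_k: velocities of the N neighbors
    food: Sequence[float],                   # P^+: position of the food source
    enemy: Sequence[float],                  # P^-: position of the enemy
    sw: float, aw: float, cw: float,         # separation, alignment, cohesion weights
    fw: float, ew: float,                    # hypothetical food and enemy weights
    w: float,                                # inertia weight
) -> Tuple[List[float], List[float]]:
    """One update of the standard dragonfly algorithm (assumed formulation)."""
    n = len(neighbors_p)
    if n == 0:
        # The standard algorithm falls back to a random walk here; omitted.
        raise ValueError("at least one neighbor is required in this sketch")
    dims = range(len(p))

    # Separation: move away from neighbors, -sum_k (P - P_k)
    sep = [-sum(p[d] - pk[d] for pk in neighbors_p) for d in dims]
    # Alignment: average neighbor velocity, sum_k V_k / N
    ali = [sum(vk[d] for vk in neighbors_v) / n for d in dims]
    # Cohesion: pull toward the neighborhood centre, sum_k P_k / N - P
    coh = [sum(pk[d] for pk in neighbors_p) / n - p[d] for d in dims]
    # Attraction toward food: P^+ - P
    att = [food[d] - p[d] for d in dims]
    # Distraction from the enemy: P^- + P
    dis = [enemy[d] + p[d] for d in dims]

    # Weighted sum of the five behaviours plus the inertia term
    new_v = [
        sw * sep[d] + aw * ali[d] + cw * coh[d]
        + fw * att[d] + ew * dis[d] + w * v[d]
        for d in dims
    ]
    new_p = [p[d] + new_v[d] for d in dims]
    return new_p, new_v
```

In a feature-selection setting, each position would typically be mapped to a subset of attributes and scored by a fitness function. That mapping is not described in this section, so the sketch leaves it out.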
Cite this article
Lakshmanaprabu, S.K., Shankar, K., Ilayaraja, M. et al. Random forest for big data classification in the internet of things using optimal features. Int. J. Mach. Learn. & Cyber. 10, 2609–2618 (2019). https://doi.org/10.1007/s13042-018-00916-z
DOI: https://doi.org/10.1007/s13042-018-00916-z