Enhanced Non-parametric Sequence-based Learning Algorithm for Outlier Detection in the Internet of Things

Neural Processing Letters

Abstract

Although outlier detection has long been an active research area, few studies address the Internet of Things (IoT) domain. Many critical decisions in daily business operations depend on data collected over time, so the correctness, integrity, and accuracy of that data must be guaranteed before any further processing. Most earlier algorithms assume that every outlier is an Error caused by a faulty sensor. This assumption has been investigated, and results show that, with the support of a non-parametric sequence-based learning algorithm, outliers can be classified into Error and Event types. Event-type outliers are mainly caused by abnormalities in sensor readings; they are important and should not be ignored. However, the non-parametric sequence-based approach and other existing techniques still struggle to detect outliers in the global search space of a large dataset. Therefore, this paper proposes an Enhanced Non-parametric Sequence-based Learning Algorithm, built on ensemble clustering techniques, to detect Event and Error outliers in large datasets. Experiments are conducted on six different datasets, all from the UCI repository except one collected from a laboratory testbed, to demonstrate the robustness and effectiveness of the proposed approach over existing techniques. The results show a remarkable performance of 96.653% accuracy, 94.284% precision, and 98.112% for Error outlier detection. The approach also performs better in Event outlier detection, with 87.611% accuracy, 71.141% precision, and 85.755% specificity, at an execution time of 1291 s.
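The accuracy, precision, and specificity figures quoted in the abstract are standard confusion-matrix ratios. The sketch below shows how such ratios are conventionally computed when "outlier" is treated as the positive class. The `ConfusionCounts` type, the helper names, and the counts in `example` are illustrative assumptions, not values or code from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts with 'outlier' as the positive class."""
    tp: int  # outliers correctly flagged
    fp: int  # normal readings wrongly flagged as outliers
    tn: int  # normal readings correctly left unflagged
    fn: int  # outliers that were missed


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def accuracy(c: ConfusionCounts) -> float:
    """Share of all readings that were labelled correctly."""
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


def precision(c: ConfusionCounts) -> float:
    """Share of flagged readings that really were outliers."""
    return _ratio(c.tp, c.tp + c.fp)


def specificity(c: ConfusionCounts) -> float:
    """Share of normal readings that were correctly left unflagged."""
    return _ratio(c.tn, c.tn + c.fp)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)

if __name__ == "__main__":
    print(f"accuracy    = {accuracy(example):.3%}")
    print(f"precision   = {precision(example):.3%}")
    print(f"specificity = {specificity(example):.3%}")
```

Because outliers are usually rare, accuracy alone can look high even when many flagged readings are false alarms, which is one reason precision and specificity are reported alongside it.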

References

  1. Kamal S, Ramadan RA, El-Refai F (2016) Smart outlier detection of wireless sensor network. Electron Energies 29:383–393

  2. Graham B (2019) Frozen speed sensors may be blame for the Russian plane crash that killed 71 people. News.com.au Publishing website. https://www.news.com.au/travel/travel-updates/incidents/. Accessed Nov 2019

  3. Deng X, Jiang P, Peng X, Mi C (2019) An intelligent outlier detection method with one class support tucker machine and genetic algorithm toward big sensor data in Internet of Things. IEEE Trans Ind Electron 66:4672–4683

  4. Nesa N, Ghosh T, Banerjee I (2018) Non-parametric sequence-based learning approach for outlier detection in IoT. Future Gener Comput Syst 82:412–421

  5. Erfani SM, Rajasegarar S, Karunasekera S, Leckie C (2016) High-dimensional and large-scale anomaly detection using a linear one-class SVM with deep learning. Pattern Recogn 58:121–134

  6. Lyu L, Jin J, Rajasegarar S, He X, Palaniswami M (2017) Fog-empowered anomaly detection in IoT using hyperellipsoidal clustering. IEEE Internet Things J 4:1174–1184

  7. Zhang R, Ji P, Mylaraswamy D, Srivastava M, Zahedi S (2013) Cooperative sensor anomaly detection using global information. Tsinghua Sci Technol 18:209–219

  8. Yu T, Wang X, Shami A (2017) Recursive principal component analysis-based data outlier detection and sensor data aggregation in IoT systems. IEEE Internet Things J 4:2207–2216

  9. Kong X, Chang J, Niu M, Huang X, Wang J, Chang SI (2018) Research on real time feature extraction method for complex manufacturing big data. Int J Adv Manuf Technol 99:1101–1108

  10. Yu J, Rui Y, Tao D (2014) Click prediction for web image reranking using multimodal sparse coding. IEEE Trans Image Process 23(5):2019–2032

  11. Yu J, Rui Y, Chen B (2014) Exploiting click constraints and multi-view features for image re-ranking. IEEE Trans Multimed 16(1):159–168

  12. Yu J, Tan M, Zhang H, Tao D, Rui Y (2019) Hierarchical deep click feature prediction for fine-grained image recognition. IEEE Trans Pattern Anal Mach Intell 1–14 (In Press)

  13. Razzak I, Zafar K, Imran M, Xu G (2020) Randomized nonlinear one-class support vector machines with bounded loss function to detect outliers for large scale IoT data. Future Gener Comput Syst 112:715–723

  14. Simar L, Wilson PW (2015) Statistical approaches for non-parametric frontier models: a guided tour. Int Stat Rev 83:77–110

  15. Shahid N, Naqvi IH, Qaisar SB (2015) Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: a survey. Artif Intell Rev 43:193–228

  16. Zhang Y, Ren J, Liu J, Xu C, Guo H, Liu Y (2017) A survey on emerging computing paradigms for big data. Chin J Electron 26:1–12

  17. Thuc KX, Insoo K (2011) A collaborative event detection scheme using fuzzy logic in clustered wireless sensor networks. AEU Int J Electron Commun 65:485–488

  18. Ayadi A, Ghorbel O, Obeid AM, Abid M (2017) Outlier detection approaches for wireless sensor networks: a survey. Comput Netw 129:319–333

  19. Fan H, Zaïane OR, Foss A, Wu J (2009) Resolution-based outlier factor: detecting the top-n most outlying data points in engineering data. Knowl Inf Syst 19:31–51

  20. Tsai C-H, Chang C-L, Chen L (2003) Applying grey relational analysis to the vendor evaluation model. Int J Comput Internet Manag 11:45–53

  21. Modi K, Oza B (2016) Outlier analysis approaches in data mining. Int J Innov Res Technol 3:6–12

  22. Zhao Z, Chen W, Wu X, Chen PC, Liu J (2017) LSTM network: a deep learning approach for short-term traffic forecast. IET Intell Transp Syst J 11:68–75

  23. Ajitha P, Chandra E (2015) A survey on outliers detection in distributed data mining for big data. J Basic Appl Sci Res 5:31–38

  24. Fan H, Zaïane OR, Foss A, Wu J (2006) A nonparametric outlier detection for effectively discovering Top-N outliers from engineering data. In: Pacific-Asia conference on knowledge discovery and data mining, pp 557–566

  25. Shekhar S, Lu CT, Zhang P (2003) A unified approach to detecting spatial outliers. GeoInform J 7:139–166

  26. Bouguettaya A, Yu Q, Liu X, Zhou X, Song A (2015) Efficient agglomerative hierarchical clustering. Expert Syst Appl 42:2785–2797

  27. Manogaran G, Vijayakumar V, Varatharajan R, Kumar PM, Sundarasekar R, Hsu CH (2018) Machine learning based big data processing framework for cancer diagnosis using hidden Markov model and GM clustering. Wirel Personal Commun 102:2099–2116

  28. Yang MS, Lai CY, Lin CY (2012) A robust EM clustering algorithm for Gaussian mixture models. Pattern Recogn 45:3950–3961

  29. Wang Y, Chen W, Zhang J, Dong T, Shan G, Chi X (2011) Efficient volume exploration using the Gaussian mixture model. IEEE Trans Vis Comput Gr 17:1560–1573

  30. Das D (2018) Time and space complexity of algorithm: asymptotic notation. https://www.csetutor.com/time-complexity-and-space-complexity-of-an-algorithm/. Accessed 01 Dec 2019

  31. Dua D, Taniskidou EK (2018) UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences. https://archive.ics.uci.edu/ml/index.php. Accessed 04 Dec 2019

  32. Tripathy A, Agrawal A, Rath SK (2016) Classification of sentiment reviews using n-gram machine learning approach. Expert Syst Appl 57:117–126

  33. de Souza PS, dos Santos Marques W, Rossi FD, da Cunha Rodrigues G, Calheiros RN (2017) Performance and accuracy trade-off analysis of techniques for anomaly detection in IoT sensors. In: International conference on information networking, Da Nang, Vietnam, pp 486–491

  34. Fonollosa J, Sheik S, Huerta R, Marco S (2015) Reservoir computing compensates slow response of chemosensor arrays exposed to fast varying gas concentrations in continuous monitoring. Sensors Actuators 215:618–629

  35. Burgués J, Jiménez-Soto JM, Marco S (2018) Estimation of limit of detection in semiconductor gas sensors through linearized calibration models. Anal Chim Acta 1013:13–25

  36. Anguita D, Ghio A, Oneto L (2013) A public domain dataset for human activity recognition using smartphones. In: European symposium on artificial neural networks, computational intelligence and machine learning, Bruges, Belgium, pp 437–442

  37. Raafat HM, Hossain MS, Essa E, Elmougy S, Tolba AS, Muhammad G, Ghoneim A (2017) Fog intelligence for real-time IoT sensor data analytics. IEEE Access 5:24062–24069

  38. Aljawarneh SA, Vangipuram R (2020) GARUDA: Gaussian dissimilarity measure for feature representation and anomaly detection in Internet of Things. J Supercomput 76:4376–4413

  39. Luo T, Nagarajan SG (2018) Distributed anomaly detection using autoencoder neural networks in WSN for IoT. In: IEEE international conference on communications, Kansas City, MO, USA, pp 1–6

  40. Bandyopadhyay S, Ukil A, Puri C, Singh R, Bose T, Pal A (2016) SensIPro: smart sensor analytics for Internet of Things. In: IEEE symposium on computers and communications (ISCC), Messina, Italy, pp 1–7

  41. Sharma V, You I, Kumar R (2017) Isma: intelligent sensing model for anomalies detection in cross platform OSNs with a case study on IoT. IEEE Access 5:3284–3301

Acknowledgement

Special thanks to Neshreen Nesa, a Research Scholar in the Department of Information Technology at IIEST Shibpur, India, for her tremendous support in providing a detailed explanation and the deliverables of the existing research work, of which she is the lead author. Our gratitude also goes to her co-authors on that work, Tania Ghosh and Indrajit Banerjee; our current study is based on it. We also thank Swapan Shakhari for retrieving the laboratory dataset used in this study. Our appreciation also goes to the Research Management Centre (RMC), Universiti Teknologi Malaysia, for supporting this work with the research grant (Q.J130000.2451.07G48), in collaboration with the Research Management Centre of Universiti Tun Hussein Onn Malaysia (K028) and Universiti Teknikal Malaysia Melaka (PJP/2018/FTMK-CACT/CRG/S01649).

Author information

Corresponding author

Correspondence to Abel Efetobor Edje.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Edje, A.E., Abd Latiff, S.M. & Chan, H.W. Enhanced Non-parametric Sequence-based Learning Algorithm for Outlier Detection in the Internet of Things. Neural Process Lett 53, 1889–1919 (2021). https://doi.org/10.1007/s11063-021-10473-2

  • DOI: https://doi.org/10.1007/s11063-021-10473-2