A parameter based growing ensemble of self-organizing maps for outlier detection in healthcare

Cluster Computing

Abstract

Outlier detection is critical for many applications such as healthcare, health insurance, medical diagnosis, predictive analytics, pattern recognition, intrusion detection, anomaly or defect detection, video surveillance, credit card fraud detection and text mining. Outlier detection techniques may be statistical, distance-based or model-based. Techniques that rely on a single detection method each have their own strengths and weaknesses and are often unstable. Outlier detection ensembles combine the strengths of individual detectors to achieve more stable performance. This paper presents a new parameter-based growing self-organizing map ensemble (GSOME) for outlier detection in multivariate patterns. For outlier detection, the proposed GSOME transforms non-linear relationships between high-dimensional patterns into a simple 1D geometric relationship. Whatever its dimensionality, each pattern is mapped to a single point on a line. The dispersion of the mapped points is then used to locate the outliers and to measure their degree of outlyingness. Experiments on both real and synthetic data sets show promising performance for the proposed GSOME.
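The idea of mapping each pattern to a point on a line and reading outlyingness from the dispersion of those points can be illustrated with a minimal sketch. This is not the authors' GSOME: the abstract does not specify the 1D coordinate, the growth rule, or the combination scheme, so everything below is an assumption made for illustration. The sketch uses a small fixed-size ensemble of 1D self-organizing maps, takes each pattern's distance to its best-matching unit as its point on the line, averages that value over the ensemble, and scores dispersion with a robust z-score (median and MAD). The function names, the unit counts (3, 5, 7), the learning-rate schedule and the 3.5 cutoff are all hypothetical choices.

```python
import math
import random
import statistics


def train_som_1d(patterns, n_units, epochs=20, seed=0):
    """Train a 1D chain of SOM units with the classic online update rule."""
    rng = random.Random(seed)
    # Initialise unit weights as copies of randomly chosen training patterns.
    weights = [list(p) for p in rng.sample(patterns, n_units)]
    total_steps = epochs * len(patterns)
    step = 0
    for _ in range(epochs):
        order = list(range(len(patterns)))
        rng.shuffle(order)
        for i in order:
            x = patterns[i]
            frac = step / total_steps
            lr = 0.3 * (1.0 - frac) + 0.01 * frac             # decaying learning rate
            sigma = max(1.0, (n_units / 2.0) * (1.0 - frac))  # shrinking neighbourhood
            # Best-matching unit: the unit whose weights are closest to x.
            bmu = min(range(n_units), key=lambda u: math.dist(weights[u], x))
            for u in range(n_units):
                # Gaussian neighbourhood along the chain, centred on the BMU.
                h = math.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))
                weights[u] = [w + lr * h * (xj - w) for w, xj in zip(weights[u], x)]
            step += 1
    return weights


def quantization_error(x, weights):
    """Map a pattern to one scalar: its distance to the nearest SOM unit."""
    return min(math.dist(w, x) for w in weights)


def ensemble_outlier_scores(patterns, unit_counts=(3, 5, 7)):
    """Average the 1D mappings of several SOMs, then score by robust dispersion."""
    maps = [train_som_1d(patterns, k, seed=k) for k in unit_counts]
    errors = [
        statistics.fmean(quantization_error(x, w) for w in maps) for x in patterns
    ]
    median = statistics.median(errors)
    mad = statistics.median(abs(e - median) for e in errors)
    # Robust z-score: how far each mapped point lies from the bulk of the line.
    return [(e - median) / (1.4826 * mad) for e in errors]


# Synthetic demo (assumed data): 200 Gaussian inliers in 5-D plus one planted outlier.
rng = random.Random(42)
patterns = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
patterns.append([8.0] * 5)
scores = ensemble_outlier_scores(patterns)
flagged = [i for i, s in enumerate(scores) if s > 3.5]  # 3.5 is a conventional cutoff
print(flagged)
```

Averaging over maps of different sizes is one simple way to reduce the dependence on any single map's configuration, which is the stability argument the abstract makes for ensembles. A robust scale estimate is used so that the outliers themselves do not inflate the dispersion they are measured against.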

Acknowledgements

This work was supported by the Deanship of Scientific Research at King Saud University, Riyadh, Saudi Arabia, through the Research Group Project under Grant RG-1436-023.

Author information

Corresponding author

Correspondence to M. Shamim Hossain.

Cite this article

Elmougy, S., Hossain, M.S., Tolba, A.S. et al. A parameter based growing ensemble of self-organizing maps for outlier detection in healthcare. Cluster Comput 22 (Suppl 1), 2437–2460 (2019). https://doi.org/10.1007/s10586-017-1327-0

  • DOI: https://doi.org/10.1007/s10586-017-1327-0
