DRN: Detection and Removal of Noisy Instances with Self Organizing Map

  • Conference paper
Pattern Recognition and Artificial Intelligence (ICPRAI 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13364)

Abstract

Identifying noisy instances is an effective way to improve the predictive performance of machine learning algorithms. Noise in a data set has two major negative consequences: (i) a decrease in classification accuracy and (ii) an increase in the complexity of the induced model. Removing noisy instances can therefore improve the performance of the induced models. However, noise identification can be especially challenging when learning complex functions, which often contain outliers. To detect such noise, we present DRN, a novel approach for detecting noisy instances. Our approach combines a self-organizing map (SOM) with a classifier in an ensemble. DRN can effectively distinguish between outliers and noisy instances. We evaluate the performance of the proposed algorithm using five different classifiers (viz. J48, Naive Bayes, Support Vector Machine, \(k\)-Nearest Neighbor, Random Forest) and 10 benchmark data sets from the UCI machine learning repository. Experimental results show that DRN removes noisy instances effectively and achieves better accuracy than the existing state-of-the-art algorithm on various data sets.
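
The abstract does not spell out how the SOM and the classifier are combined, so the following is only a minimal sketch of one possible reading, not the authors' DRN procedure. It assumes a tiny 1-D chain of SOM units, a leave-one-out \(k \)-NN vote standing in for the classifier, and a consensus rule that flags an instance only when both votes reject its label. The helper names (`train_som`, `knn_predict`, `flag_noisy`), the parameter values, and the toy data are all hypothetical, and the sketch makes no attempt to separate outliers from noisy instances.

```python
import math
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som(data, n_units=2, epochs=20, lr0=0.5, sigma0=0.3):
    """Train a 1-D chain of SOM units with online updates (assumed setup)."""
    n = len(data)
    # Deterministic initialisation: evenly spaced samples from the data.
    units = [list(data[i * (n - 1) // max(n_units - 1, 1)]) for i in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs  # decays from 1 towards 1/epochs, never 0
        lr, sigma = lr0 * frac, sigma0 * frac
        for x in data:
            # Best matching unit (BMU): the unit closest to the sample.
            bmu = min(range(n_units), key=lambda u: sq_dist(units[u], x))
            for u in range(n_units):
                # Gaussian neighbourhood on the chain, centred on the BMU.
                h = math.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))
                units[u] = [w + lr * h * (xi - w) for w, xi in zip(units[u], x)]
    return units


def knn_predict(data, labels, i, k=3):
    """Leave-one-out k-NN vote for instance i (stand-in classifier)."""
    others = sorted((j for j in range(len(data)) if j != i),
                    key=lambda j: sq_dist(data[j], data[i]))
    return Counter(labels[j] for j in others[:k]).most_common(1)[0][0]


def flag_noisy(data, labels, n_units=2, k=3):
    """Return indices whose label is rejected by both the SOM and the k-NN vote."""
    units = train_som(data, n_units=n_units)
    bmus = [min(range(n_units), key=lambda u: sq_dist(units[u], x)) for x in data]
    # Majority label among the instances mapped to each SOM unit.
    unit_label = {
        u: Counter(l for l, b in zip(labels, bmus) if b == u).most_common(1)[0][0]
        for u in set(bmus)
    }
    return [i for i, y in enumerate(labels)
            if unit_label[bmus[i]] != y and knn_predict(data, labels, i, k) != y]


# Toy data: two well-separated clusters; index 2 is deliberately mislabelled.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
        (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels = [0, 0, 1, 0, 0, 1, 1, 1, 1, 1]
noisy = flag_noisy(data, labels)
print(noisy)
```

On this toy input the only instance rejected by both votes is the mislabelled one at index 2. Requiring agreement between the two votes is a conservative design choice: an instance that merely sits in a sparse region but carries a plausible label is less likely to be discarded than under either vote alone.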

Notes

  1. https://github.com/chobi21/DRN-Noise-Identification

Acknowledgments

The authors thank the anonymous reviewers whose suggestions helped to clarify and improve our paper. This work was supported in part by the National Science Foundation under grant number OIA-1946231 and the Louisiana Board of Regents for the Louisiana Materials Design Alliance (LAMDA).

Author information

Corresponding author

Correspondence to Rashida Hasan.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Hasan, R., Chu, CH.H. (2022). DRN: Detection and Removal of Noisy Instances with Self Organizing Map. In: El Yacoubi, M., Granger, E., Yuen, P.C., Pal, U., Vincent, N. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2022. Lecture Notes in Computer Science, vol 13364. Springer, Cham. https://doi.org/10.1007/978-3-031-09282-4_40

  • DOI: https://doi.org/10.1007/978-3-031-09282-4_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-09281-7

  • Online ISBN: 978-3-031-09282-4

  • eBook Packages: Computer Science, Computer Science (R0)
