
Automatic detection of boundary points based on local geometrical measures

  • Foundations
  • Soft Computing

Abstract

This paper presents an angle- and density-based data preprocessing method that simultaneously identifies outliers and boundary points (referred to collectively as boundary points). Detecting such points is often more interesting than detecting normal points, since they represent valid, interesting, and potentially valuable patterns. We propose an efficient local geometry-based method that detects them using both angle and density measures. By combining multiple features (angles and density), the unified measure is adaptive and stable, and it can be used to evaluate the degree to which a given point is a boundary point. Compared with two related state-of-the-art approaches, our method better reflects the characteristics of the data and achieves similar accuracy on more data sets. Experimental results on a number of synthetic and real-world data sets demonstrate the effectiveness and efficiency of our method.
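To make the idea of combining an angle measure with a density measure concrete, the following is a minimal sketch and not the measure defined in the paper. It assumes a k-nearest-neighbour neighbourhood, uses the length of the mean unit vector towards the neighbours as a stand-in angle term (near 0 when neighbours surround a point, near 1 when they lie on one side), and uses the relative mean neighbour distance as a stand-in density term. The function name `boundary_scores`, the parameter `k`, and the product used to combine the two terms are all illustrative assumptions.

```python
import numpy as np

def boundary_scores(X, k=8):
    """Hypothetical boundary score per row of X (higher = more boundary-like).

    Illustrative only: this is NOT the unified measure from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances; a point is never its own neighbour.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)

    # Indices and distances of the k nearest neighbours of every point.
    idx = np.argsort(dist, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    knn_dist = dist[rows, idx]                      # shape (n, k)

    # Angle term (assumed): length of the mean unit vector to the neighbours.
    vecs = X[idx] - X[:, None, :]                   # shape (n, k, d)
    units = vecs / np.maximum(knn_dist[:, :, None], 1e-12)
    angle_term = np.linalg.norm(units.mean(axis=1), axis=1)

    # Density term (assumed): mean k-NN distance relative to the data set mean.
    mean_d = knn_dist.mean(axis=1)
    density_term = mean_d / mean_d.mean()

    # Assumed combination: a simple product of the two terms.
    return angle_term * density_term


# Usage: on a 5 x 5 grid, a corner point scores higher than the centre point.
grid = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float)
scores = boundary_scores(grid, k=8)
corner_score, centre_score = scores[0], scores[12]
```

On a regular grid the centre point has neighbours in all directions, so its direction vectors cancel and its score is close to zero, while a corner point sees all of its neighbours in one quadrant and scores higher. A sparse, isolated point would also score high through the density term, which is the sense in which a single score can flag both boundary points and outliers.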



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61375065, 61502208 and 61602066), the Scientific Research Foundation of the Education Department of Sichuan Province (Grant No. 17ZA0063), and the Scientific Research Foundation of CUIT (Grant No. KYTZ201608). It was partially supported by the National Science Fund for Distinguished Young Scholars of China (Grant No. 61625204), the Sichuan Science and Technology Support Project (Grant No. 2014SZ0104), and the Natural Science Foundation of Jiangsu Province of China (Grant No. BK20150522).

Author information


Corresponding author

Correspondence to Jiancheng Lv.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Communicated by A. Di Nola.


About this article


Cite this article

Li, X., Wu, X., Lv, J. et al. Automatic detection of boundary points based on local geometrical measures. Soft Comput 22, 3663–3674 (2018). https://doi.org/10.1007/s00500-017-2817-y


  • DOI: https://doi.org/10.1007/s00500-017-2817-y
