GO-PEAS: A Scalable Yet Accurate Grid-Based Outlier Detection Method Using Novel Pruning Searching Techniques

  • Conference paper
Artificial Life and Computational Intelligence (ACALCI 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9592)

Abstract

In this paper, we propose GO-PEAS (Grid-based Outlier detection with Pruning Searching techniques), a scalable yet accurate grid-based outlier detection method. GO-PEAS incorporates novel pruning techniques that eliminate unnecessary regions of the data space from the search, substantially improving detection speed and making the method more scalable to large data sources. Furthermore, the detection accuracy of GO-PEAS is guaranteed to be consistent with that of its baseline version, which does not use these enhancement techniques. Experimental evaluation demonstrates the improved scalability and good effectiveness of GO-PEAS.
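The abstract does not spell out the pruning techniques themselves, so the following is only an illustrative sketch of the general idea, not the GO-PEAS algorithm. It assumes a simple distance-based outlier definition (a point is an outlier if fewer than `min_neighbors` other points lie within `radius`) and a classic cell-based pruning rule: with grid cells whose diagonal is at most `radius / 2`, every point in a cell holding more than `min_neighbors` points is certainly not an outlier and can be skipped. The function names, parameters, and sample data are all hypothetical.

```python
import math
from collections import defaultdict


def neighbor_count(points, i, radius):
    """Count the other points lying within `radius` of points[i]."""
    return sum(
        1
        for j, q in enumerate(points)
        if j != i and math.dist(points[i], q) <= radius
    )


def brute_force_outliers(points, radius, min_neighbors):
    """Baseline without pruning: test every point against every other point."""
    return {
        i
        for i in range(len(points))
        if neighbor_count(points, i, radius) < min_neighbors
    }


def grid_pruned_outliers(points, radius, min_neighbors):
    """Same definition as the baseline, but dense grid cells are pruned."""
    dim = len(points[0])
    # Cell side chosen so that the cell diagonal equals radius / 2;
    # any two points sharing a cell are therefore within `radius`.
    side = radius / (2 * math.sqrt(dim))

    cells = defaultdict(list)
    for i, p in enumerate(points):
        cells[tuple(math.floor(x / side) for x in p)].append(i)

    outliers = set()
    for members in cells.values():
        # Pruning step: each point in this cell already has at least
        # `min_neighbors` neighbors inside the cell, so none is an outlier.
        if len(members) > min_neighbors:
            continue
        for i in members:
            if neighbor_count(points, i, radius) < min_neighbors:
                outliers.add(i)
    return outliers


# Hypothetical data: a tight cluster of 20 points plus one isolated point.
points = [(0.01 * i, 0.01 * i) for i in range(20)] + [(10.0, 10.0)]
assert grid_pruned_outliers(points, 1.0, 3) == brute_force_outliers(points, 1.0, 3)
```

Because the pruning rule only skips points that the baseline would also label as non-outliers, the two functions return the same set. This mirrors, in a simplified setting, the abstract's claim that pruning improves speed without changing detection results.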

Acknowledgement

The authors gratefully acknowledge support from the National Science Foundation of China through research projects No. 61370050, No. 61572036 and No. 61363030, and from the Guangxi Key Laboratory of Trusted Software (No. kx201527).

Author information

Corresponding author

Correspondence to Ji Zhang.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Li, H., Zhang, J., Luo, Y., Chen, F., Chang, L. (2016). GO-PEAS: A Scalable Yet Accurate Grid-Based Outlier Detection Method Using Novel Pruning Searching Techniques. In: Ray, T., Sarker, R., Li, X. (eds) Artificial Life and Computational Intelligence. ACALCI 2016. Lecture Notes in Computer Science, vol 9592. Springer, Cham. https://doi.org/10.1007/978-3-319-28270-1_11

  • DOI: https://doi.org/10.1007/978-3-319-28270-1_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28269-5

  • Online ISBN: 978-3-319-28270-1

  • eBook Packages: Computer Science, Computer Science (R0)
