Randomizing Greedy Ensemble Outlier Detection with GRASP

  • Conference paper
Complex, Intelligent, and Software Intensive Systems (CISIS 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 611)

Abstract

Ensemble methods have recently been applied across many areas of machine learning, and outlier detection is one area where they have received increasing attention. This paper deals with randomization in ensemble methods for outlier detection. We develop a novel algorithm that uses stochastic local search heuristics to induce diversity in an ensemble outlier detection algorithm. Specifically, we exploit the ability of the GRASP (greedy randomized adaptive search procedure) heuristic to inject diversity into the search process and to maintain a good balance between exploitation and diversification while building the ensemble. The experiments show improvements over the greedy ensemble method and open the path for further research in this direction.
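
The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a GRASP-style construction phase could randomize a greedy ensemble selection. It rests on several assumptions that are not taken from the paper: each candidate detector is represented by a vector of outlier scores, ensemble quality is the Pearson correlation between the averaged scores and a hypothetical `target` vector, and the restricted candidate list (RCL) is controlled by a parameter `alpha`. All function and parameter names are illustrative, and the local search phase of GRASP is omitted.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant vector carries no ranking information
    return cov / (vx * vy) ** 0.5


def mean_scores(members):
    """Average the score vectors of the selected detectors, object by object."""
    return [sum(col) / len(col) for col in zip(*members)]


def grasp_construct(candidates, target, alpha, rng):
    """One greedy randomized construction pass (assumed setup, not the paper's).

    candidates: dict mapping a detector name to its score vector
    target:     hypothetical reference vector used to judge the ensemble
    alpha:      0.0 gives the pure greedy choice, 1.0 a fully random one
    """
    chosen, remaining = [], dict(candidates)
    best_quality = float("-inf")
    while remaining:
        # Quality the ensemble would reach if each remaining detector were added.
        gains = {
            name: pearson(
                mean_scores([candidates[c] for c in chosen] + [vec]), target
            )
            for name, vec in remaining.items()
        }
        hi, lo = max(gains.values()), min(gains.values())
        if hi <= best_quality:
            break  # no remaining detector improves the ensemble
        # Restricted candidate list: improving detectors close to the best gain.
        threshold = hi - alpha * (hi - lo)
        rcl = sorted(
            n for n, g in gains.items() if g >= threshold and g > best_quality
        )
        pick = rng.choice(rcl)
        chosen.append(pick)
        best_quality = gains[pick]
        del remaining[pick]
    return chosen, best_quality


def grasp_ensemble(candidates, target, alpha=0.3, iterations=20, seed=0):
    """Repeat the randomized construction and keep the best ensemble found."""
    rng = random.Random(seed)
    best = ([], float("-inf"))
    for _ in range(iterations):
        result = grasp_construct(candidates, target, alpha, rng)
        if result[1] > best[1]:
            best = result
    return best
```

With `alpha=0.0` the RCL contains only the top-ranked detector, so the sketch reduces to a plain greedy selection; larger values admit weaker but still improving detectors, which is one way to trade exploitation against diversification across restarts.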

Author information

Corresponding author

Correspondence to Lediona Nishani.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Nishani, L., Biba, M. (2018). Randomizing Greedy Ensemble Outlier Detection with GRASP. In: Barolli, L., Terzo, O. (eds) Complex, Intelligent, and Software Intensive Systems. CISIS 2017. Advances in Intelligent Systems and Computing, vol 611. Springer, Cham. https://doi.org/10.1007/978-3-319-61566-0_92

  • DOI: https://doi.org/10.1007/978-3-319-61566-0_92

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61565-3

  • Online ISBN: 978-3-319-61566-0

  • eBook Packages: Engineering, Engineering (R0)
