Top-k Distance-Based Outlier Detection on Uncertain Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9483)

Abstract

With the development of the Internet of Things, uncertain data has attracted growing research attention in recent years, and outlier detection is one of the significant branches of the emerging field of uncertain databases. In existing algorithms, parameters are difficult to set and scalability is poor on large data sets. To address these shortcomings, a top-k distance-based outlier detection algorithm on uncertain data is proposed. The algorithm applies dynamic programming to calculate the probability that an object is an outlier, which greatly improves efficiency. In addition, a virtual grid-based optimization approach is proposed to improve the algorithm's efficiency further. Theoretical analysis and experimental results indicate that the algorithm is feasible and efficient.
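
The abstract does not spell out the dynamic programming recurrence, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes a common existential-uncertainty model in which each record exists independently with a given probability, and it treats an object as an outlier when at most k of the records within a distance radius of it exist. The names outlier_probability, top_k_outliers, radius and k_neighbors are hypothetical, and the virtual grid optimization is not shown.

```python
import heapq
from math import dist


def outlier_probability(neighbor_probs, k):
    """Probability that at most k of the independent neighbours exist.

    Assumption: each neighbour exists independently with the given
    probability (possible-world semantics with tuple-level uncertainty).
    """
    # dp[j] = probability that exactly j of the neighbours processed so far
    # exist; counts above k are dropped because they cannot yield an outlier.
    dp = [1.0] + [0.0] * k
    for p in neighbor_probs:
        # Iterate j downwards so dp[j - 1] still holds the previous value.
        for j in range(k, 0, -1):
            dp[j] = dp[j] * (1.0 - p) + dp[j - 1] * p
        dp[0] *= 1.0 - p
    return sum(dp)


def top_k_outliers(records, radius, k_neighbors, top_k):
    """Return the top_k (probability, index) pairs, highest probability first.

    records is a list of (point, existence_probability) pairs. The score is
    conditional on the object itself existing (an assumption of this sketch).
    """
    scored = []
    for i, (point, _) in enumerate(records):
        # Naive neighbour search; a grid index would prune this scan.
        probs = [
            p
            for j, (other, p) in enumerate(records)
            if j != i and dist(point, other) <= radius
        ]
        scored.append((outlier_probability(probs, k_neighbors), i))
    return heapq.nlargest(top_k, scored)


if __name__ == "__main__":
    data = [((0.0, 0.0), 0.9), ((0.1, 0.0), 0.8), ((0.0, 0.1), 0.7), ((5.0, 5.0), 0.9)]
    print(top_k_outliers(data, radius=1.0, k_neighbors=0, top_k=1))
```

The recurrence costs O(n·k) per object for n neighbours, instead of enumerating all 2^n possible worlds. Asking only for the top-k objects removes the need to choose a probability threshold.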

Author information

Corresponding author

Correspondence to Hongyuan Zheng.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, Y., Zheng, H., Ding, Q. (2015). Top-k Distance-Based Outlier Detection on Uncertain Data. In: Huang, Z., Sun, X., Luo, J., Wang, J. (eds) Cloud Computing and Security. ICCCS 2015. Lecture Notes in Computer Science, vol 9483. Springer, Cham. https://doi.org/10.1007/978-3-319-27051-7_45

  • DOI: https://doi.org/10.1007/978-3-319-27051-7_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27050-0

  • Online ISBN: 978-3-319-27051-7

  • eBook Packages: Computer Science, Computer Science (R0)
