Optimal Subspace Analysis Based on Information-Entropy Increment

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1320)

Abstract

As data structures grow more complex and data sets grow larger, traditional outlier detection methods show strong limitations and instability in high-dimensional data environments. To address these problems, an optimal subspace analysis based on information-entropy increment is proposed. Concepts such as mutual information and dimensional entropy are used to redefine the indicators that measure the quality of a clustering subspace, to optimize the objective function of the clustering subspace, and to obtain the optimal subspace. According to the idea of dividing the information-entropy increment by one, an entropy outlier score is proposed as a metric for detecting outliers in the optimal subspace. Finally, experiments verify the effectiveness of the algorithm.
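
The abstract gives no formulas, so the following is only an illustrative sketch of how an entropy-change score might be computed, not the authors' algorithm. It assumes discrete (or already discretized) attributes, assumes the subspace is given as a list of column indices, and reads the entropy-increment idea as a leave-one-out comparison: the score of an object is the drop in per-dimension Shannon entropy when that object is removed. The names `shannon_entropy` and `entropy_decrease_scores` are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    if n == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_decrease_scores(rows, dims):
    """Leave-one-out entropy change per row (assumed scoring rule).

    For each row, sum over the chosen dimensions of
    H(column) - H(column without that row). A large positive value
    means that removing the row makes the data more regular.
    """
    scores = []
    for i in range(len(rows)):
        score = 0.0
        for d in dims:
            column = [r[d] for r in rows]
            rest = column[:i] + column[i + 1:]
            score += shannon_entropy(column) - shannon_entropy(rest)
        scores.append(score)
    return scores


# Toy data: four identical objects and one that differs in both dimensions.
rows = [("a", "x"), ("a", "x"), ("a", "x"), ("a", "x"), ("b", "y")]
scores = entropy_decrease_scores(rows, dims=[0, 1])
outlier_index = max(range(len(rows)), key=scores.__getitem__)
```

In this toy data set the last object receives the highest score, because removing it leaves every remaining column constant. Continuous attributes would first need a fixed binning so that entropies before and after removal are comparable.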

Author information

Corresponding author

Correspondence to Zhongping Zhang.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, Z., Liu, I., Zhang, Y., Zhang, J., Tian, M. (2021). Optimal Subspace Analysis Based on Information-Entropy Increment. In: Mei, H., et al. Big Data. BigData 2020. Communications in Computer and Information Science, vol 1320. Springer, Singapore. https://doi.org/10.1007/978-981-16-0705-9_8

  • DOI: https://doi.org/10.1007/978-981-16-0705-9_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-0704-2

  • Online ISBN: 978-981-16-0705-9

  • eBook Packages: Computer Science, Computer Science (R0)