
Uncertain Data Clustering Based on Probability Distribution in Obstacle Space

Wireless Personal Communications

Abstract

Most existing similarity measures for clustering uncertain data rely on geometric distance and ignore the distribution of the data. To compensate for this shortcoming, this paper uses the KL distance (Kullback-Leibler divergence) as the similarity measure for uncertain data and proposes the DOUD_C algorithm for the discrete domain and the COUD_C algorithm for the continuous domain. To cluster high-dimensional data efficiently, the paper adopts a grid data structure and proposes the BROUD_C algorithm. By exploiting the adjacency of grid cells, the clustering process expands continuously, so the algorithm can find clusters of arbitrary shape and filter out a large number of isolated points, which effectively solves the uncertain data clustering problem in obstacle space. Experimental results show that, compared with the OBS_UK_means algorithm with VPA and SDA pruning and with the FOPTICS algorithm, BROUD_C achieves better clustering performance and shorter CPU execution time in obstacle space.
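To illustrate why a distribution-based measure can separate objects that geometric distance treats alike, the following is a minimal Python sketch of the KL divergence between two discrete distributions. It is not the paper's DOUD_C algorithm: the function names `kl_divergence` and `normalize`, the `eps` floor, and the three example histograms are assumptions made for illustration only.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1 (a discrete distribution)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def kl_divergence(p, q, eps=1e-12):
    """Return D(p || q) = sum_i p_i * log(p_i / q_i) for discrete p and q.

    Terms with p_i == 0 contribute nothing. q_i is floored at eps
    (an assumption of this sketch) so the result stays finite when
    q assigns zero probability to a bin that p uses.
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            total += pi * math.log(pi / max(qi, eps))
    return total


# Three hypothetical uncertain objects, each a histogram over the same four bins.
a = normalize([8, 1, 1, 0])
b = normalize([1, 1, 8, 0])  # mass concentrated in a different bin than a
c = normalize([7, 2, 1, 0])  # shaped much like a

d_ab = kl_divergence(a, b)
d_ac = kl_divergence(a, c)
d_ca = kl_divergence(c, a)
```

In this example `d_ac` is smaller than `d_ab`, so object `c` would be considered more similar to `a` than `b` is. Note that the KL divergence is not symmetric in general (`d_ac` and `d_ca` differ), which a clustering algorithm built on it has to take into account.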

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 61370084 and by the Science and Technology Research Project of the Heilongjiang Provincial Education Department (No. 1253lz004).

Author information

Corresponding author

Correspondence to Jing Wan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wan, J., Cui, M., He, Y. et al. Uncertain Data Clustering Based on Probability Distribution in Obstacle Space. Wireless Pers Commun 111, 2191–2214 (2020). https://doi.org/10.1007/s11277-019-06980-0

  • DOI: https://doi.org/10.1007/s11277-019-06980-0
