Hybrid Data Clustering Based on Dependency Structure and Gibbs Sampling

Wang, Shuang-Cheng; Li, Xiao-Lin; Tang, Hai-Yan

doi:10.1007/11941439_138


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4304)

Abstract

A new method for data clustering is presented in this paper. It can effectively cluster data sets that contain both continuous and discrete data. In this method, the values of the cluster variable are treated as missing data. The missing values are first initialized randomly and are then revised iteratively by combining Gibbs sampling with a dependency structure, which is either built from prior knowledge or taken to be star-shaped. A penalty coefficient is introduced to extend the MDL scoring function, and the optimal number of clusters is determined using the extended MDL scoring function together with statistical methods.
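
The abstract gives only an outline of the procedure, so the following Python sketch is one plausible reading of it rather than the authors' implementation. It assumes a star-shaped structure (every feature depends only on the cluster variable), Gaussian models for continuous features, smoothed categorical tables for discrete features, and an extended MDL score of the form negative log-likelihood plus penalty * (parameters / 2) * log n. The function names, the smoothing constant alpha, the variance floor, and the exact form of the penalty term are all assumptions made for illustration.

```python
import math
import random


def log_joint(row_c, row_d, c, model):
    """log p(cluster=c, x) for a star-shaped structure (features independent given the cluster)."""
    prior, means, variances, tables = model
    lp = math.log(prior[c])
    for j, x in enumerate(row_c):  # continuous features: one Gaussian per cluster
        var = variances[c][j]
        lp += -0.5 * (math.log(2 * math.pi * var) + (x - means[c][j]) ** 2 / var)
    for j, v in enumerate(row_d):  # discrete features: one categorical table per cluster
        lp += math.log(tables[c][j][v])
    return lp


def fit(cont, disc, labels, k, n_values, alpha=1.0):
    """Re-estimate parameters from the data completed with the current cluster labels."""
    n = len(labels)
    prior, means, variances, tables = [], [], [], []
    for c in range(k):
        idx = [i for i in range(n) if labels[i] == c]
        prior.append((len(idx) + alpha) / (n + alpha * k))
        mu_c, var_c = [], []
        for j in range(len(cont[0])):
            xs = [cont[i][j] for i in idx]
            mu = sum(xs) / len(xs) if xs else 0.0  # fallback for an empty cluster (assumption)
            var = sum((x - mu) ** 2 for x in xs) / len(xs) if xs else 1.0
            mu_c.append(mu)
            var_c.append(max(var, 1e-3))  # variance floor (assumption) avoids degenerate clusters
        means.append(mu_c)
        variances.append(var_c)
        tables.append([
            [(sum(1 for i in idx if disc[i][j] == v) + alpha) / (len(idx) + alpha * n_values[j])
             for v in range(n_values[j])]
            for j in range(len(n_values))
        ])
    return prior, means, variances, tables


def gibbs_cluster(cont, disc, k, n_values, sweeps=50, seed=0):
    """Treat cluster labels as missing data: initialize randomly, then resample repeatedly."""
    rng = random.Random(seed)
    n = len(cont)
    labels = [rng.randrange(k) for _ in range(n)]
    for _ in range(sweeps):
        model = fit(cont, disc, labels, k, n_values)
        for i in range(n):
            logw = [log_joint(cont[i], disc[i], c, model) for c in range(k)]
            top = max(logw)  # subtract the maximum before exponentiating, for stability
            weights = [math.exp(w - top) for w in logw]
            labels[i] = rng.choices(range(k), weights=weights)[0]
    return labels, fit(cont, disc, labels, k, n_values)


def extended_mdl(cont, disc, model, k, n_values, penalty=1.0):
    """Assumed form: -log-likelihood + penalty * (number of parameters / 2) * log n."""
    n = len(cont)
    loglik = 0.0
    for i in range(n):
        logw = [log_joint(cont[i], disc[i], c, model) for c in range(k)]
        top = max(logw)
        loglik += top + math.log(sum(math.exp(w - top) for w in logw))
    n_params = (k - 1) + k * (2 * len(cont[0]) + sum(v - 1 for v in n_values))
    return -loglik + penalty * 0.5 * n_params * math.log(n)


# Toy hybrid data: one continuous and one binary feature, two well-separated groups.
data_rng = random.Random(1)
cont = [[data_rng.gauss(0, 1)] for _ in range(40)] + [[data_rng.gauss(8, 1)] for _ in range(40)]
disc = [[0] for _ in range(40)] + [[1] for _ in range(40)]
scores = {}
for k in (1, 2, 3, 4):
    labels, model = gibbs_cluster(cont, disc, k, n_values=[2])
    scores[k] = extended_mdl(cont, disc, model, k, n_values=[2])
best_k = min(scores, key=scores.get)  # candidate cluster number with the lowest score
```

In this sketch the parameters are refitted once per sweep from the completed data, and each label is then redrawn from its conditional distribution. Raising the penalty coefficient makes larger cluster numbers more expensive, which is the role the abstract assigns to it; a dependency structure built from prior knowledge would replace the per-feature independence assumed in log_joint.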



Author information

Authors and Affiliations

Department of Information Science, Shanghai Lixin University of Commerce, Shanghai, China
Shuang-Cheng Wang
China Lixin Risk Management Research Institute, Shanghai Lixin University of Commerce, Shanghai, China
Shuang-Cheng Wang & Hai-Yan Tang
National Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210093, China
Xiao-Lin Li


Editor information

Editors and Affiliations

DisPRR, National ICT Australia Ltd, QLD, Australia
Abdul Sattar
School of Computing, University of Tasmania, Sandy Bay, 7005, Tasmania, Australia
Byeong-ho Kang


Cite this paper

Wang, SC., Li, XL., Tang, HY. (2006). Hybrid Data Clustering Based on Dependency Structure and Gibbs Sampling. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_138


DOI: https://doi.org/10.1007/11941439_138
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49787-5
Online ISBN: 978-3-540-49788-2
eBook Packages: Computer Science, Computer Science (R0)
