A Novel Graph Partitioning Criterion Based Short Text Clustering Method

  • Conference paper
Intelligent Computing Methodologies (ICIC 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9773)

Abstract

A novel clustering method based on spectral clustering theory and a spectral cut criterion is proposed, motivated by an analysis of the characteristics of short text and the shortcomings of existing clustering algorithms. First, a weighted undirected graph is constructed according to spectral clustering theory, the similarity between each pair of nodes is computed on the graph, and a symmetric document similarity matrix is built, which supplies all the information the clustering algorithm needs. Inspired by the greedy strategy, we use Prim's algorithm to develop the PrimMAE algorithm, which partitions the graph into two parts, with RMcut serving as the termination condition of the partitioning process. PrimMAE is then fed into the CASC algorithm to cut the document set iteratively. High-quality clustering results demonstrate the effectiveness of the new clustering algorithm.
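
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define RMcut, PrimMAE, or CASC, so the code below makes three labeled assumptions. Edge weights are cosine similarities over raw term counts, one side of the partition is grown in a Prim-like greedy order (always absorbing the outside node with the strongest edge to the current side), and the standard normalized cut is used as a stand-in for the RMcut criterion when choosing the split. All function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity_matrix(docs):
    """Build a symmetric document similarity matrix (diagonal left at 0).

    Assumption: whitespace tokenization and raw term counts; the paper's
    actual similarity measure is not given in the abstract.
    """
    vectors = [Counter(d.lower().split()) for d in docs]
    norms = [math.sqrt(sum(c * c for c in v.values())) for v in vectors]
    n = len(docs)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(c * vectors[j][t] for t, c in vectors[i].items())
            denom = norms[i] * norms[j]
            sim[i][j] = sim[j][i] = dot / denom if denom else 0.0
    return sim


def normalized_cut(sim, part_a, part_b):
    """Normalized cut of a two-way split (stand-in for RMcut)."""
    n = len(sim)
    cut = sum(sim[i][j] for i in part_a for j in part_b)
    assoc_a = sum(sim[i][j] for i in part_a for j in range(n))
    assoc_b = sum(sim[i][j] for i in part_b for j in range(n))
    if assoc_a == 0 or assoc_b == 0:
        return float("inf")  # a side with no edges is not a useful cluster
    return cut / assoc_a + cut / assoc_b


def prim_bipartition(sim, start=0):
    """Grow one side in Prim order and keep the split with the lowest cut."""
    n = len(sim)
    visited = [start]
    remaining = set(range(n)) - {start}
    # attach[j] = strongest edge from outside node j to the visited side
    attach = {j: sim[start][j] for j in remaining}
    best_score, best_split = float("inf"), (list(visited), [])
    while remaining:
        score = normalized_cut(sim, visited, sorted(remaining))
        if score < best_score:
            best_score, best_split = score, (list(visited), sorted(remaining))
        # Greedy Prim step; ties are broken toward the smaller index
        nxt = max(remaining, key=lambda j: (attach[j], -j))
        remaining.remove(nxt)
        del attach[nxt]
        visited.append(nxt)
        for j in remaining:
            attach[j] = max(attach[j], sim[nxt][j])
    return best_split


docs = [
    "cat dog pet", "dog pet food", "cat pet toy",
    "stock market price", "market price crash", "stock price bond",
]
sim = cosine_similarity_matrix(docs)
part_a, part_b = prim_bipartition(sim)
```

On this toy input the two topic groups share no terms, so the sketch separates the first three documents from the last three. Applying `prim_bipartition` again to each part would approximate the iterative cutting the abstract attributes to CASC, though the stopping rule for that recursion is not specified in the abstract.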

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant No. 61272088, the Natural Science Foundation for Young Scientists of Gansu Province, China (Grant Nos. 1308TJY085, 145RJYA259), and the Youth Teacher Scientific Capability Promoting Project of Northwest Normal University (No. NWNU-LKQN-13-23).

Author information

Corresponding author

Correspondence to XiaoHong Li.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Li, X., He, T., Ran, H., Lu, X. (2016). A Novel Graph Partitioning Criterion Based Short Text Clustering Method. In: Huang, DS., Han, K., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2016. Lecture Notes in Computer Science, vol 9773. Springer, Cham. https://doi.org/10.1007/978-3-319-42297-8_32

  • DOI: https://doi.org/10.1007/978-3-319-42297-8_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42296-1

  • Online ISBN: 978-3-319-42297-8

  • eBook Packages: Computer Science, Computer Science (R0)
