Abstract:
Text clustering is the main approach to news topic extraction and trend tracking. To combine the strengths and offset the weaknesses of hierarchical clustering and K-means in text clustering, this paper proposes a new optimized news text clustering algorithm, QH-K (Quick Hierarchical K-means Clustering). First, a word2vec model is trained to turn the texts into word vectors. Then the proposed hierarchical clustering algorithm clusters the texts, and the proposed validity index ST determines the initial number of clusters and the cluster centers. Finally, the K-means algorithm refines the clustering results to improve the final clustering quality. Experiments show that QH-K improves accuracy, recall, and F-value compared with the traditional algorithm, and that it also reduces running time.
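The two-stage idea in the abstract (hierarchical clustering to supply initial centers, then K-means to refine them) can be illustrated with a minimal sketch. This is not the authors' QH-K implementation. It assumes the documents have already been turned into fixed-length vectors (for example by averaging word vectors), it uses plain centroid-linkage agglomerative merging in place of the paper's quick hierarchical step, and it takes the number of clusters `k` as an input because the abstract does not define the ST validity index. The names `hierarchical_centers` and `kmeans_refine` are hypothetical.

```python
import math

def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hierarchical_centers(vectors, k):
    """Merge the two closest clusters (by centroid) until k remain.

    Assumption: k is given; the paper instead selects it with its ST index.
    """
    clusters = [[v] for v in vectors]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [mean(c) for c in clusters]

def kmeans_refine(vectors, init_centers, max_iter=100):
    """Standard K-means iterations starting from the given centers."""
    centers = [list(c) for c in init_centers]  # do not mutate the input
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: dist(v, centers[c]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = mean(members)
    return labels, centers

# Toy "document vectors": two well-separated groups (illustrative data only).
docs = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
        [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
init = hierarchical_centers(docs, k=2)
labels, centers = kmeans_refine(docs, init)
```

Seeding K-means with centers from a hierarchical pass removes its dependence on random initialization, which is the motivation the abstract gives for combining the two methods. The pairwise merge loop here is quadratic per step and only suitable for small inputs.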
Published in: 2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS)
Date of Conference: 23-25 November 2018
Date Added to IEEE Xplore: 14 April 2019