Abstract:
This paper introduces three classic statistical topic models: Latent Semantic Indexing (LSI), Probabilistic Latent Semantic Indexing (PLSI), and Latent Dirichlet Allocation (LDA). It then briefly describes a text classification method that uses the LDA model for text representation: each document is represented as a probability distribution over a fixed set of latent topics. A Support Vector Machine (SVM) is chosen as the classification algorithm. Finally, the evaluation metrics of the LDA with SVM classification system are higher than those of the two other methods, LSI with SVM and the Vector Space Model (VSM) with SVM, indicating better classification performance.
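The pipeline the abstract describes (LDA topic proportions as document features, followed by an SVM) might look roughly like the sketch below. This is not the authors' implementation: it assumes scikit-learn, and the toy corpus, the labels, the choice of two topics, and the linear kernel are all invented for illustration.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Toy corpus and labels, invented purely for illustration.
docs = [
    "the striker scored a goal in the football match",
    "the team won the football league match",
    "the goalkeeper saved a goal for the team",
    "the bank raised interest rates on loans",
    "stock market investors fear rising interest rates",
    "the bank reported loans and market profits",
]
labels = ["sport", "sport", "sport", "finance", "finance", "finance"]

# Step 1: bag-of-words term counts, the usual input to LDA.
counts = CountVectorizer().fit_transform(docs)

# Step 2: LDA maps each document to a distribution over a fixed number
# of latent topics (2 is an assumption made for this toy data).
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)

# Step 3: an SVM is trained on the topic proportions instead of raw counts.
# The linear kernel is an assumption; the abstract does not name one.
clf = SVC(kernel="linear")
clf.fit(doc_topics, labels)
predictions = clf.predict(doc_topics)

print(np.round(doc_topics, 3))  # one row per document, one column per topic
print(predictions)
```

Each row of `doc_topics` is the low-dimensional representation the classifier sees. Predictions here are made on the training documents only to show the data flow; a real comparison against LSI or VSM features would need a held-out test set.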
Date of Conference: 26-28 July 2011
Date Added to IEEE Xplore: 15 September 2011