Hierarchical Topic Modeling for Urdu Text Articles

Abstract:

Digital text on the Internet is growing rapidly with the widespread use of social media. Extracting useful information from this text is challenging because of its high dimensionality, sparseness, and sheer volume. In this paper, we study Hierarchical Latent Dirichlet Allocation (hLDA), a powerful nonparametric Bayesian topic model, and address the problem of learning topic hierarchies from Urdu text data. The presented topic model for Urdu combines preprocessing steps, the hLDA model, and the Gibbs Sampling (GS) algorithm. We call this hLDA-based topic model Urdu Hierarchical Latent Dirichlet Allocation (uhLDA). An empirical study shows that uhLDA effectively learns topic hierarchies from 5000 Urdu text documents. Furthermore, we evaluate the results using Pointwise Mutual Information (PMI), which shows that uhLDA outperforms the standard topic model LDA.

Published in: 2019 25th International Conference on Automation and Computing (ICAC)

Date of Conference: 05-07 September 2019

Date Added to IEEE Xplore: 11 November 2019

DOI: 10.23919/IConAC.2019.8895047

Conference Location: Lancaster, UK
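
The abstract names Pointwise Mutual Information (PMI) as the evaluation measure but does not say which formulation was used. The block below is a minimal sketch of one common variant, assuming PMI is averaged over all pairs of a topic's top words and that probabilities are estimated from document-level co-occurrence in a reference corpus. The function name `pmi_coherence`, the smoothing constant `eps`, and the toy corpus are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def pmi_coherence(top_words, documents, eps=1e-12):
    """Average pairwise PMI of a topic's top words.

    Assumption: probabilities are document frequencies in `documents`,
    where each document is a list of tokens. `eps` avoids log(0) for
    word pairs that never co-occur.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def doc_freq(*words):
        # Number of documents containing every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1 = doc_freq(w1) / n_docs
        p2 = doc_freq(w2) / n_docs
        p12 = doc_freq(w1, w2) / n_docs
        if p1 == 0 or p2 == 0:
            continue  # skip words absent from the reference corpus
        scores.append(math.log((p12 + eps) / (p1 * p2)))

    # A topic with fewer than two usable words gets a neutral score.
    return sum(scores) / len(scores) if scores else 0.0


# Toy tokenized corpus (placeholder tokens, not real Urdu data).
toy_docs = [["a", "b"], ["a", "b"], ["c"], ["d"]]
together = pmi_coherence(["a", "b"], toy_docs)  # words that always co-occur
apart = pmi_coherence(["a", "c"], toy_docs)     # words that never co-occur
```

Under this formulation a higher score means the topic's top words tend to appear in the same documents, so comparing models amounts to comparing the average score over their topics.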
