Abstract:
In this study, the performance of the Word2Vec, Doc2Vec and FastText algorithms is compared on the text categorization problem using a semi-supervised learning technique. The impact of some preprocessing techniques is also analyzed on a corpus of approximately 5 million Turkish news documents, which includes both labeled and unlabeled documents. Naive Bayes, Support Vector Machine, Artificial Neural Network, Decision Tree and Logistic Regression classifiers are used in the classification phase, and the results obtained are reported.
Date of Conference: 24-26 April 2019
Date Added to IEEE Xplore: 22 August 2019
ISBN Information:
Print on Demand (PoD) ISSN: 2165-0608