Multi-label Text Classification Using Semantic Features and Dimensionality Reduction with Autoencoders

  • Conference paper
Language, Data, and Knowledge (LDK 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10318)

Abstract

Feature selection is of vital concern in text classification because it reduces the high dimensionality of the feature space. The wide range of statistical techniques that have been proposed for weighting and selecting features lose the semantic relationships among concepts and ignore the dependencies and ordering between adjacent words. In this work we propose two techniques for incorporating semantics into feature selection. Furthermore, we use autoencoders to transform the features into a reduced feature space in order to analyse the performance penalty of feature extraction. Our extensive experiments on the EUR-Lex dataset showed that semantic-based feature selection techniques significantly outperform the Bag-of-Words (BOW) frequency-based feature selection method with term frequency/inverse document frequency (TF-IDF) feature weighting. In addition, even after an aggressive dimensionality reduction of the original features by a factor of 10, the autoencoders still produce better features than BOW with TF-IDF.
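As a rough illustration of the two building blocks named in the abstract, the sketch below computes BOW features with TF-IDF weighting and then compresses them by roughly a factor of 10 with a small autoencoder. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy corpus, the smoothed IDF variant, the L2 normalisation, the single linear hidden layer, the SGD settings and all function names (`tfidf_matrix`, `encode`, `train_autoencoder`) are illustrative choices that do not come from the paper, and the semantic feature selection techniques are not reproduced here.

```python
import math
import random
from collections import Counter


def tfidf_matrix(docs):
    """Build L2-normalised TF-IDF rows for a list of tokenised documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumption) so terms found in every document keep a weight.
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0 for tok in vocab}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        row = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        rows.append([v / norm for v in row])
    return vocab, rows


def encode(w_enc, x):
    """Project one feature vector into the reduced space (linear encoder)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in w_enc]


def train_autoencoder(rows, hidden, epochs=300, lr=0.1, seed=0):
    """Train a one-hidden-layer linear autoencoder with plain SGD.

    Returns the encoder weights and the mean reconstruction error per epoch.
    """
    rng = random.Random(seed)
    dim = len(rows[0])
    w_enc = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(hidden)]
    w_dec = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(dim)]
    history = []
    for _ in range(epochs):
        total = 0.0
        for x in rows:
            h = encode(w_enc, x)
            y = [sum(w * hj for w, hj in zip(row, h)) for row in w_dec]
            err = [yi - xi for yi, xi in zip(y, x)]
            total += sum(e * e for e in err)
            # Gradients of 0.5 * ||y - x||^2, computed before any weight changes.
            grad_h = [sum(w_dec[i][j] * err[i] for i in range(dim)) for j in range(hidden)]
            for i in range(dim):
                for j in range(hidden):
                    w_dec[i][j] -= lr * err[i] * h[j]
            for j in range(hidden):
                for k in range(dim):
                    w_enc[j][k] -= lr * grad_h[j] * x[k]
        history.append(total / len(rows))
    return w_enc, history


# Toy corpus invented for illustration; this is not EUR-Lex data.
docs = [
    "council regulation on agricultural import tariffs".split(),
    "commission decision on agricultural subsidies".split(),
    "directive on fisheries quotas and import controls".split(),
    "regulation on customs tariffs and import duties".split(),
    "decision on state aid and regional subsidies".split(),
    "directive on environmental protection and water quality".split(),
]
vocab, rows = tfidf_matrix(docs)
hidden = max(1, len(vocab) // 10)  # roughly a tenfold reduction
w_enc, history = train_autoencoder(rows, hidden)
codes = [encode(w_enc, x) for x in rows]  # reduced features for a classifier
print(len(vocab), "->", hidden, "features")
```

The reduced vectors in `codes` would then be passed to a multi-label classifier in place of the full TF-IDF rows. A deeper or non-linear autoencoder could be substituted without changing the surrounding pipeline.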

Corresponding author

Correspondence to Wael Alkhatib.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Alkhatib, W., Rensing, C., Silberbauer, J. (2017). Multi-label Text Classification Using Semantic Features and Dimensionality Reduction with Autoencoders. In: Gracia, J., Bond, F., McCrae, J., Buitelaar, P., Chiarcos, C., Hellmann, S. (eds) Language, Data, and Knowledge. LDK 2017. Lecture Notes in Computer Science, vol 10318. Springer, Cham. https://doi.org/10.1007/978-3-319-59888-8_32

  • DOI: https://doi.org/10.1007/978-3-319-59888-8_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59887-1

  • Online ISBN: 978-3-319-59888-8

  • eBook Packages: Computer Science, Computer Science (R0)
