
Semantic Template-based Convolutional Neural Network for Text Classification

Published: 20 November 2023

Abstract

We propose the Semantic Template-based Convolutional Neural Network (STCNN), a semantic template-based distributed representation for text categorization that imitates the perceptual behavior of human comprehension. STCNN is a highly automated approach that learns semantic templates characterizing a domain from raw text and recognizes document categories with a semantic-infused convolutional neural network, which allows a template to be partially matched through a statistical scoring scheme. Our experimental results show that STCNN effectively classifies about 140,000 Chinese news articles into predefined categories by capturing the most prominent and expressive patterns, and achieves the best performance among all compared methods for Chinese topic classification. Finally, the learned templates can be directly reused for semantic analysis tasks.
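The abstract mentions that a template may be "partially matched through a statistical scoring system" but does not spell the mechanism out here. As an illustration only, the hypothetical Python sketch below scores a token sequence against a template as the fraction of its semantic slots found in order, with a synonym map standing in for semantic generalization; the slot representation, synonym map, and scoring rule are assumptions for this sketch, not the paper's actual method:

```python
# Hypothetical sketch: partial matching of a semantic template against a
# tokenized document. A template is an ordered list of semantic slots; the
# score is the fraction of slots matched in order, so a document can match
# a template partially rather than all-or-nothing.
def template_match_score(template, tokens, synonyms):
    """Return the fraction of template slots matched, in order, in tokens."""
    matched, pos = 0, 0
    for slot in template:
        # Scan forward from the last match position for this slot
        # (or any of its listed synonyms, approximating semantic matching).
        for i in range(pos, len(tokens)):
            if tokens[i] == slot or tokens[i] in synonyms.get(slot, set()):
                matched += 1
                pos = i + 1
                break
    return matched / len(template)


# Full match: every slot is found in order (via a synonym for "match").
score_full = template_match_score(
    ["team", "wins", "match"],
    ["the", "team", "easily", "wins", "the", "final"],
    {"match": {"final"}},
)

# Partial match: only the first slot is present.
score_partial = template_match_score(["team", "wins", "match"], ["team", "loses"], {})
```

A soft score like this is what lets a downstream classifier weigh near-miss templates instead of discarding them, which is the intuition behind infusing template evidence into the convolutional network.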



• Published in

  ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 11
  November 2023
  255 pages
  ISSN: 2375-4699
  EISSN: 2375-4702
  DOI: 10.1145/3633309
  • Editor:
  • Imed Zitouni

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 20 November 2023
      • Online AM: 16 October 2023
      • Accepted: 19 September 2023
      • Revised: 1 August 2023
      • Received: 26 January 2023
Published in TALLIP Volume 22, Issue 11

      Qualifiers

      • research-article
