
An Interpretable Knowledge Representation Framework for Natural Language Processing with Cross-Domain Application

Conference paper, in: Advances in Information Retrieval (ECIR 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13980)

Abstract

Data representation plays a crucial role in natural language processing (NLP), forming the foundation for most NLP tasks. Indeed, NLP performance depends heavily on the effectiveness of the preprocessing pipeline that builds the data representation. Many representation learning frameworks, such as Word2Vec, encode input data based on the local contextual information that interconnects words. Such approaches can be computationally intensive, and their encodings are hard to explain. We here propose an interpretable representation learning framework built on the Tsetlin Machine (TM), an interpretable logic-based algorithm that has exhibited competitive performance on numerous NLP tasks. We employ the TM clauses to build a sparse propositional (Boolean) representation of natural language text, where each clause is a class-specific propositional rule that links words semantically and contextually. Through visualization, we illustrate how the resulting data representation yields semantically more distinct features, better separating the underlying classes. As a result, the subsequent classification task becomes less demanding, benefiting simple machine learning classifiers such as the Support Vector Machine (SVM). We evaluate our approach on six NLP classification tasks and twelve domain adaptation tasks. Our main finding is that our proposed technique significantly outperforms the vanilla TM in accuracy, approaching the competitive accuracy of deep neural network (DNN) baselines. Furthermore, we present a case study showing how the representations derived from our framework are interpretable. (We use an asynchronous and parallel version of the Tsetlin Machine, available at https://github.com/cair/PyTsetlinMachineCUDA.)
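To make the pipeline concrete, below is a minimal, self-contained Python sketch of the idea described in the abstract: documents are binarized into a Boolean bag-of-words, each TM clause (a conjunction of plain and negated word literals) is evaluated to one bit per document, and the resulting sparse clause-output vectors are fed to an off-the-shelf SVM. The toy documents, the two hand-written clauses, and the `clause_output` helper are hypothetical illustrations; in the actual framework the clauses are learned by a trained Tsetlin Machine (e.g., via PyTsetlinMachineCUDA), not written by hand.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Toy corpus and labels (hypothetical illustration).
docs = [
    "a great and moving film",
    "a truly great film",
    "a dull boring film",
    "boring and dull throughout",
]
labels = [1, 1, 0, 0]

# Step 1: Boolean bag-of-words -- presence/absence of each vocabulary word.
vectorizer = CountVectorizer(binary=True)
X_bool = vectorizer.fit_transform(docs).toarray()
vocab = vectorizer.vocabulary_

# Step 2: each clause is a conjunction of literals: words that must be
# present (plain literals) and words that must be absent (negated literals).
# These two rules are hand-written stand-ins for clauses a trained TM learns.
clauses = [
    {"present": ["great"], "absent": ["boring"]},  # positive-class rule
    {"present": ["dull"], "absent": ["great"]},    # negative-class rule
]

def clause_output(x, clause):
    """Evaluate one clause on one Boolean document vector (1 = clause fires)."""
    present_ok = all(w in vocab and x[vocab[w]] == 1 for w in clause["present"])
    absent_ok = all(w not in vocab or x[vocab[w]] == 0 for w in clause["absent"])
    return int(present_ok and absent_ok)

# Step 3: the clause outputs form the new, sparse Boolean representation.
X_clause = np.array([[clause_output(x, c) for c in clauses] for x in X_bool])

# Step 4: a plain SVM with default parameters on the clause-based features.
clf = SVC().fit(X_clause, labels)
print(clf.predict(X_clause))
```

Because each feature is the output of a human-readable rule, inspecting which clauses fire for a document directly explains the representation, which is the interpretability property the paper builds on.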


Notes

  1. Without loss of generality, we consider only one of the classes, thereby simplifying the notation. Any TM class is modeled and processed in the same way.

  2. Classification is done using SVM from scikit-learn with default parameters (sketched after these notes).

  3. We use the default scikit-learn parameters for PCA and LDA for feature compression.

  4. For the BERT representation, the pretrained "BERT Base Uncased" model is fine-tuned for 3 epochs, and the hidden states from the 11th layer are visualized.
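As a rough sketch of the evaluation setup in notes 2-4, the snippet below runs a default-parameter scikit-learn SVM on the clause features, compresses them with PCA and LDA for 2-D visualization, and extracts the 11th-layer hidden states from a Hugging Face "bert-base-uncased" model. It assumes `X_clause`, `labels`, and `docs` from the sketch after the abstract; the 3-epoch BERT fine-tuning loop is omitted, so the pretrained weights stand in for the fine-tuned model.

```python
import torch
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from transformers import AutoModel, AutoTokenizer

# Note 2: SVM with scikit-learn defaults on the clause-based representation.
clf = SVC().fit(X_clause, labels)

# Note 3: PCA and LDA to compress the features before plotting
# (n_components=2 is chosen here purely to obtain a 2-D visualization;
# with C classes, LDA yields at most C - 1 components).
X_pca = PCA(n_components=2).fit_transform(X_clause)
X_lda = LinearDiscriminantAnalysis().fit_transform(X_clause, labels)

# Note 4: hidden states from the 11th encoder layer of BERT Base Uncased
# (hidden_states[0] is the embedding layer, so index 11 is the 11th layer).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
enc = tokenizer(docs, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    out = model(**enc)
h11 = out.hidden_states[11].mean(dim=1)  # mean-pooled over tokens, for simplicity
```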


Author information


Correspondence to Bimal Bhattarai.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Bhattarai, B., Granmo, O.C., Jiao, L. (2023). An Interpretable Knowledge Representation Framework for Natural Language Processing with Cross-Domain Application. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13980. Springer, Cham. https://doi.org/10.1007/978-3-031-28244-7_11


  • DOI: https://doi.org/10.1007/978-3-031-28244-7_11


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-28243-0

  • Online ISBN: 978-3-031-28244-7

  • eBook Packages: Computer Science, Computer Science (R0)
