Acquisition of Domain-Specific Senses and Its Extrinsic Evaluation Through Text Categorization

Wangpoonsarp, Attaporn; Shimura, Kazuya; Fukumoto, Fumiyo

doi:10.1007/978-3-031-24340-0_34

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13452)

Included in the following conference series:

International Conference on Computational Linguistics and Intelligent Text Processing

Abstract

This paper focuses on domain-specific senses and proposes a method for detecting the predominant sense of a word in each domain. We apply a simple Markov Random Walk (MRW) model to rank senses for each domain. The model determines the importance of each vertex (a sense) within a graph from the similarity between senses, which is computed from distributed representations of the words in the thesaurus gloss texts. Because these representations capture broad semantic context, the method does not require manually annotated sense-tagged data. To evaluate the method extrinsically, we applied the resulting domain-specific senses to text categorization. Results obtained with WordNet 3.1 and the Reuters corpus demonstrate the method's applicability to the text categorization task.
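
The ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact MRW formulation, so the damping factor, the use of cosine similarity, the function names, and the toy three-dimensional "gloss vectors" below are all assumptions made for illustration. Senses are vertices, edge weights are similarities between gloss vectors, and a random walk with uniform restart assigns each sense an importance score.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_senses(gloss_vectors, damping=0.85, tol=1e-10, max_iter=200):
    """Rank senses by a random walk over a sense-similarity graph.

    gloss_vectors maps a sense label to a vector standing in for the
    distributed representation of its gloss (hypothetical input format).
    damping is an assumed restart parameter, not a value from the paper.
    """
    names = list(gloss_vectors)
    n = len(names)

    # Edge weights: cosine similarity clipped at zero, no self-loops.
    sim = [
        [
            0.0 if i == j else max(0.0, cosine(gloss_vectors[a], gloss_vectors[b]))
            for j, b in enumerate(names)
        ]
        for i, a in enumerate(names)
    ]

    # Row-normalise into transition probabilities; an isolated sense
    # jumps uniformly to any vertex so every row remains stochastic.
    trans = []
    for row in sim:
        total = sum(row)
        trans.append([w / total for w in row] if total > 0 else [1.0 / n] * n)

    # Power iteration until the score vector stops changing.
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new_scores = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break

    return sorted(zip(names, scores), key=lambda pair: -pair[1])


# Toy candidate senses for a hypothetical "finance" domain.
gloss_vectors = {
    "bank#finance": [0.90, 0.10, 0.00],
    "interest#finance": [0.80, 0.20, 0.10],
    "stock#finance": [0.85, 0.15, 0.05],
    "bank#river": [0.00, 0.20, 0.90],
}

ranking = rank_senses(gloss_vectors)
for sense, score in ranking:
    print(f"{sense}\t{score:.4f}")
```

In this toy graph the three finance-related senses reinforce one another through high mutual similarity, so the river sense of "bank" ends up with the lowest score. A real setup would replace the toy vectors with embeddings derived from thesaurus gloss texts and build one graph per domain.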

Author information

Authors and Affiliations

Integrated Graduate School of Medicine, Engineering, and Agricultural Sciences, University of Yamanashi, Kofu, 400-8511, Japan
Attaporn Wangpoonsarp & Kazuya Shimura
Interdisciplinary Graduate School, University of Yamanashi, Kofu, 400-8511, Japan
Fumiyo Fukumoto

Corresponding authors

Correspondence to Attaporn Wangpoonsarp or Fumiyo Fukumoto.

Editor information

Editors and Affiliations

Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Wangpoonsarp, A., Shimura, K., Fukumoto, F. (2023). Acquisition of Domain-Specific Senses and Its Extrinsic Evaluation Through Text Categorization. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_34

DOI: https://doi.org/10.1007/978-3-031-24340-0_34
Published: 26 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24339-4
Online ISBN: 978-3-031-24340-0
eBook Packages: Computer Science, Computer Science (R0)
