Abstract
This paper focuses on domain-specific senses and proposes a method for detecting the predominant sense of a word in each domain. We apply a simple Markov Random Walk (MRW) model to rank senses for each domain. The model decides the importance of a vertex (a sense) within a graph by using the similarity between senses. This similarity is obtained from distributed representations of the words in the thesaurus gloss texts. The method captures large semantic context and thus does not require manually sense-tagged data. To evaluate the method, we applied the acquired domain-specific senses to text categorization. The performance achieved on our test data, WordNet 3.1 and the Reuters corpus, demonstrates the method's applicability to the text categorization task.
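The abstract does not give the exact MRW formulation, so the following is only a minimal sketch of the general idea, assuming a PageRank-style random walk with a damping factor over a graph whose edge weights are cosine similarities between gloss vectors. The function names, the damping value, and the toy vectors are all hypothetical; in the paper the vectors would come from distributed word representations of the gloss texts.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_senses(vectors, damping=0.85, tol=1e-9, max_iter=1000):
    """Rank senses by a random walk over their similarity graph.

    `vectors` maps a sense label to its (assumed) gloss embedding.
    The damping factor is an assumption, not a value from the paper.
    """
    names = list(vectors)
    n = len(names)
    # Edge weights: cosine similarity, no self-loops, negatives clipped to 0.
    sim = [
        [0.0 if i == j else max(0.0, cosine(vectors[a], vectors[b]))
         for j, b in enumerate(names)]
        for i, a in enumerate(names)
    ]
    # Row-normalise to obtain transition probabilities.
    trans = []
    for row in sim:
        total = sum(row)
        trans.append([w / total for w in row] if total > 0 else [1.0 / n] * n)
    # Power iteration until the scores stop changing.
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new = [
            (1 - damping) / n
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return sorted(zip(names, scores), key=lambda kv: kv[1], reverse=True)


# Made-up gloss vectors for candidate senses in a hypothetical finance domain.
toy = {
    "bank#finance": [0.9, 0.1, 0.0],
    "loan#finance": [0.8, 0.2, 0.1],
    "interest#finance": [0.85, 0.15, 0.05],
    "bank#river": [0.0, 0.2, 0.9],
}
ranking = rank_senses(toy)
for sense, score in ranking:
    print(f"{sense}\t{score:.3f}")
```

In this toy graph the river sense of "bank" is only weakly similar to the other vertices, so it receives the lowest score, which mirrors the intuition that senses well connected to a domain's other senses are ranked as predominant for that domain.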
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Wangpoonsarp, A., Shimura, K., Fukumoto, F. (2023). Acquisition of Domain-Specific Senses and Its Extrinsic Evaluation Through Text Categorization. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_34
DOI: https://doi.org/10.1007/978-3-031-24340-0_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24339-4
Online ISBN: 978-3-031-24340-0
eBook Packages: Computer Science (R0)