Abstract
A measure of term relevancy is important in applications such as Web search. The co-occurrence probability of terms in a database is a simple way to express term relevancy, but it suffers because each term has a different tendency to co-occur with other terms. In this paper, we propose a new measure of term relevancy: the ratio of actual to predicted co-occurrence (RAP). For each query, we construct a model that predicts co-occurrence as a piecewise linear approximation.
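The abstract does not spell out the model, so the following is only a minimal sketch of how a ratio of actual to predicted co-occurrence might be computed. It assumes that the predictor maps a candidate term's document frequency to its expected co-occurrence count with the query, and that the piecewise lines are least-squares fits over equal-size chunks of the sorted data. Both choices, and all function names (`fit_line`, `fit_piecewise`, `predict`, `rap`), are illustrative assumptions rather than the authors' stated method.

```python
from bisect import bisect_right


def fit_line(xs, ys):
    """Least-squares line y = a*x + b through the given points."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # All x values are equal: fall back to a flat line at the mean of y.
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def fit_piecewise(points, n_segments=2):
    """Fit one line per segment (assumption: equal-size chunks of sorted x).

    points: (document_frequency, cooccurrence_with_query) pairs for one query.
    Returns (breaks, lines): segment start values after the first, and one
    (slope, intercept) pair per segment.
    """
    pts = sorted(points)
    size = -(-len(pts) // n_segments)  # ceiling division
    starts, lines = [], []
    for i in range(0, len(pts), size):
        chunk = pts[i:i + size]
        lines.append(fit_line([p[0] for p in chunk], [p[1] for p in chunk]))
        starts.append(chunk[0][0])
    return starts[1:], lines


def predict(model, x):
    """Predicted co-occurrence for a term whose document frequency is x."""
    breaks, lines = model
    a, b = lines[bisect_right(breaks, x)]
    return a * x + b


def rap(actual, predicted):
    """Ratio of actual to predicted co-occurrence; None if undefined."""
    return actual / predicted if predicted > 0 else None


# Hypothetical data for one query: (document frequency, co-occurrence count).
observed = [(10, 1), (20, 2), (30, 3), (40, 4),
            (100, 6), (200, 8), (300, 10), (400, 12)]
model = fit_piecewise(observed, n_segments=2)
score = rap(4, predict(model, 20))  # term seen 4 times with the query
```

Under these assumptions, a score above 1 means the term co-occurs with the query more often than its overall frequency would predict, and a score below 1 means less often.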
Notes
1. Hatena Bookmark, http://b.hatena.ne.jp/.
2. MeCab: Yet Another Part-of-Speech and Morphological Analyzer, http://taku910.github.io/mecab/.
Acknowledgements
This work was partially supported by JSPS KAKENHI Grant Number JP17K00429.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Koyama, Y., Yumoto, T., Isokawa, T., Kamiura, N. (2019). Measuring Term Relevancy Based on Actual and Predicted Co-occurrence. In: Lee, S., Ismail, R., Choo, H. (eds) Proceedings of the 13th International Conference on Ubiquitous Information Management and Communication (IMCOM) 2019. IMCOM 2019. Advances in Intelligent Systems and Computing, vol 935. Springer, Cham. https://doi.org/10.1007/978-3-030-19063-7_78
DOI: https://doi.org/10.1007/978-3-030-19063-7_78
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-19062-0
Online ISBN: 978-3-030-19063-7
eBook Packages: Intelligent Technologies and Robotics (R0)