Abstract
In this paper, we propose that document sets contain two types of description: drift descriptions, which record actions on diachronic objects (objects that can be regarded as the same over time), and diversity descriptions, which record actions on different objects. This research identifies diachronic objects in order to extract the subset of documents that are drift descriptions. We assume that a diachronic object is mentioned in similar ways but appears with different distributions over time. Based on this assumption, we propose a method that finds words representing diachronic objects by their similar mentions, and we apply it to three different document sets. The results show that it is possible to extract the objects of drift descriptions from documents.
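The assumption above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the time-distribution measure, or any thresholds. The code shows one possible reading, in which two words are candidate names for the same diachronic object when they co-occur with similar words but appear in different periods. The function names, the cosine and total-variation measures, the thresholds, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two co-occurrence count vectors."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def time_distance(p: Counter, q: Counter) -> float:
    """Total variation distance between two normalized period histograms."""
    tp, tq = sum(p.values()), sum(q.values())
    if not tp or not tq:
        return 0.0
    return 0.5 * sum(abs(p[t] / tp - q[t] / tq) for t in set(p) | set(q))


def profile(docs, word):
    """Collect the co-occurring words and the period counts for one word."""
    context, times = Counter(), Counter()
    for period, tokens in docs:
        if word in tokens:
            times[period] += 1
            context.update(t for t in tokens if t != word)
    return context, times


def candidate_pairs(docs, vocab, min_sim=0.5, min_dist=0.5):
    """Word pairs mentioned similarly (high cosine) at different times.

    The thresholds are hypothetical values chosen for the toy data only.
    """
    profiles = {w: profile(docs, w) for w in vocab}
    words = sorted(vocab)
    pairs = []
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            sim = cosine(profiles[w1][0], profiles[w2][0])
            dist = time_distance(profiles[w1][1], profiles[w2][1])
            if sim >= min_sim and dist >= min_dist:
                pairs.append((w1, w2, sim, dist))
    return pairs


# Toy documents as (period, tokens); invented purely for illustration.
docs = [
    (2001, ["protocol", "ratify", "emission", "target"]),
    (2002, ["protocol", "ratify", "emission"]),
    (2010, ["accord", "ratify", "emission", "target"]),
    (2011, ["accord", "ratify", "emission"]),
    (2001, ["budget", "meeting"]),
    (2010, ["budget", "meeting"]),
]
pairs = candidate_pairs(docs, {"protocol", "accord", "budget"})
for w1, w2, sim, dist in pairs:
    print(w1, w2, round(sim, 2), round(dist, 2))
```

In this toy data, "protocol" and "accord" share the same co-occurring words but never appear in the same period, so they are the only pair reported. "budget" has unrelated co-occurring words and is excluded.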
Acknowledgment
This work was supported by JSPS KAKENHI Grant Number JP16K00702.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Tanaka, K., Hori, K. (2019). Finding Diachronic Objects of Drifting Descriptions by Similar Mentions. In: Ohara, K., Bai, Q. (eds) Knowledge Management and Acquisition for Intelligent Systems. PKAW 2019. Lecture Notes in Computer Science, vol 11669. Springer, Cham. https://doi.org/10.1007/978-3-030-30639-7_4
DOI: https://doi.org/10.1007/978-3-030-30639-7_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30638-0
Online ISBN: 978-3-030-30639-7
eBook Packages: Computer Science, Computer Science (R0)