Abstract
With advances in technology and culture, our language changes. We invent new words, add or change meanings of existing words, and rename existing things. Unfortunately, our language does not carry a memory; words, expressions and meanings used in the past are forgotten over time. These language changes pose a great challenge when searching and interpreting content from archives. In this paper, we present results of automatic word sense change detection and show its utility for archive users as well as for digital humanities research. Our method is able to capture changes that relate to the usage and culture of a word and that cannot easily be found using dictionaries or other resources.
Acknowledgments
This work has been funded in part by the project “Towards a knowledge-based culturomics”, supported by a framework grant from the Swedish Research Council (2012–2016; dnr 2012-5738). This work is also funded in part by the European Research Council under Alexandria (ERC 339233) and the European Community’s H2020 Program under SoBigData (RIA 654024). We would like to thank Times Newspapers Limited for providing the archive of The Times for our research.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Tahmasebi, N., Risse, T. (2017). On the Uses of Word Sense Change for Research in the Digital Humanities. In: Kamps, J., Tsakonas, G., Manolopoulos, Y., Iliadis, L., Karydis, I. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2017. Lecture Notes in Computer Science, vol 10450. Springer, Cham. https://doi.org/10.1007/978-3-319-67008-9_20
DOI: https://doi.org/10.1007/978-3-319-67008-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67007-2
Online ISBN: 978-3-319-67008-9
eBook Packages: Computer Science, Computer Science (R0)