Abstract
Because archives contain documents that span long periods of time, the language used to write these documents can differ from the language used to query the archive. This difference stems from evolution in both terminology and semantics, and it causes a significant number of relevant documents to be omitted from search results. A static solution is to use query expansion based on explicit knowledge banks such as thesauri or ontologies. However, as we become able to archive resources with increasingly varied terminology, relying only on explicit knowledge for this purpose becomes infeasible. Few or no thesauri cover highly domain-specific terminology or the slang used in, for example, blogs. In this Ph.D. thesis we focus on detecting terminology evolution automatically and in a completely unsupervised manner, as described in this technical paper.
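The static solution mentioned in the abstract, query expansion from an explicit knowledge bank, can be illustrated with a minimal sketch. The thesaurus contents, the function name `expand_query`, and the lowercase matching are assumptions made for illustration only; they are not taken from the paper and do not represent the unsupervised method the thesis proposes.

```python
from typing import Dict, List, Set

# Hypothetical hand-built thesaurus: each modern term maps to older or
# variant terms that may appear in archived documents. Illustrative only.
THESAURUS: Dict[str, Set[str]] = {
    "airplane": {"aeroplane", "flying machine"},
    "radio": {"wireless"},
}


def expand_query(query_terms: List[str], thesaurus: Dict[str, Set[str]]) -> List[str]:
    """Return the query terms plus any thesaurus variants, without duplicates.

    Terms are lowercased before lookup. A term that is missing from the
    thesaurus is kept as-is, which is exactly the coverage gap the
    abstract points out for domain-specific terminology and slang.
    """
    expanded: List[str] = []
    seen: Set[str] = set()
    for term in query_terms:
        key = term.lower()
        # Original term first, then its variants in a stable (sorted) order.
        for candidate in [key, *sorted(thesaurus.get(key, ()))]:
            if candidate not in seen:
                seen.add(candidate)
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_query(["Airplane", "history"], THESAURUS))
```

In this sketch a term such as "history" passes through unchanged because the thesaurus has no entry for it. The same happens for any term the knowledge bank does not cover, which is why a purely explicit approach is limited.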
This work is partly funded by the European Commission under LiWA (IST 216267).
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tahmasebi, N. (2009). Automatic Detection of Terminology Evolution. In: Meersman, R., Herrero, P., Dillon, T. (eds) On the Move to Meaningful Internet Systems: OTM 2009 Workshops. OTM 2009. Lecture Notes in Computer Science, vol 5872. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05290-3_93
DOI: https://doi.org/10.1007/978-3-642-05290-3_93
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05289-7
Online ISBN: 978-3-642-05290-3
eBook Packages: Computer Science (R0)