Abstract
Digital libraries build on classifying contents by capturing their semantics and (optionally) aligning the description with an underlying categorization scheme. This process usually relies on human intervention, either by the content creator or by a curator, which makes it highly time-consuming and thus expensive. To support humans in data curation, we introduce an annotation tagging system called “AnnoTag”. AnnoTag aims at providing concise content annotations by employing entity-level analytics to derive semantic descriptions in the form of tags. In particular, we generate “Semantic LOD Tags” (LOD: linked open data) that allow the derived tags to be interlinked with the LOD cloud. Based on a qualitative evaluation on Web news articles, we demonstrate the viability of our approach and the high quality of the automatically extracted information.
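To make the idea of deriving tags from entity-level information more concrete, the following is a minimal sketch and not the AnnoTag implementation. It assumes that entities have already been recognized and disambiguated, and that each entity carries a set of type identifiers. The `Entity` class, the `derive_lod_tags` function, the frequency-based ranking, and the `example.org` URIs are all hypothetical and chosen for illustration only.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A disambiguated entity mention (hypothetical structure)."""
    uri: str          # identifier of the entity itself
    type_uris: tuple  # identifiers of the types assigned to the entity


def derive_lod_tags(entities, top_k=3):
    """Return the top_k most frequent type URIs as candidate tags.

    Assumption: plain frequency ranking stands in for whatever
    entity-level analytics a real system would apply.
    """
    counts = Counter()
    for entity in entities:
        # Count every type at most once per entity mention.
        counts.update(set(entity.type_uris))
    # Sort by descending count, then by URI so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [uri for uri, _ in ranked[:top_k]]


# Illustrative input with placeholder URIs (not taken from the paper).
sample = [
    Entity("http://example.org/entity/A",
           ("http://example.org/type/Politician",
            "http://example.org/type/Person")),
    Entity("http://example.org/entity/B",
           ("http://example.org/type/Politician",)),
    Entity("http://example.org/entity/C",
           ("http://example.org/type/City",)),
]

tags = derive_lod_tags(sample, top_k=2)
```

Because each tag in this sketch is a URI rather than a free-text label, it could in principle be dereferenced or joined with other linked data sets, which is the property the abstract attributes to “Semantic LOD Tags”.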
Notes
- 1. LibraryThing Tags https://blog.librarything.com/main/category/tags/.
- 2.
- 3. AnnoTag Website https://spaniol.users.greyc.fr/research/AnnoTag/.
- 4. Harvard Dataverse News Articles https://doi.org/10.7910/DVN/GMFCTR.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Kumar, A., Spaniol, M. (2021). AnnoTag: Concise Content Annotation via LOD Tags derived from Entity-Level Analytics. In: Berget, G., Hall, M.M., Brenn, D., Kumpulainen, S. (eds) Linking Theory and Practice of Digital Libraries. TPDL 2021. Lecture Notes in Computer Science, vol 12866. Springer, Cham. https://doi.org/10.1007/978-3-030-86324-1_21
DOI: https://doi.org/10.1007/978-3-030-86324-1_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86323-4
Online ISBN: 978-3-030-86324-1
eBook Packages: Computer Science, Computer Science (R0)