Towards Querying Multimodal Annotations Using Graphs

  • Conference paper
Research and Education in Urban History in the Age of Digital Libraries (UHDL 2023)

Abstract

Photographs and 3D reconstructions of buildings, together with textual information and documents, play an important role in art history and architectural studies when investigating architecture, the construction history of buildings, and the impact these constructions had on a city. Advanced tools can enhance and support research workflows and source criticism by linking corresponding materials and annotations so that relevant data can be quickly queried and identified. Images are a primary source in the 3D reconstruction process. Additional photographs of buildings that were not part of the initial structure-from-motion (SfM) process can also be spatialized, which enables annotations to be linked between these photographs and the respective 3D model. In contrast, identifying and locating annotations in text sources requires a different approach because of their more abstract nature. This paper presents concepts for automatically linking texts and their annotations to corresponding images, as well as to 3D models and their annotations. Controlled vocabularies for architectural elements and a graph representation are used to reduce ambiguity when querying related instances.
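
As a rough illustration of the idea in the abstract, the following minimal sketch links annotations from different modalities through shared controlled-vocabulary concepts and queries the related instances. It is not the authors' implementation: the class names, the modality labels, the source identifiers, and the concept IDs such as `vocab:portal` are all assumptions made up for this example, and an in-memory dictionary stands in for whatever graph store is actually used.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotation on a source; all field values below are hypothetical."""
    ann_id: str
    modality: str   # assumed labels: "text", "image", or "model3d"
    source: str     # hypothetical identifier of the annotated source
    concept: str    # hypothetical controlled-vocabulary concept ID


class AnnotationGraph:
    """Bipartite graph: annotation nodes connected to vocabulary concept nodes."""

    def __init__(self):
        self._annotations = {}
        self._by_concept = defaultdict(set)

    def add(self, ann):
        # Register the annotation and connect it to its concept node.
        self._annotations[ann.ann_id] = ann
        self._by_concept[ann.concept].add(ann.ann_id)

    def related(self, ann_id, modality=None):
        """Return annotations that share a concept with ann_id.

        The optional modality argument restricts the result to one kind of source.
        """
        start = self._annotations[ann_id]
        others = (
            self._annotations[i]
            for i in self._by_concept[start.concept]
            if i != ann_id
        )
        return sorted(
            (a for a in others if modality is None or a.modality == modality),
            key=lambda a: a.ann_id,
        )


graph = AnnotationGraph()
graph.add(Annotation("t1", "text", "text-001", "vocab:portal"))
graph.add(Annotation("i1", "image", "photo-017", "vocab:portal"))
graph.add(Annotation("m1", "model3d", "model-A", "vocab:portal"))
graph.add(Annotation("i2", "image", "photo-018", "vocab:balustrade"))

print([a.ann_id for a in graph.related("t1")])           # ['i1', 'm1']
print([a.ann_id for a in graph.related("t1", "image")])  # ['i1']
```

Routing every link through a concept identifier rather than a free-text label is what reduces ambiguity in this sketch: two annotations are related only if they point to the same vocabulary entry, regardless of how the element is named in each source.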

Change history

  • 02 September 2023

    A correction has been published.

Acknowledgments

The work presented in this paper has been funded by the German Federal Ministry of Education and Research (BMBF) as part of the research project “HistKI”, grant identifier 01UG2120.

Author information

Corresponding author

Correspondence to Jonas Bruschke.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Bruschke, J., Kröber, C., Utescher, R., Niebling, F. (2023). Towards Querying Multimodal Annotations Using Graphs. In: Münster, S., Pattee, A., Kröber, C., Niebling, F. (eds) Research and Education in Urban History in the Age of Digital Libraries. UHDL 2023. Communications in Computer and Information Science, vol 1853. Springer, Cham. https://doi.org/10.1007/978-3-031-38871-2_5

  • DOI: https://doi.org/10.1007/978-3-031-38871-2_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-38870-5

  • Online ISBN: 978-3-031-38871-2

  • eBook Packages: Computer Science, Computer Science (R0)
