Abstract
Recent advances in the scalability and performance of Natural Language Processing, together with the wide availability of background knowledge in the Semantic Web and the Linked Open Data initiative, encourage researchers to take a further step towards machines capable of understanding multimedia documents by exploiting background knowledge. Pursuing this direction requires maintaining a clear link between knowledge and the documents containing it. This is achieved in the KnowledgeStore, a scalable content management system that supports the tight integration and storage of multimedia resources together with background and extracted knowledge. Integration is done by (i) identifying mentions of named entities in multimedia resources, (ii) establishing mention coreference, and either (iii) linking mentions to entities in the background knowledge or (iv) extending that knowledge with new entities. We present the KnowledgeStore and describe its use in creating a large-scale repository of knowledge and multimedia resources in the Italian Trentino region, whose interlinking allows us to explore advanced tasks such as entity-based search and semantic enrichment.
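As a rough illustration of integration steps (ii) to (iv), the following Python sketch groups mentions, links each group to a background entity when one matches, and otherwise extends the background knowledge with a new entity. It is a toy under stated assumptions, not the KnowledgeStore API: the names `Mention`, `Entity`, `cluster_mentions`, `integrate` and the `urn:example:` URIs are hypothetical, mentions are assumed to have been identified already (step i), and exact matching on normalized names stands in for real coreference resolution and entity linking.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mention:
    resource_id: str  # multimedia resource in which the mention occurs
    surface: str      # text of the mention as it appears in the resource


@dataclass
class Entity:
    uri: str
    names: set
    mentions: list = field(default_factory=list)


def normalize(surface):
    """Lowercase and collapse whitespace so trivially different forms match."""
    return " ".join(surface.lower().split())


def cluster_mentions(mentions):
    """Step (ii), simplified: mentions with the same normalized form corefer."""
    clusters = {}
    for mention in mentions:
        clusters.setdefault(normalize(mention.surface), []).append(mention)
    return clusters


def integrate(mentions, background):
    """Steps (iii)/(iv): link each cluster to a known entity or create one."""
    by_name = {normalize(n): e for e in background.values() for n in e.names}
    for key, cluster in cluster_mentions(mentions).items():
        entity = by_name.get(key)
        if entity is None:
            # Step (iv): no match, so extend the background knowledge.
            entity = Entity(uri="urn:example:new:" + key.replace(" ", "_"),
                            names={key})
            background[entity.uri] = entity
            by_name[key] = entity
        # Step (iii): keep the link between the entity and its mentions.
        entity.mentions.extend(cluster)
    return background


# Hypothetical data: one known entity, three extracted mentions.
background = {
    "urn:example:bk:Trento": Entity("urn:example:bk:Trento", {"Trento"}),
}
mentions = [
    Mention("news-1", "Trento"),
    Mention("video-7", "trento"),
    Mention("news-2", "Mario Rossi"),
]
kb = integrate(mentions, background)
```

In this sketch each entity keeps the list of mentions that refer to it, so the link between knowledge and the resources containing it is preserved in both directions.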
Notes
- 1. LiveMemories (Active Digital Memories of Collective Life, http://www.livememories.org).
- 12. Subject, predicate and object are the standard terms denoting the components of a triple in the Semantic Web literature: although they are named after the components of a natural language sentence, they convey no linguistic semantics.
Acknowledgements
This work was supported by the LiveMemories project (Active Digital Memories of Collective Life), funded by the Autonomous Province of Trento (Italy).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Cattoni, R., Corcoglioniti, F., Girardi, C., Magnini, B., Serafini, L., Zanoli, R. (2013). Anchoring Background Knowledge to Rich Multimedia Contexts in the KnowledgeStore. In: Oltramari, A., Vossen, P., Qin, L., Hovy, E. (eds) New Trends of Research in Ontologies and Lexical Resources. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31782-8_6
DOI: https://doi.org/10.1007/978-3-642-31782-8_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31781-1
Online ISBN: 978-3-642-31782-8
eBook Packages: Computer Science (R0)