Anchoring Background Knowledge to Rich Multimedia Contexts in the KnowledgeStore

Cattoni, R.; Corcoglioniti, F.; Girardi, C.; Magnini, B.; Serafini, L.; Zanoli, R.

doi:10.1007/978-3-642-31782-8_6

Part of the book series: Theory and Applications of Natural Language Processing (NLP)

Abstract

Recent advances in the scalability and performance of Natural Language Processing, together with the wide availability of background knowledge in the Semantic Web and the Linked Open Data initiative, encourage researchers to take a further step towards machines capable of understanding multimedia documents by exploiting background knowledge. Pursuing this direction requires maintaining a clear link between knowledge and the documents that contain it. This is achieved in the KnowledgeStore, a scalable content management system that supports the tightly integrated storage of multimedia resources together with background and extracted knowledge. Integration is done by (i) identifying mentions of named entities in multimedia resources, (ii) establishing mention coreference, and either (iii) linking mentions to entities in the background knowledge or (iv) extending that knowledge with new entities. We present the KnowledgeStore and describe its use in creating a large-scale repository of knowledge and multimedia resources in the Italian Trentino region, whose interlinking allows us to explore advanced tasks such as entity-based search and semantic enrichment.
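
The four integration steps listed in the abstract can be illustrated with a minimal sketch. It is not the KnowledgeStore implementation: the `Mention` class, the name-normalization stand-in for coreference, the dictionary used as background knowledge, and the `new:` identifier scheme are all assumptions made for illustration, and step (i) is taken as already done, so mentions arrive as input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A named-entity mention found in a resource (hypothetical structure)."""
    resource_id: str
    surface: str


def normalize(surface: str) -> str:
    """Toy coreference key: lower-cased name with whitespace collapsed."""
    return " ".join(surface.lower().split())


def cluster_mentions(mentions):
    """Step (ii): group mentions that share the same normalized surface form."""
    clusters = {}
    for m in mentions:
        clusters.setdefault(normalize(m.surface), []).append(m)
    return clusters


def integrate(mentions, background):
    """Steps (iii)/(iv): link each cluster to a known entity or add a new one.

    `background` maps normalized names to entity identifiers (an assumption
    of this sketch) and is updated in place when an entity is missing.
    """
    links = {}
    for key, cluster in cluster_mentions(mentions).items():
        if key not in background:
            # Step (iv): extend the background knowledge with a new entity.
            background[key] = f"new:{key.replace(' ', '_')}"
        # Step (iii): link every mention in the cluster to that entity.
        for m in cluster:
            links[m] = background[key]
    return links
```

Called with mentions drawn from different resources, `integrate` returns one entity identifier per mention, so every piece of knowledge about an entity stays traceable to the resources that mention it.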

Notes

1. LiveMemories (Active Digital Memories of Collective Life, http://www.livememories.org).
2. http://linkeddata.org/
3. http://www.opencalais.com/
4. http://gate.ac.uk/
5. http://www.cnts.ua.ac.be/conll2002/ner/
6. http://www.itl.nist.gov/iad/mig/tests/ace/
7. http://uima.apache.org/
8. http://nlp2rdf.org/nif-1-0
9. http://dublincore.org/documents/dces/
10. http://www.zemanta.com
11. http://www.alchemyapi.com
12. Subject, predicate and object are the standard terms denoting the components of a triple in the Semantic Web literature: although they are named after the components of a natural language sentence, they convey no linguistic semantics.
13. http://hadoop.apache.org
14. http://hbase.apache.org
15. http://tomcat.apache.org
16. http://www.hbql.com
17. http://www.evalita.it/2011/tasks/NePS
18. http://www.geonames.org/
19. http://code.google.com/apis/maps/
20. http://thewikimachine.fbk.eu

Acknowledgements

This work was supported by the LiveMemories project (Active Digital Memories of Collective Life), funded by the Autonomous Province of Trento (Italy).

Author information

Authors and Affiliations

Fondazione Bruno Kessler, Via Sommarive 18, 38123, Trento, Italy
R. Cattoni, F. Corcoglioniti, C. Girardi, B. Magnini, L. Serafini & R. Zanoli
DISI - University of Trento, Via Sommarive 14, 38123, Trento, Italy
F. Corcoglioniti

Corresponding author

Correspondence to R. Cattoni.

Editor information

Editors and Affiliations

Psychology Department, Carnegie Mellon University, Forbes Avenue 5000, Pittsburgh, 15213, Pennsylvania, USA
Alessandro Oltramari
Faculty of Arts, Language, Cognition and, Vrije University Amsterdam, De Boelelaan 1105, Amsterdam, 1081, Netherlands
Piek Vossen
Department of Computing, Hong Kong Polytechnic University, PQ806 Mong Man Wai Building, Hong Kong, 999077, Hong Kong SAR
Lu Qin
Language Technologies Institute, Carnegie Mellon University, Forbes Avenue 5000, Marina Del Rey, 15213, Pennsylvania, USA
Eduard Hovy

About this chapter

Cite this chapter

Cattoni, R., Corcoglioniti, F., Girardi, C., Magnini, B., Serafini, L., Zanoli, R. (2013). Anchoring Background Knowledge to Rich Multimedia Contexts in the KnowledgeStore. In: Oltramari, A., Vossen, P., Qin, L., Hovy, E. (eds) New Trends of Research in Ontologies and Lexical Resources. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31782-8_6

DOI: https://doi.org/10.1007/978-3-642-31782-8_6
Published: 27 September 2012
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31781-1
Online ISBN: 978-3-642-31782-8
eBook Packages: Computer Science, Computer Science (R0)
