Anchoring Background Knowledge to Rich Multimedia Contexts in the KnowledgeStore

New Trends of Research in Ontologies and Lexical Resources

Abstract

Recent advances in the scalability and performance of Natural Language Processing, together with the wide availability of background knowledge in the Semantic Web and the Linked Open Data initiative, encourage researchers to take a further step towards machines capable of understanding multimedia documents by exploiting background knowledge. Pursuing this direction requires maintaining a clear link between knowledge and the documents that contain it. The KnowledgeStore achieves this: it is a scalable content management system that supports the tight integration and storage of multimedia resources together with background and extracted knowledge. Integration is done by (i) identifying mentions of named entities in multimedia resources, (ii) establishing mention coreference, and either (iii) linking mentions to entities in the background knowledge or (iv) extending that knowledge with new entities. We present the KnowledgeStore and describe its use in creating a large-scale repository of knowledge and multimedia resources in the Italian Trentino region, whose interlinking allows us to explore advanced tasks such as entity-based search and semantic enrichment.
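The integration steps (i) to (iv) can be illustrated with a minimal Python sketch. It is not the KnowledgeStore implementation or API: the `Mention` and `Entity` classes, the `ex:` URIs, and the exact-name matching used for coreference and linking are all assumptions chosen for brevity, and step (i) is assumed to have been done already by an external named entity recognizer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mention:
    """Hypothetical record for step (i): a named-entity mention inside a resource."""
    resource_id: str  # the document (or other resource) containing the mention
    start: int        # character offsets of the mention in the resource
    end: int
    text: str


@dataclass
class Entity:
    """Hypothetical background-knowledge entity, anchored to its mentions."""
    uri: str
    names: set = field(default_factory=set)
    mentions: list = field(default_factory=list)


def normalize(name):
    # Lowercase and collapse whitespace; a stand-in for real name matching.
    return " ".join(name.lower().split())


def cluster_mentions(mentions):
    """Step (ii), simplified: mentions with the same normalized text corefer."""
    clusters = {}
    for mention in mentions:
        clusters.setdefault(normalize(mention.text), []).append(mention)
    return clusters


def integrate(mentions, background):
    """Steps (iii) and (iv): link each cluster to a known entity or create one."""
    index = {normalize(n): e for e in background.values() for n in e.names}
    for key, cluster in cluster_mentions(mentions).items():
        entity = index.get(key)
        if entity is None:
            # Step (iv): extend the background knowledge with a new entity.
            entity = Entity(uri="ex:new/" + key.replace(" ", "_"),
                            names={cluster[0].text})
            background[entity.uri] = entity
            index[key] = entity
        # Step (iii): keep the link between the entity and its mentions.
        entity.mentions.extend(cluster)
    return background


# Toy data (invented for illustration only).
background = {"ex:Trento": Entity("ex:Trento", {"Trento"})}
mentions = [
    Mention("news-1", 0, 6, "Trento"),
    Mention("news-2", 10, 16, "trento"),
    Mention("news-2", 30, 41, "Mario Rossi"),
]
kb = integrate(mentions, background)
```

In this toy run the two "Trento" mentions are linked to the existing entity, while "Mario Rossi" has no match and yields a new entity. Because every entity keeps the list of its mentions, the link between knowledge and the resources that contain it is preserved.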

Notes

  1. LiveMemories (Active Digital Memories of Collective Life, http://www.livememories.org).

  2. http://linkeddata.org/

  3. http://www.opencalais.com/

  4. http://gate.ac.uk/

  5. http://www.cnts.ua.ac.be/conll2002/ner/

  6. http://www.itl.nist.gov/iad/mig/tests/ace/

  7. http://uima.apache.org/

  8. http://nlp2rdf.org/nif-1-0

  9. http://dublincore.org/documents/dces/

  10. http://www.zemanta.com

  11. http://www.alchemyapi.com

  12. Subject, predicate and object are the standard terms denoting the components of a triple in the Semantic Web literature: although they are named after the components of a natural language sentence, they convey no linguistic semantics.

  13. http://hadoop.apache.org

  14. http://hbase.apache.org

  15. http://tomcat.apache.org

  16. http://www.hbql.com

  17. http://www.evalita.it/2011/tasks/NePS

  18. http://www.geonames.org/

  19. http://code.google.com/apis/maps/

  20. http://thewikimachine.fbk.eu

Acknowledgements

This work was supported by the LiveMemories project (Active Digital Memories of Collective Life), funded by the Autonomous Province of Trento (Italy).

Author information

Corresponding author

Correspondence to R. Cattoni.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Cattoni, R., Corcoglioniti, F., Girardi, C., Magnini, B., Serafini, L., Zanoli, R. (2013). Anchoring Background Knowledge to Rich Multimedia Contexts in the KnowledgeStore. In: Oltramari, A., Vossen, P., Qin, L., Hovy, E. (eds) New Trends of Research in Ontologies and Lexical Resources. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31782-8_6

  • DOI: https://doi.org/10.1007/978-3-642-31782-8_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31781-1

  • Online ISBN: 978-3-642-31782-8

  • eBook Packages: Computer Science, Computer Science (R0)
