A Chip Off the Old Block - Extracting Typical Attributes for Entities Based on Family Resemblance

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9049)

Abstract

Google’s Knowledge Graph offers structured summaries for entity searches. This improves the user experience by focusing only on the main aspects of the query entity. To do this, however, Google relies on curated knowledge bases, so only entities included in such knowledge bases can benefit from the feature. In this paper, we propose ARES, a system that automatically discovers a manageable number of attributes well suited for high-precision entity summarization. Given any entity-centric query, and exploiting diverse facts from Web documents, ARES derives a common structure (or schema) comprising attributes typical for entities of the same or a similar entity type. To do this, we extend the concept of typicality from cognitive psychology and define a practical measure for attribute typicality. We evaluate the quality of the derived structures for various entities and entity types in terms of precision and recall. ARES achieves results superior to both Google’s Knowledge Graph and frequency-based statistical approaches for structure extraction.
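The evaluation mentioned in the abstract compares a derived attribute structure against a reference in terms of precision and recall. The following is only a minimal sketch of that kind of set-based comparison, not the authors' evaluation code. The function name `precision_recall`, the lowercase normalization, and the example attribute lists are all hypothetical and chosen purely for illustration.

```python
def precision_recall(extracted, reference):
    """Set-based precision and recall of an extracted attribute list.

    Assumption: attributes match when their lowercased, stripped
    names are identical. A real evaluation would likely need
    synonym handling, which is not modeled here.
    """
    extracted_set = {a.strip().lower() for a in extracted}
    reference_set = {a.strip().lower() for a in reference}

    # Attributes that appear in both the extracted and reference sets.
    true_positives = len(extracted_set & reference_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(reference_set) if reference_set else 0.0
    return precision, recall


# Hypothetical example: attributes extracted for a "city" entity
# versus a hand-made reference schema.
extracted = ["population", "mayor", "area", "website"]
reference = ["population", "mayor", "area", "country", "elevation"]

precision, recall = precision_recall(extracted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the four extracted attributes appear in the reference, and three of the five reference attributes are recovered, so a short, focused schema scores higher on precision than on recall.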


Corresponding author

Correspondence to Silviu Homoceanu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Homoceanu, S., Balke, WT. (2015). A Chip Off the Old Block - Extracting Typical Attributes for Entities Based on Family Resemblance. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9049. Springer, Cham. https://doi.org/10.1007/978-3-319-18120-2_29

  • DOI: https://doi.org/10.1007/978-3-319-18120-2_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18119-6

  • Online ISBN: 978-3-319-18120-2

  • eBook Packages: Computer Science, Computer Science (R0)
