A Chip Off the Old Block - Extracting Typical Attributes for Entities Based on Family Resemblance

Homoceanu, Silviu; Balke, Wolf-Tilo

doi:10.1007/978-3-319-18120-2_29

Conference paper
First Online: 01 January 2015

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9049)

Abstract

Google’s Knowledge Graph offers structured summaries for entity searches. This improves the user experience by focusing only on the main aspects of the query entity. To do this, however, Google relies on curated knowledge bases; consequently, only entities included in such knowledge bases can benefit from the feature. In this paper, we propose ARES, a system that automatically discovers a manageable number of attributes well suited for high-precision entity summarization. Given any entity-centric query, and exploiting diverse facts from Web documents, ARES derives a common structure (or schema) comprising attributes typical for entities of the same or similar entity type. To do this, we extend the concept of typicality from cognitive psychology and define a practical measure for attribute typicality. We evaluate the quality of the derived structures for various entities and entity types in terms of precision and recall. ARES achieves results superior to both Google’s Knowledge Graph and frequency-based statistical approaches for structure extraction.

Author information

Authors and Affiliations

IFIS TU Braunschweig, Mühlenpfordstraße 23, 38106, Braunschweig, Germany
Silviu Homoceanu & Wolf-Tilo Balke

Corresponding author

Correspondence to Silviu Homoceanu.

Editor information

Editors and Affiliations

Universität München, München, Germany
Matthias Renz
University of Southern California, Los Angeles, USA
Cyrus Shahabi
University of Queensland, Brisbane, Australia
Xiaofang Zhou
Monash University, Clayton, Australia
Muhammad Aamir Cheema

About this paper

Cite this paper

Homoceanu, S., Balke, WT. (2015). A Chip Off the Old Block - Extracting Typical Attributes for Entities Based on Family Resemblance. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9049. Springer, Cham. https://doi.org/10.1007/978-3-319-18120-2_29

DOI: https://doi.org/10.1007/978-3-319-18120-2_29
Published: 09 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18119-6
Online ISBN: 978-3-319-18120-2
eBook Packages: Computer Science, Computer Science (R0)
