Abstract
Google’s Knowledge Graph offers structured summaries for entity searches. These summaries improve the user experience by focusing only on the main aspects of the query entity. To produce them, however, Google relies on curated knowledge bases; consequently, only entities included in those knowledge bases benefit from the feature. In this paper, we propose ARES, a system that automatically discovers a manageable number of attributes well suited for high-precision entity summarization. Given any entity-centric query, ARES exploits diverse facts from Web documents to derive a common structure (or schema) comprising attributes typical of entities of the same or a similar entity type. To do this, we extend the concept of typicality from cognitive psychology and define a practical measure of attribute typicality. We evaluate the quality of the derived structures for various entities and entity types in terms of precision and recall. ARES achieves results superior to both Google’s Knowledge Graph and frequency-based statistical approaches to structure extraction.
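To make the evaluation setting concrete, the following minimal sketch shows a frequency-based baseline of the kind the abstract compares against, scored by precision and recall. It is not the ARES typicality measure, which the abstract does not specify. The function names, the toy facts, and the gold schema are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def frequency_baseline(entity_attributes, k):
    """Return the k attributes shared by the most entities.

    Assumption: a simple frequency baseline, NOT the ARES typicality measure.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(
        attr for attrs in entity_attributes.values() for attr in set(attrs)
    )
    ranked = sorted(counts, key=lambda attr: (-counts[attr], attr))
    return ranked[:k]


def precision_recall(extracted, gold):
    """Precision and recall of an extracted attribute set against a gold schema."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy facts for three entities of the same type.
facts = {
    "entity_a": ["born", "spouse", "party", "height"],
    "entity_b": ["born", "party", "office"],
    "entity_c": ["born", "spouse", "office", "party"],
}
# Hypothetical gold schema, e.g. one agreed on by human assessors.
gold = {"born", "party", "office", "education"}

schema = frequency_baseline(facts, k=3)
precision, recall = precision_recall(schema, gold)
print(schema, precision, recall)
```

On this toy input the baseline selects three attributes that all appear in the gold schema but misses the fourth gold attribute. A small schema can therefore score high on precision while falling short on recall.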
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Homoceanu, S., Balke, W.-T. (2015). A Chip Off the Old Block - Extracting Typical Attributes for Entities Based on Family Resemblance. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9049. Springer, Cham. https://doi.org/10.1007/978-3-319-18120-2_29
DOI: https://doi.org/10.1007/978-3-319-18120-2_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18119-6
Online ISBN: 978-3-319-18120-2
eBook Packages: Computer Science, Computer Science (R0)