DOI: 10.1145/1772690.1772833
poster

Enabling entity-based aggregators for web 2.0 data

Published: 26 April 2010

ABSTRACT

Selecting and presenting content culled from multiple heterogeneous and physically distributed sources is a challenging task. The exponential growth of web data in recent years has brought new requirements to such integration systems. Data is no longer produced by content providers alone, but also by regular users through the highly popular Web 2.0 social and semantic web applications. The abundance of available web content increased demand for it among regular users, who could no longer wait for the development of advanced integration tools. They wanted to be able to build their own specialized integration applications in a short time. Aggregators came to the rescue of these users. They allowed them not only to combine distributed content, but also to process it in ways that generate new services available for further consumption.

To cope with heterogeneous data, the Linked Data initiative aims to create and exploit correspondences across data values. In this work, although we share the vision of the Linked Data community, we argue that for the modern web, linking at the data value level is not enough. Aggregators should base their integration tasks on the concept of an entity, i.e., on identifying whether different pieces of information correspond to the same real-world entity, such as an event or a person. We describe our theory, system, and experimental results, which illustrate the effectiveness of the approach.
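
To make the distinction between value-level linking and entity-level identification concrete, the following is a minimal illustrative sketch, not the system described in this work. It assumes records are flat attribute-value dictionaries from sources with different schemas, uses Jaccard overlap of value tokens as a stand-in similarity measure, and groups matching records with a simple union-find. The function names, the 0.5 threshold, and the sample records are all hypothetical.

```python
from itertools import combinations


def tokens(record):
    """Lowercased word tokens drawn from all attribute values of a record."""
    return {word for value in record.values() for word in str(value).lower().split()}


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_entity(records, threshold=0.5):
    """Group record indices whose token overlap meets the assumed threshold."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    token_sets = [tokens(r) for r in records]
    for i, j in combinations(range(len(records)), 2):
        if jaccard(token_sets[i], token_sets[j]) >= threshold:
            parent[find(j)] = find(i)  # merge the two groups

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical records from three sources; attribute names differ per source.
records = [
    {"title": "WWW 2010 conference", "place": "Raleigh NC"},
    {"name": "www 2010", "city": "raleigh", "type": "conference"},
    {"name": "ICDE 2007", "city": "Istanbul"},
]

print(group_by_entity(records))
```

In this sketch the first two records share no attribute names and no identical attribute values, so exact value-level linking would find no correspondence between them, yet they are grouped as one entity because the comparison is made over each record as a whole.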


Published in

WWW '10: Proceedings of the 19th international conference on World wide web
April 2010
1407 pages
ISBN: 9781605587998
DOI: 10.1145/1772690

Copyright © 2010. Copyright is held by the author/owner(s).

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
