DOI: 10.1145/3459637.3481991
short-paper

HAO Unity: A Graph-based System for Unifying Heterogeneous Data

Published: 30 October 2021

ABSTRACT

Many real-world applications must cope with diversity in data formats and semantics, and handling heterogeneous data effectively remains a major challenge. With the rise of knowledge graphs, more and more applications are built on graph-like data models, which benefit from flexible schemas and convenient support for relationship queries. We propose a graph-based system for unifying heterogeneous data, which helps to (1) transform data in many other formats into graphs and, conversely, graphs into other formats, (2) integrate graph data based on HAO intelligence, achieving schema integration and entity consolidation, and (3) explore data at different levels by querying the integrated graphs. In this paper, we introduce the overall system architecture, explain the implementation in detail, and demonstrate its usage in two practical scenarios.
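The abstract does not give implementation details, so the following is only a minimal sketch of what steps (1) and (2) could look like in principle. It is not the HAO Unity implementation. The function names `rows_to_graph` and `consolidate`, the node-id scheme, and the sample records are all hypothetical, and a simple normalized-name match stands in for whatever entity consolidation method the paper actually uses.

```python
def rows_to_graph(rows, id_field, label, ref_fields=None):
    """Turn tabular records into a tiny property graph (nodes plus edges).

    ref_fields maps a column name to the label of the node it points at
    (a hypothetical convention chosen for this sketch).
    """
    ref_fields = ref_fields or {}
    nodes, edges = {}, []
    for row in rows:
        node_id = f"{label}:{row[id_field]}"
        # Non-reference columns become node properties.
        props = {k: v for k, v in row.items() if k not in ref_fields}
        nodes[node_id] = {"label": label, "props": props}
        # Reference columns become edges named after the column.
        for field, target_label in ref_fields.items():
            if row.get(field) is not None:
                edges.append((node_id, field, f"{target_label}:{row[field]}"))
    return nodes, edges


def consolidate(nodes, edges, match_key):
    """Merge same-label nodes with equal match_key(props); rewrite edges."""
    canonical = {}  # (label, match key) -> id of the surviving node
    remap = {}      # original node id -> id of the surviving node
    merged = {}
    for node_id, node in nodes.items():
        key = (node["label"], match_key(node["props"]))
        survivor = canonical.setdefault(key, node_id)
        remap[node_id] = survivor
        if survivor == node_id:
            merged[node_id] = {"label": node["label"], "props": dict(node["props"])}
        else:
            # Keep the survivor's values; only fill in properties it lacks.
            for k, v in node["props"].items():
                merged[survivor]["props"].setdefault(k, v)
    # Point edges at survivors and drop the duplicates this creates.
    new_edges = sorted({(remap.get(s, s), r, remap.get(t, t)) for s, r, t in edges})
    return merged, new_edges


# Hypothetical input: two records that describe the same person.
people = [
    {"id": 1, "name": "Ada Lovelace", "org": 10},
    {"id": 2, "name": "ada lovelace ", "org": 10, "email": "ada@example.org"},
]
nodes, edges = rows_to_graph(people, "id", "Person", {"org": "Org"})
merged, new_edges = consolidate(nodes, edges, lambda p: p["name"].strip().lower())
```

In this sketch the first record seen for each match key survives, and later duplicates only contribute properties the survivor is missing. A real system would need a more robust matcher and an explicit conflict-resolution policy.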

    • Published in

      CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
      October 2021
      4966 pages
      ISBN: 9781450384469
      DOI: 10.1145/3459637

      Copyright © 2021 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
