DOI: 10.1145/2630602.2630605
tutorial

Using Conditional Functional Dependency to Discover Abnormal Data in RDF Graphs

Published: 21 June 2014

ABSTRACT

Many data quality issues have been studied for relational data, including data consistency, deduplication, accuracy, and completeness. In this paper, we focus on discovering abnormal data in RDF graphs. As the amount of RDF data grows, data quality is becoming an important issue for the usability of RDF repositories. Although association rules have been used to find abnormal data in RDF graphs, existing solutions ignore the latent semantics of connected structures in those graphs. To detect latent dependencies in RDF graphs, we first define the Graph-based Conditional Functional Dependency (GCFD), which represents both attribute-value and semantic-structure dependencies of RDF data in a uniform manner. We then propose an efficient framework and novel pruning rules for discovering GCFDs in large RDF graphs. Extensive experiments on several real-life RDF repositories confirm the superiority of our solution.
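To make the idea of a conditional functional dependency over RDF data concrete, here is a minimal Python sketch. It is not the GCFD definition or the discovery framework from the paper, which also covers structural dependencies and pruning. It only checks one hand-written, value-based rule: among subjects that match a condition triple, the value of one predicate should determine the value of another. The function name `find_violations`, the `ex:` identifiers, and the majority-vote heuristic are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def find_violations(triples, condition, lhs, rhs):
    """Flag subjects whose rhs value disagrees with the majority for their lhs value.

    Hypothetical helper: checks the rule "if a subject has the condition
    triple, then its lhs value determines its rhs value".
    """
    cond_pred, cond_value = condition

    # Collapse triples into one record per subject.
    # Assumption: every predicate is single-valued (a later triple overwrites).
    records = defaultdict(dict)
    for s, p, o in triples:
        records[s][p] = o

    # Group the in-scope subjects by their lhs value.
    groups = defaultdict(list)
    for s, rec in records.items():
        if rec.get(cond_pred) == cond_value and lhs in rec and rhs in rec:
            groups[rec[lhs]].append(s)

    # Within each group, treat the most common rhs value as expected
    # and report every subject that deviates from it.
    violations = []
    for subjects in groups.values():
        counts = Counter(records[s][rhs] for s in subjects)
        majority, _ = counts.most_common(1)[0]
        violations.extend(s for s in subjects if records[s][rhs] != majority)
    return sorted(violations)


# Made-up example data: universities in the same city should share a country.
TRIPLES = [
    ("ex:u1", "rdf:type", "ex:University"),
    ("ex:u1", "ex:city", "ex:Paris"),
    ("ex:u1", "ex:country", "ex:France"),
    ("ex:u2", "rdf:type", "ex:University"),
    ("ex:u2", "ex:city", "ex:Paris"),
    ("ex:u2", "ex:country", "ex:France"),
    ("ex:u3", "rdf:type", "ex:University"),
    ("ex:u3", "ex:city", "ex:Paris"),
    ("ex:u3", "ex:country", "ex:Germany"),  # suspicious value
    ("ex:c1", "rdf:type", "ex:Company"),    # out of scope for the rule
    ("ex:c1", "ex:city", "ex:Paris"),
    ("ex:c1", "ex:country", "ex:USA"),
]

RESULT = find_violations(
    TRIPLES, ("rdf:type", "ex:University"), "ex:city", "ex:country"
)
print(RESULT)
```

The condition restricts the rule to universities, so the company record is never compared against them. Using the majority value as the expected one is just one simple way to decide which side of a conflict to flag.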


          Published in

            SWIM'14: Proceedings of Semantic Web Information Management on Semantic Web Information Management
            June 2014
            45 pages
            ISBN:9781450329941
            DOI:10.1145/2630602

            Copyright © 2014 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Qualifiers

            • tutorial
            • Research
            • Refereed limited
