DOI: 10.1145/2063518.2063520
research-article

Parallelizing LIMES for large-scale link discovery

Published: 07 September 2011

ABSTRACT

The Linked Open Data cloud consists of more than 26 billion triples, of which fewer than 3% are links between knowledge bases. However, such links play a central role in key tasks such as cross-ontology question answering, large-scale inferencing and link-based traversal query execution models. The sheer size of the Linked Open Data cloud makes manual linking impossible. Consequently, Link Discovery frameworks have been developed in recent years to detect links between knowledge bases automatically. Yet even the current runtime-optimized linking frameworks lead to unacceptable runtimes when presented with very large datasets. This paper addresses the time complexity of Link Discovery on very large datasets by presenting and evaluating the parallelization of the time-optimized LIMES framework by means of the MapReduce paradigm.

Published in

I-Semantics '11: Proceedings of the 7th International Conference on Semantic Systems
September 2011
129 pages
ISBN: 9781450306218
DOI: 10.1145/2063518

Copyright © 2011 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 40 of 182 submissions, 22%
