Parallelizing LIMES for large-scale link discovery

Authors:
Stanley Hillner, itemis AG, Ludwig-Erhard-Strasse, Leipzig, Germany
Axel-Cyrille Ngonga Ngomo, University of Leipzig, Johannisgasse, Leipzig, Germany

I-Semantics '11: Proceedings of the 7th International Conference on Semantic Systems, September 2011, Pages 9–16
https://doi.org/10.1145/2063518.2063520

Published: 7 September 2011

ABSTRACT

The Linked Open Data cloud consists of more than 26 billion triples, of which fewer than 3% are links between knowledge bases. Yet such links play a central role in key tasks such as cross-ontology question answering, large-scale inferencing and link-based traversal query execution models. The sheer size of the Linked Data Cloud makes manual linking impossible. Consequently, Link Discovery Frameworks have been developed in recent years to detect links between knowledge bases automatically. However, even the current runtime-optimized frameworks lead to unacceptable runtimes when presented with very large datasets. This paper addresses the time complexity of Link Discovery on very large datasets by presenting and evaluating a parallelization of the time-optimized LIMES framework based on the MapReduce paradigm.

Published in
I-Semantics '11: Proceedings of the 7th International Conference on Semantic Systems
September 2011
129 pages
ISBN: 9781450306218
DOI: 10.1145/2063518
Editors:
Chiara Ghidini, Bruno Kessler Foundation, Italy
Axel-Cyrille Ngonga Ngomo, University of Leipzig, Germany
Stefanie Lindstaedt, Know Center Graz, Austria
Tassilo Pellegrini, Semantic Web Company, Austria
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Linked Data web
link discovery
parallel computing
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 40 of 182 submissions, 22%