A Semantic Data Parallel Query Method Based on Hadoop

  • Conference paper
Web Information Systems Engineering – WISE 2016 (WISE 2016)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10041))

Abstract

To query large-scale RDF data efficiently, we designed PAQS, a parallel two-phase query strategy for large-scale RDF data based on MapReduce. PAQS consists of two stages: the SPARQL pretreatment stage and the distributed query execution stage. In the SPARQL pretreatment stage, a SPARQL query classification algorithm determines the join order of the connection (join) variables by calculating the correlation between the variables in a SPARQL query statement; the joins between SPARQL clauses are then divided into the minimum number of MapReduce jobs according to the connection variables. The distributed query execution stage runs the MapReduce jobs produced by the pretreatment stage to query large-scale RDF data in parallel. Experimental results on the LUBM benchmark indicate that PAQS can query large-scale RDF data with good efficiency, stability, and scalability.
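
The pretreatment idea described in the abstract can be illustrated with a small sketch. This is not the PAQS algorithm itself, whose correlation measure is not given here. It assumes a simple stand-in ranking (variables shared by fewer triple patterns are joined first) and a greedy grouping in which each pattern takes part in at most one join per job, so the plan is not guaranteed to be minimal. The function names `variables` and `plan_jobs` and the example query are hypothetical.

```python
from collections import defaultdict


def variables(pattern):
    """Return the variables (terms starting with '?') of a triple pattern."""
    return frozenset(term for term in pattern if term.startswith("?"))


def plan_jobs(patterns):
    """Greedily assign join variables to rounds; each round stands for one MapReduce job.

    Returns a list of rounds, each a list of the join variables handled in that round.
    """
    relations = [variables(p) for p in patterns]
    jobs = []
    while len(relations) > 1:
        # Map each variable to the indices of the relations that mention it.
        holders = defaultdict(list)
        for index, rel in enumerate(relations):
            for var in rel:
                holders[var].append(index)
        # Stand-in ranking (an assumption, not the paper's correlation measure):
        # variables shared by fewer relations come first, so that several
        # disjoint joins can fit into the same job.
        candidates = sorted(
            (v for v, idx in holders.items() if len(idx) >= 2),
            key=lambda v: (len(holders[v]), v),
        )
        used, merged, joined_on = set(), [], []
        for var in candidates:
            group = [i for i in holders[var] if i not in used]
            if len(group) < 2:
                continue  # the other holders are already joined in this job
            used.update(group)
            merged.append(frozenset().union(*(relations[i] for i in group)))
            joined_on.append(var)
        if not joined_on:
            break  # no shared variable left: remaining relations are disconnected
        jobs.append(joined_on)
        relations = merged + [r for i, r in enumerate(relations) if i not in used]
    return jobs


# Illustrative query in the style of a university benchmark (hypothetical terms).
query = [
    ("?x", "rdf:type", "ub:GraduateStudent"),
    ("?x", "ub:takesCourse", "?y"),
    ("?y", "rdf:type", "ub:Course"),
    ("?z", "ub:teacherOf", "?y"),
    ("?z", "rdf:type", "ub:Professor"),
]
print(plan_jobs(query))  # [['?x', '?z'], ['?y']]
```

In this example the joins on `?x` and `?z` touch disjoint sets of patterns, so they share the first job, and the join on `?y` combines their results in a second job.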

Acknowledgement

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61301136, 61272148, and 61602525.

Author information

Corresponding author

Correspondence to Meiguang Zheng.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Yang, L., Yang, L., Niu, J., Hu, Z., Long, J., Zheng, M. (2016). A Semantic Data Parallel Query Method Based on Hadoop. In: Cellary, W., Mokbel, M., Wang, J., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2016. WISE 2016. Lecture Notes in Computer Science, vol 10041. Springer, Cham. https://doi.org/10.1007/978-3-319-48740-3_29

  • DOI: https://doi.org/10.1007/978-3-319-48740-3_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48739-7

  • Online ISBN: 978-3-319-48740-3

  • eBook Packages: Computer Science, Computer Science (R0)
