
Practical parallel string matching framework for RDF entailments with GPUs

Information Systems Frontiers

Abstract

Resource Description Framework (RDF) is a commonly used format for semantic web processing. RDF data consists essentially of strings that represent items and their relationships, which can be queried or used for inference. In this paper, we propose a framework for processing large RDF data sets, based on Brute-force string matching on GPUs (BFG). Graphics Processing Units (GPUs) serve as a parallel platform on which thousands of threads search the RDF data concurrently. Our search algorithm is customized to suit the nature of RDF processing and the GPU memory architecture. The algorithm is then integrated into the proposed framework for computing queries and chaining rules over RDF data. Experiments show that these algorithms can achieve a speedup of 7 times for querying and for forward chaining compared with the sequential version. The proposed framework can achieve a string comparison rate of 67,000 comparisons per second using 2 GPUs.
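To make the two operations named in the abstract concrete, the following is a minimal sequential sketch in Python, not the authors' GPU implementation. It assumes triples are stored as tuples of strings, that a query is a triple pattern in which `None` acts as a wildcard, and that forward chaining applies only the RDFS subclass rule rdfs9. The function names (`brute_force_equal`, `match`, `forward_chain_subclass`) and the `ex:` prefix are hypothetical and chosen for illustration only.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def brute_force_equal(a: str, b: str) -> bool:
    """Compare two strings character by character, with no index or hashing."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def match(triples: List[Triple], pattern: Pattern) -> List[Triple]:
    """Return every triple whose terms equal the non-None terms of the pattern.

    Each loop iteration is independent of the others; that independence is
    what would allow a GPU to give each triple to its own thread (assumption:
    this sketch runs the iterations one after another on the CPU).
    """
    hits = []
    for triple in triples:
        if all(p is None or brute_force_equal(p, t)
               for p, t in zip(pattern, triple)):
            hits.append(triple)
    return hits


def forward_chain_subclass(triples: List[Triple]) -> Set[Triple]:
    """Apply RDFS rule rdfs9 repeatedly until no new triple is derived.

    rdfs9: (x rdf:type C1) and (C1 rdfs:subClassOf C2) => (x rdf:type C2)
    """
    known = set(triples)
    changed = True
    while changed:
        changed = False
        snapshot = list(known)  # match against a fixed view of the data
        for c1, _, c2 in match(snapshot, (None, "rdfs:subClassOf", None)):
            for x, _, _ in match(snapshot, (None, "rdf:type", c1)):
                derived = (x, "rdf:type", c2)
                if derived not in known:
                    known.add(derived)
                    changed = True
    return known


# Illustrative data only; these triples do not come from the paper.
data = [
    ("ex:Cat", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:tom", "rdf:type", "ex:Cat"),
]
closure = forward_chain_subclass(data)
```

In this sketch every rule application is just two pattern queries, so the cost of chaining is dominated by string comparisons, which is the part the abstract describes as being moved onto GPU threads.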



Notes

  1. With our hardware, transferring each half of the RDF data and keywordArray to 2 GPUs costs 10-20 times the time required to transfer one dataArray and one keywordArray to one GPU, due to the bottleneck on the PCI bus of our mainboard.


Acknowledgments

This work was supported in part by the following institutes and research programs: the Thailand Research Fund (TRF) (Royal Golden Jubilee Ph.D. Program) under Grant no. PHD/0005/2554, the Thailand Research Fund (Tourism and Hospitality Industry Program) under contract number RDG5850042, Faculty of Engineering Funding (Kasetsart University), and an NVIDIA hardware grant.

Author information

Corresponding author

Correspondence to Chantana Chantrapornchai.

About this article

Cite this article

Choksuchat, C., Chantrapornchai, C. Practical parallel string matching framework for RDF entailments with GPUs. Inf Syst Front 20, 863–882 (2018). https://doi.org/10.1007/s10796-016-9692-4

  • DOI: https://doi.org/10.1007/s10796-016-9692-4
