Abstract
Resource Description Framework (RDF) is a commonly used format for semantic web processing. An RDF data set consists of strings that represent terms and the relationships between them, which can be queried or used for inference. RDF data is usually stored as a large text file containing many millions of relationships. In this work, we propose a framework, TripleID, for processing queries over large RDF data. The framework utilises Graphics Processing Units (GPUs) to search RDF relations. The RDF data is first transformed into an encoded form suitable for storing in GPU memory. Parallel threads on the GPU then search for the required data. Our experiments show that one GPU on a personal desktop can handle 100 million triple relations, while a traditional RDF processing tool can process only up to 10 million triples. Furthermore, a sample query over 7 million triples completes within 0.18 s on the GPU, while the traditional tool takes at least 6 s for 1.8 million triples.
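The encode-then-search idea in the abstract can be illustrated with a minimal CPU-side sketch. It is not the TripleID format or its GPU kernel, neither of which is described here. It assumes a simple dictionary encoding in which every distinct term receives a positive integer ID and the ID 0 is reserved as a wildcard. The names `encode` and `search` are hypothetical. On a GPU, the per-triple comparison inside `search` would be the work assigned to one thread.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]

WILDCARD = 0  # assumption: ID 0 never names a term, so it can mean "match anything"


def encode(triples: List[Triple]) -> Tuple[Dict[str, int], List[EncodedTriple]]:
    """Assign each distinct term an integer ID (starting at 1) and rewrite triples as ID tuples."""
    ids: Dict[str, int] = {}
    encoded: List[EncodedTriple] = []
    for s, p, o in triples:
        # setdefault keeps the existing ID for a known term and assigns the next free ID otherwise.
        sid = ids.setdefault(s, len(ids) + 1)
        pid = ids.setdefault(p, len(ids) + 1)
        oid = ids.setdefault(o, len(ids) + 1)
        encoded.append((sid, pid, oid))
    return ids, encoded


def search(encoded: List[EncodedTriple], pattern: EncodedTriple) -> List[int]:
    """Return the indices of all triples that match the pattern; WILDCARD matches any ID."""
    matches: List[int] = []
    for i, row in enumerate(encoded):
        # This comparison is independent for every triple, which is what makes
        # a one-thread-per-triple scan on a GPU possible.
        if all(q == WILDCARD or q == v for q, v in zip(pattern, row)):
            matches.append(i)
    return matches


triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "likes", "carol"),
]
ids, encoded = encode(triples)

# Which triples use the predicate "knows", with any subject and any object?
knows_hits = search(encoded, (WILDCARD, ids["knows"], WILDCARD))
```

Comparing fixed-width integers instead of variable-length strings keeps each check cheap and the stored data compact, which is the general motivation for encoding terms before a parallel scan.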
Acknowledgement
This work was supported in part by the following institutes and research programs: The Thailand Research Fund (TRF) through the Royal Golden Jubilee Ph.D. Program under Grant PHD/0005/2554, DAAD (German Academic Exchange Service) Scholarship project id: 57084841, NVIDIA Hardware grant, and the Faculty of Engineering at Kasetsart University Research funding contract no. 57/12/MATE.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Chantrapornchai, C., Choksuchat, C., Haidl, M., Gorlatch, S. (2016). TripleID: A Low-Overhead Representation and Querying Using GPU for Large RDFs. In: Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D. (eds) Beyond Databases, Architectures and Structures. Advanced Technologies for Data Mining and Knowledge Discovery. BDAS 2015, BDAS 2016. Communications in Computer and Information Science, vol 613. Springer, Cham. https://doi.org/10.1007/978-3-319-34099-9_31
DOI: https://doi.org/10.1007/978-3-319-34099-9_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-34098-2
Online ISBN: 978-3-319-34099-9
eBook Packages: Computer Science, Computer Science (R0)