Emerging Pragmatic Patterns in Large-Scale RDF Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 9106)

Abstract

With the development of Linked Data, an increasing number of RDF datasets are being published across many application domains. Capturing emerging pragmatic patterns is critical both for understanding the underlying meaning and characteristics of large RDF data and for reusing popular domain terms when publishing data. In this paper, we propose the notion of a term co-instantiation graph (TIG) and a method for building a TIG from a given RDF dataset. We also describe a clustering-based approach for distilling a set of pragmatic patterns from a TIG; these patterns reveal the customary usage of highly correlated terms. Through extensive experiments on a large real-world dataset containing 21 million RDF documents, we analyze the macroscopic structure of the term co-instantiation graph and of the pragmatic patterns from a complex-network point of view. We demonstrate that our approach not only yields a fine-grained ontology partitioning from the pragmatic perspective, which eases ontology reuse, but also provides a new way to explore Linked Data.

Notes

  1. http://www.w3.org/TR/2004/REC-rdf-mt-20040210/.

  2. http://www.w3.org/TR/owl-absyn/.

  3. http://www.w3.org/TR/swbp-vocab-pub/.

  4. http://ws.nju.edu.cn/olg/.

  5. It is refined from the raw TIM matrix using heuristic rules; cf. Sect. 3.1.

  6. http://lod-cloud.net/.

  7. http://xmlns.com/foaf/spec/20091215.html. Our experimental dataset was crawled in 2009, when the FOAF version was 0.96.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (NSFC) under Grant 61402426 and partially supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization.

Author information

Corresponding author

Correspondence to Weiyi Ge.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Ge, W., Hu, W., He, C., Zong, S. (2015). Emerging Pragmatic Patterns in Large-Scale RDF Data. In: Qiang, W., Zheng, X., Hsu, CH. (eds) Cloud Computing and Big Data. CloudCom-Asia 2015. Lecture Notes in Computer Science, vol 9106. Springer, Cham. https://doi.org/10.1007/978-3-319-28430-9_19

  • DOI: https://doi.org/10.1007/978-3-319-28430-9_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28429-3

  • Online ISBN: 978-3-319-28430-9

  • eBook Packages: Computer Science, Computer Science (R0)
