Unsupervised Semantic and Syntactic Based Classification of Scientific Citations

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9263)

Abstract

In recent years, the number of scientific publications has increased substantially. One way to measure the impact of a publication is to count the number of citations it receives. Citations are thus used as a proxy for a researcher’s contribution and influence in a field. Citation classification can provide context for these citations. Supervised techniques are normally used to perform citation classification; to the best of our knowledge, no existing research performs this task in an unsupervised manner. In this paper we present two novel techniques that cluster citations automatically, without human intervention, according to their contents (semantic) and their citation sentence styles (syntactic). The techniques are validated using external test sets from existing supervised citation classification studies.

Author information

Corresponding author

Correspondence to Mohammad Abdullatif.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Abdullatif, M., Koh, Y.S., Dobbie, G. (2015). Unsupervised Semantic and Syntactic Based Classification of Scientific Citations. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2015. Lecture Notes in Computer Science, vol 9263. Springer, Cham. https://doi.org/10.1007/978-3-319-22729-0_3

  • DOI: https://doi.org/10.1007/978-3-319-22729-0_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22728-3

  • Online ISBN: 978-3-319-22729-0

  • eBook Packages: Computer Science, Computer Science (R0)
