Citation Content Analysis and a Digital Library

  • Conference paper
Data Analytics and Management in Data Intensive Domains (DAMDID/RCDL 2018)

Abstract

This paper presents an approach to two-way data exchange between citation content analysis, provided by the Cirtec project, and the large research digital library Socionet. Many papers in Socionet have citation relationships with other papers, as well as links to their authors’ personal profiles and, through them, to other information objects. This makes it possible to enrich the data for citation content analysis with additional information and to link the results of the analysis to objects in the digital library, such as papers, their authors, and the authors’ affiliated organizations. We discuss which numeric and qualitative indicators can be built by citation content analysis based on the Cirtec open citation data. Since these indicators carry IDs that refer to digital library objects, they can be integrated and visualized as computer-generated annotations to the papers’ full texts in PDF.
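
As an illustration only, the idea of indicators keyed by digital library IDs could be sketched as follows. This is a minimal sketch under stated assumptions, not the Cirtec or Socionet implementation: the record layout, the function names, and the `example:cited:*` identifiers are hypothetical, and the single numeric indicator shown (number of in-text mentions per cited paper) is just one possible choice.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class InTextCitation:
    """One in-text citation taken from a citing paper (hypothetical record layout)."""
    citing_id: str  # library ID of the citing paper
    cited_id: str   # library ID of the cited paper
    context: str    # text surrounding the citation mark


def mentions_per_cited_paper(citations):
    """Numeric indicator: number of in-text mentions of each cited paper."""
    return Counter(c.cited_id for c in citations)


def build_annotations(citations):
    """Group citation contexts by cited paper ID so they can be attached to its record."""
    counts = mentions_per_cited_paper(citations)
    annotations = {}
    for c in citations:
        entry = annotations.setdefault(
            c.cited_id, {"mentions": counts[c.cited_id], "contexts": []}
        )
        entry["contexts"].append({"citing_id": c.citing_id, "text": c.context})
    return annotations


# Hypothetical sample data; only the citing ID follows a handle quoted in the notes.
citing = "repec:hig:fsight:v:11:y:2017:i:4:p:84-95"
citations = [
    InTextCitation(citing, "example:cited:a", "... as proposed in [1] ..."),
    InTextCitation(citing, "example:cited:a", "... following [1], we ..."),
    InTextCitation(citing, "example:cited:b", "... in contrast to [2] ..."),
]

annotations = build_annotations(citations)
for cited_id, entry in annotations.items():
    print(cited_id, entry["mentions"], len(entry["contexts"]))
```

Because every annotation is keyed by the ID of the cited paper, a library could in principle look up the matching record and display the indicator next to it. How Socionet actually stores and renders such annotations is not specified here.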

Notes

  1. https://socionet.ru/collection.xml?h=spz:neicon&l=en.

  2. https://socionet.ru/collection.xml?h=repec:rnp&l=en.

  3. https://socionet.ru/collection.xml?h=repec:hig&l=en.

  4. http://cirtec.ranepa.ru/data/RePEc/hig/fsight/v%253A11%253Ay%253A2017%253Ai%253A4%253Ap%253A84-95/.

  5. In [14] we listed the types of in-text citations that were processed.

  6. https://authors.repec.org/.

  7. https://edirc.repec.org/.

  8. An example: https://socionet.ru/fs/ap.cgi?h=repec:hig:fsight:v:11:y:2017:i:4:p:84-95.

  9. An example: https://socionet.ru/fs/ap.cgi?h=repec:per:pers:pku327.

  10. Victor Lyapunov developed the necessary software and performed the calculations for these experiments.

  11. https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance.

  12. Thomas Krichel and Roman Puzyrev developed the necessary software and performed the calculations.

  13. Aleksandr Tuzovsky and Amir Bakarov developed the necessary software and performed the calculations.

  14. https://en.wikipedia.org/wiki/N-gram.

  15. https://en.wikipedia.org/wiki/Topic_model.

  16. https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation.

  17. https://en.wikipedia.org/wiki/Word_embedding.

  18. https://en.wikipedia.org/wiki/Word2vec.

  19. https://en.wikipedia.org/wiki/Cluster_analysis#Centroid-based_clustering.

References

  1. Berger, M., McDonough, K., Seversky, L.M.: cite2vec: citation-driven document exploration via word embeddings. IEEE Trans. Vis. Comput. Graph. 23(1), 691–700 (2017). https://doi.org/10.1109/TVCG.2016.2598667

  2. Bertin, M., Atanassova, I.: A study of lexical distribution in citation contexts through the IMRaD standard. In: Proceedings of the First Workshop on Bibliometric-Enhanced Information Retrieval Co-located with 36th European Conference on Information Retrieval (ECIR 2014), 13 April 2014, vol. 1143, pp. 5–12 (2014)

  3. Bertin, M., Atanassova, I.: Factorial correspondence analysis applied to citation contexts. In: BIR@ECIR, pp. 22–29 (2015)

  4. Bertin, M., Atanassova, I., Gingras, Y., Larivière, V.: The invariant distribution of references in scientific articles. J. Assoc. Inf. Sci. Technol. 67(1), 164–177 (2016). https://doi.org/10.1002/asi.23367

  5. Bertin, M., Atanassova, I.: InTeReC: in-text reference corpus for applying natural language processing to bibliometrics. In: Proceedings of the Seventh Workshop on Bibliometric-enhanced Information Retrieval (BIR), Grenoble, France, pp. 54–62. CEUR-WS.org (2018)

  6. Bilder, G., Lin, J., Neylon, C.: Principles for Open Scholarly Infrastructures. Science in the Open (2015). https://doi.org/10.6084/m9.figshare.1314859

  7. Boyack, K.W., van Eck, N.J., Colavizza, G., Waltman, L.: Characterizing in-text citations in scientific articles: a large-scale analysis. J. Inform. 12(1), 59–73 (2018). https://doi.org/10.1016/j.joi.2017.11.005

  8. Ding, Y., Zhang, G., Chambers, T., Song, M., Wang, X., Zhai, C.: Content-based citation analysis: the next generation of citation analysis. J. Assoc. Inf. Sci. Technol. 65(9), 1820–1833 (2014). https://doi.org/10.1002/asi.23256

  9. He, J., Chen, C.: Understanding the changing roles of scientific publications via citation embeddings. arXiv preprint arXiv:1711.05822 (2017)

  10. Hernández-Alvarez, M., Gómez, J.M.: Survey about citation context analysis: tasks, techniques, and resources. Nat. Lang. Eng. 22(3), 327–349 (2016). https://doi.org/10.1017/S1351324915000388

  11. Jebari, C., Cobo, M.J., Herrera-Viedma, E.: A new approach for implicit citation extraction. In: Yin, H., Camacho, D., Novais, P., Tallón-Ballesteros, A.J. (eds.) IDEAL 2018. LNCS, vol. 11315, pp. 121–129. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03496-2_14

  12. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)

  13. Parinov, S.: Towards a semantic segment of a research e-infrastructure: necessary information objects, tools and services. Int. J. Metadata Semant. Ontol. 8(4), 322–331 (2013). https://doi.org/10.1504/ijmso.2013.058415

  14. Parinov, S.: Semantic attributes for citation relationships: creation and visualization. In: Garoufallou, E., Virkus, S., Siatri, R., Koutsomiha, D. (eds.) MTSR 2017. CCIS, vol. 755, pp. 286–299. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70863-8_28

  15. Parinov, S.: Open citation data and a digital library. In: The Selected Papers of the XX International Conference on Data Analytics and Management in Data Intensive Domains (DAMDID/RCDL 2018), Moscow, Russia, 9–12 October 2018, vol. 2277, pp. 216–221. CEUR (2018)

  16. Parinov, S., Lyapunov, V., Puzyrev, R., Kogalovsky, M.: Semantically enrichable research information system SocioNet. In: Klinov, P., Mouromtsev, D. (eds.) KESW 2015. CCIS, vol. 518, pp. 147–157. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24543-0_11

  17. Pride, D., Knoth, P.: Incidental or influential?–a decade of using text-mining for citation function classification. In: 16th International Society of Scientometrics and Informetrics Conference, Wuhan, 16–20 October 2017 (2017)

  18. Qayyum, F., Afzal, M.T.: Identification of important citations by exploiting research articles’ metadata and cue-terms from content. Scientometrics 118, 21–43 (2019). https://doi.org/10.1007/s11192-018-2961-x

  19. Waltman, L.: A review of the literature on citation impact indicators. J. Inform. 10(2), 365–391 (2016). https://doi.org/10.1016/j.joi.2016.02.007

Acknowledgements

Part of this research, namely the approach of using citation contexts to build statistics with a focus on the supercomputer simulation of interactions among the agents and the research community environment, is funded by an RSF grant (project No. 19-18-00240).

Author information

Corresponding author

Correspondence to Sergey Parinov.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Parinov, S. (2019). Citation Content Analysis and a Digital Library. In: Manolopoulos, Y., Stupnikov, S. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2018. Communications in Computer and Information Science, vol 1003. Springer, Cham. https://doi.org/10.1007/978-3-030-23584-0_12

  • DOI: https://doi.org/10.1007/978-3-030-23584-0_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-23583-3

  • Online ISBN: 978-3-030-23584-0

  • eBook Packages: Computer Science, Computer Science (R0)
