Institution information specification and correlation based on institutional PIDs and IND tool

Scientometrics

Abstract

Specifying and correlating institution information is necessary for research evaluation and resource sharing. Current attempts focus mainly on institution name disambiguation (IND) based on institution name, address, author, and similar attributes, and they lack a unified, universal indicator. To strengthen the correlation of institution information, this study introduces the institutional persistent identifier (PID), together with a tool redesigned from existing IND techniques. An institution metadata specification model, which inherits from several authoritative metadata standards, is built for data preprocessing. Further, a visual platform is implemented that displays the correlated institution information and supports institution queries. The proposed approach is evaluated on large datasets from three countries, and the test results show that both precision and recall are high.
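As a rough illustration of the idea summarized above, the following minimal sketch maps raw institution name variants to a PID and scores the result with precision and recall. It is not the tool described in the article: the registry contents, the PID strings, the `normalize`, `resolve_pid`, and `precision_recall` helpers, the similarity threshold, and the use of Python's `difflib.SequenceMatcher` as the string similarity measure are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical registry: PID -> canonical institution name (illustrative values only).
REGISTRY = {
    "pid:0001": "Example Agricultural University",
    "pid:0002": "Sample Institute of Information Science",
}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    name = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(name.split())


def resolve_pid(raw_name: str, threshold: float = 0.85):
    """Return the PID of the most similar canonical name, or None below the threshold.

    The threshold is an assumed value, not one taken from the article.
    """
    query = normalize(raw_name)
    best_pid, best_score = None, 0.0
    for pid, canonical in REGISTRY.items():
        score = SequenceMatcher(None, query, normalize(canonical)).ratio()
        if score > best_score:
            best_pid, best_score = pid, score
    return best_pid if best_score >= threshold else None


def precision_recall(predicted: dict, gold: dict):
    """Compare predicted and gold mappings of raw name -> PID (None = unassigned)."""
    assigned = {name: pid for name, pid in predicted.items() if pid is not None}
    correct = sum(1 for name, pid in assigned.items() if gold.get(name) == pid)
    relevant = sum(1 for pid in gold.values() if pid is not None)
    precision = correct / len(assigned) if assigned else 0.0
    recall = correct / relevant if relevant else 0.0
    return precision, recall
```

In this sketch, name variants that differ only in case or punctuation resolve to the same PID, while names that resemble no registry entry are left unassigned, which lowers recall rather than precision.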

Notes

  1. https://doi.org/10.6084/m9.figshare.8341370.

Acknowledgements

This work was partially supported by the project “Design and Research on A Next Generation of Open Knowledge Services System and Key Technologies” (No. 2019XM55) and the “Basic Research Business Fee Project of Chinese Academy of Agricultural Science” (No. Y2019PT15).

Author information

Corresponding author

Correspondence to Jiao Li.

About this article

Cite this article

Huang, Y., Li, J., Sun, T. et al. Institution information specification and correlation based on institutional PIDs and IND tool. Scientometrics 122, 381–396 (2020). https://doi.org/10.1007/s11192-019-03268-9

  • DOI: https://doi.org/10.1007/s11192-019-03268-9