Institution information specification and correlation based on institutional PIDs and IND tool

Huang, Yongwen; Li, Jiao; Sun, Tan; Xian, Guojian

doi:10.1007/s11192-019-03268-9

Published: 18 November 2019

Volume 122, pages 381–396, (2020)

Scientometrics

Yongwen Huang¹, Jiao Li¹ (ORCID: orcid.org/0000-0002-8876-3728), Tan Sun¹,² & Guojian Xian¹,²

Abstract

Specifying and correlating institution information is necessary for research evaluation and resource sharing. Current attempts focus mainly on institution name disambiguation (IND) based on institution name, address, author, and similar features, and they lack a unified, universal indicator. To strengthen the correlation of institution information, this study introduces institutional persistent identifiers (PIDs), together with a tool redesigned from existing IND techniques. An institution metadata specification model, which inherits from several authoritative metadata standards, is built for data preprocessing. Further, a visual platform is implemented to demonstrate the correlated institution information and to support institution queries. The proposed approach is evaluated on large datasets from three countries, and the test results show high precision and recall.
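
To make the idea of correlating name variants through a PID concrete, the following is a minimal sketch and not the tool described in the abstract. It assumes a tiny in-memory registry with made-up identifiers (`pid:0001`, `pid:0002`), uses plain edit-distance matching as a stand-in IND technique, and picks an arbitrary similarity threshold of 0.85. All function names are hypothetical. The last function shows how precision and recall could be computed against a hand-labelled gold standard.

```python
import re

# Hypothetical registry mapping a PID to a canonical institution name.
# The identifiers are invented for illustration only.
REGISTRY = {
    "pid:0001": "Chinese Academy of Agricultural Sciences",
    "pid:0002": "East China Normal University",
}


def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Edit distance rescaled to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def resolve_pid(raw_name, threshold=0.85):
    """Return the PID of the closest registry entry, or None if too dissimilar."""
    query = normalize(raw_name)
    best_pid, best_score = None, 0.0
    for pid, canonical in REGISTRY.items():
        score = similarity(query, normalize(canonical))
        if score > best_score:
            best_pid, best_score = pid, score
    return best_pid if best_score >= threshold else None


def precision_recall(predicted, gold):
    """Both arguments map a raw name to a PID (or None when unassigned)."""
    assigned = [name for name, pid in predicted.items() if pid is not None]
    correct = sum(1 for name in assigned if predicted[name] == gold.get(name))
    relevant = sum(1 for pid in gold.values() if pid is not None)
    precision = correct / len(assigned) if assigned else 0.0
    recall = correct / relevant if relevant else 0.0
    return precision, recall
```

In this sketch, `resolve_pid("Chinese Academy of Agricultural Science")` differs from the first registry entry by a single character and therefore resolves to that entry's PID, while a name with no close match returns `None`. A production system would need richer features (address, author, abbreviations) than string similarity alone.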

Notes

https://doi.org/10.6084/m9.figshare.8341370.

Acknowledgements

This work was partially supported by the project “Design and Research on A Next Generation of Open Knowledge Services System and Key Technologies” (No. 2019XM55) and the “Basic Research Business Fee Project of Chinese Academy of Agricultural Science” (No. Y2019PT15).

Author information

Yongwen Huang, Jiao Li and Tan Sun have contributed equally to this work.

Authors and Affiliations

Agricultural Information Institution of CAAS, Beijing, 100081, China
Yongwen Huang, Jiao Li, Tan Sun & Guojian Xian
Key Laboratory of Agricultural Big Data, Ministry of Agriculture, Beijing, 100081, China
Tan Sun & Guojian Xian

Corresponding author

Correspondence to Jiao Li.

About this article

Cite this article

Huang, Y., Li, J., Sun, T. et al. Institution information specification and correlation based on institutional PIDs and IND tool. Scientometrics 122, 381–396 (2020). https://doi.org/10.1007/s11192-019-03268-9

Received: 24 March 2019
Published: 18 November 2019
Issue Date: January 2020
DOI: https://doi.org/10.1007/s11192-019-03268-9
