Feasibility of activity-based expert profiling using text mining of scientific publications and patents

Scientometrics

Abstract

Research and development (R&D) in many technological areas is characterized by growing complexity. In biomedical engineering, too, interdisciplinary collaboration is regarded as a promising way to master this challenge. Identifying suitable experts therefore becomes crucial, and this problem is currently being researched, among other approaches, by analyzing semantic data. However, previous approaches lack clarity and traceability in how they compile top-n lists of recommended experts, because their profiling is insufficiently domain-specific. Moreover, these recommenders are based mainly on scientific publications, whereas patents, although an important outcome of R&D, are rarely considered. We therefore study the feasibility of profiling 16 biomedical engineering experts using both publications and patents. These documents are automatically labeled according to a three-dimensional domain model by machine learning-based classifiers. On this basis, we created various activity-based representations, including author-contribution weighting. We evaluated the profiling through self- and external assessments and compared the recommendation with scientometric measures in three case studies. All interviewed experts identified themselves among 10 pseudonymous profiles, and 96% of all 51 external assignments were correct. The recommendation reached a high mean average precision of 89% across the three case studies, compared with 41% when using scientometric measures. Moreover, the activity derived from patents largely corresponds to that derived from publications, but patents also introduce new activities. Author-contribution weighting improves the performance. In conclusion, our findings show that exploiting publications and patents enables comprehensible profiling of biomedical engineering experts that allows visual comparisons and clear selection and ranking of potential R&D collaboration partners along the translational value chain.
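For readers unfamiliar with the headline metric, the following is a minimal sketch of how mean average precision (MAP) over several case studies can be computed. It is not taken from the article: the function names, the expert labels, and the toy rankings are hypothetical, and the exact MAP variant used in the study (for example, any cut-off n) is not stated in the abstract.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list.

    Averages precision@k over every rank k that holds a relevant item,
    normalized by the total number of relevant items.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def mean_average_precision(cases):
    """Mean of the per-case average precisions."""
    return sum(average_precision(r, rel) for r, rel in cases) / len(cases)


# Hypothetical toy data: two case studies, each with a recommended ranking
# and the set of experts a rater considered suitable.
cases = [
    (["E1", "E2", "E3", "E4"], {"E1", "E2"}),  # both relevant experts on top
    (["E3", "E1", "E4", "E2"], {"E1", "E2"}),  # relevant experts at ranks 2 and 4
]
map_score = mean_average_precision(cases)
print(f"MAP = {map_score:.2f}")
```

In this toy example the first case contributes an average precision of 1.0 and the second 0.5, so rankings that place suitable experts further down the list lower the score.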


Notes

  1. AMiner: https://aminer.org.

  2. BMExpert: https://datamining-iip.fudan.edu.cn/service/BMExpert/index.html.

  3. Jane: https://jane.biosemantics.org.

  4. PubMed: https://www.ncbi.nlm.nih.gov/pubmed.

  5. Web of Science: https://wokinfo.com.

  6. EPO PATSTAT: https://www.epo.org/searching-for-patents/business/patstat.html.

  7. RapidMiner: https://rapidminer.com.

  8. scikit-learn: https://scikit-learn.org.

Acknowledgements

This research was supported by the Klaus Tschira Stiftung gGmbH. The authors also thank all participants of the evaluations, who were primarily associated with the Helmholtz Institute for Biomedical Engineering, Aachen (Germany).

Funding

This study was funded by the Klaus Tschira Stiftung gGmbH (Grant No. 00.263.2015).

Author information

Corresponding author

Correspondence to Mark Bukowski.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary file (ZIP 21142 kb)

Appendix

See Tables 14, 15, 16, 17, 18 and 19.

Table 14 Evaluation of the SVM text classifiers with tenfold cross-validation
Table 15 Evaluation of the ANN text classifiers with tenfold cross-validation
Table 16 Overview of the 16 experts including the main domain classes with largest relative activity extracted from the unweighted activity profiles based on scientific publications
Table 17 Overview of the 16 experts including the main domain classes with largest relative activity extracted from the unweighted activity profiles based on patents
Table 18 Protocolled main comments, questions and suggestions of the participating experts
Table 19 Ranks of 16 experts based on the ground truth expert assignment of rater R1 (see Table 9) compared to the activity-based ranking including the Euclidean distances of the experts’ activities to the ideal vector (1,1,1) according to the case studies (see "Evaluation: recommendation" section and Table 8)
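The activity-based ranking named in the Table 19 caption orders experts by the Euclidean distance of their activities to the ideal vector (1,1,1). A minimal sketch of that idea follows. It assumes each expert is represented by three activity values scaled to the range 0 to 1; the expert names and values are hypothetical and do not come from the article.

```python
import math

# Ideal vector from the Table 19 caption: full activity in all three components.
IDEAL = (1.0, 1.0, 1.0)


def rank_by_distance(activities, ideal=IDEAL):
    """Return (expert, distance) pairs sorted by ascending Euclidean distance."""
    scored = [(name, math.dist(vec, ideal)) for name, vec in activities.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical activity vectors, assumed to be normalized to [0, 1].
activities = {
    "Expert A": (0.9, 0.8, 1.0),
    "Expert B": (0.2, 0.5, 0.1),
    "Expert C": (1.0, 1.0, 1.0),
}

ranking = rank_by_distance(activities)
for rank, (name, distance) in enumerate(ranking, start=1):
    print(f"{rank}. {name}: distance {distance:.3f}")
```

A smaller distance means the expert's activity lies closer to the ideal for the case at hand, so the expert is ranked higher.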

About this article

Cite this article

Bukowski, M., Geisler, S., Schmitz-Rode, T. et al. Feasibility of activity-based expert profiling using text mining of scientific publications and patents. Scientometrics 123, 579–620 (2020). https://doi.org/10.1007/s11192-020-03414-8

  • DOI: https://doi.org/10.1007/s11192-020-03414-8
