Knowledge fusion through academic articles: a survey of definitions, techniques, applications and challenges

Scientometrics

Abstract

The ever-growing volume of academic articles calls for a new generation of knowledge management methods that can intelligently reuse academic knowledge and facilitate scientific research. Knowledge fusion (KF) is a key element of such methods, and the field has seen breakthrough progress. This progress gives the academic community an opportunity to expedite literature review and to automatically retrieve the required knowledge from academic publications. However, the state-of-the-art literature lacks a survey that reviews KF studies in terms of their technologies and applications and draws out insights for reusing academic knowledge. To bridge this gap, this paper conducts a systematic survey of existing studies on KF and discusses the opportunities and challenges of applying KF to academic articles. To this end, we revisit the definitions of knowledge and KF in the context of academic articles, and summarise the fusion patterns and their usage in existing applications. Furthermore, we review the techniques and applications of KF, especially those that use academic articles as sources of knowledge. Finally, we discuss the challenges and future directions, with the aim of helping researchers and practitioners deepen their understanding of knowledge fusion and develop versatile functions.



References

  • Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., et al. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. https://www.tensorflow.org/ (software available from tensorflow.org).

  • Amjad, T., Daud, A., Akram, A., & Muhammed, F. (2016). Impact of mutual influence while ranking authors in a co-authorship network. Kuwait Journal of Science, 43(3), 101–109.

    Google Scholar 

  • Amjad, T., Ding, Y., Daud, A., Xu, J., & Malic, V. (2015). Topic-based heterogeneous rank. Scientometrics, 104(1), 313–334.

    Google Scholar 

  • Andrews, K. (1995). Case study. visualising cyberspace: Information visualisation in the harmony internet browser. In Proceedings of visualization 1995 conference (pp. 97–104). https://doi.org/10.1109/INFVIS.1995.528692.

  • Baldwin, C., Hughes, J., Hope, T., Jacoby, R., & Ziebland, S. (2003). Ethics and dementia: mapping the literature by bibliometric analysis. International Journal of Geriatric Psychiatry, 18(1), 41–54.

    Google Scholar 

  • Bellinger, G., Castro, D. & Mills, A. (2004). Data, information, knowledge, and wisdom. Mental model musings. http://www.systems-thinking.org/dikw/dikw.htm.

  • Bergstrom, C. T., & West, J. D. (2008). Assessing citations with the Eigenfactor Metrics. Neurology, 71(23), 1850–1851.

    Google Scholar 

  • Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with python (1st ed.). Sebastopol: O’Reilly Media Inc.

    MATH  Google Scholar 

  • Bleiholder, J., & Naumann, F. (2009). Data fusion. ACM Computing Surveys (CSUR), 41(1), 1.

    Google Scholar 

  • Bollen, J., Rodriquez, M. A., & Van de Sompel, H. (2006). Journal status. Scientometrics, 69(3), 669–687.

    Google Scholar 

  • Börner, K., Chen, C., & Boyack, K. W. (2003). Visualizing knowledge domains. Annual Review of Information Science and Technology, 37(1), 179–255.

    Google Scholar 

  • Bui, Q.-C., Nualáin, B. Ó., Boucher, C. A., & Sloot, P. M. (2010). Extracting causal relations on HIV drug resistance from literature. BMC Bioinformatics, 11(1), 1–11. https://doi.org/10.1186/1471-2105-11-101.

    Article  Google Scholar 

  • Callon, M., Courtial, J. P., & Laville, F. (1991). Co-word analysis as a tool for describing the network of interactions between basic and technological research: The case of polymer chemsitry. Scientometrics, 22(1), 155–205.

    Google Scholar 

  • Cambria, E., & White, B. (2014). Jumping NLP curves: A review of natural language processing research. IEEE Computational Intelligence Magazine, 9(2), 48–57.

    Google Scholar 

  • Carlson, A., Betteridge, J., Kisiel, B., Settles, B., Hruschka, E. R., Jr. & Mitchell, T. M. (2010). Toward an architecture for never-ending language learning. In Proceedings of the twenty-fourth AAAI conference on artificial intelligence (pp. 1306–1313). AAAI Press. http://dl.acm.org/citation.cfm?id=2898607.2898816.

  • Carvalho, R. N., Matsumoto, S., Laskey, K. B., Costa, P. C. G., Ladeira, M., & Santos, L. L. (2013). Probabilistic ontology and knowledge fusion for procurement fraud detection in Brazil. Uncertainty reasoning for the semantic web II (pp. 19–40). Berlin: Springer.

    Google Scholar 

  • Castano, S., & Ferrara, A. (2002). Knowledge representation and transformation in ontology-based data integration. Transformation for the Semantic Web, 21, 51.

    Google Scholar 

  • Chen, J., & Diekema, A. R. (2005). Experimenting with the automatic assignment of educational standards to digital library content. In Proceedings of the 5th ACM/IEEE-CS joint conference on digital libraries (JCDL ’05) (pp. 223–224).

  • Chen, Y. , Liu, F. & Manderick, B. (2011). Extract protein-protein interactions from the literature using support vector machines with feature selection. In Biomedical engineering, trends, research and technologies. IntechOpen.

  • Chen, H. (2008). Mapping nanotechnology innovations and knowledge: Global and longitudinal patent and literature analysis (Vol. 20). New York: Springer.

    Google Scholar 

  • Chen, R.-C., Bau, C.-T., & Yeh, C.-J. (2011). Merging domain ontologies based on the wordnet system and fuzzy formal concept analysis techniques. Applied Soft Computing, 11(2), 1908–1923. https://doi.org/10.1016/j.asoc.2010.06.007.

    Article  Google Scholar 

  • Chen, H., Schuffels, C., & Orwig, R. (1996). Internet categorization and search: A self-organizing approach. Journal of Visual Communication and Image Representation, 7(1), 88–102. https://doi.org/10.1006/jvci.1996.0008.

    Article  Google Scholar 

  • Chen, C., & Song, M. (2017). Representing scientific knowledge. New York: Springer.

    Google Scholar 

  • Chen, P., Xie, H., Maslov, S., & Redner, S. (2007). Finding scientific gems with Google’s PageRank algorithm. Journal of Informetrics, 1(1), 8–15.

    Google Scholar 

  • Chiu, W.-T., Huang, J.-S., & Ho, Y.-S. (2004). Bibliometric analysis of severe acute respiratory syndrome-related research in the beginning stage. Scientometrics, 61(1), 69–77.

    Google Scholar 

  • Chowdhury, G. G. (2003). Natural language processing. Annual Review of Information Science and Technology, 37(1), 51–89. https://doi.org/10.1002/aris.1440370103.

    Article  Google Scholar 

  • Chowdhury, M. F. M., Abacha, A. B., Lavelli, A., & Zweigenbaum, P. (2011). Two different machine learning techniques for drug-drug interaction extraction. Challenge Task on Drug-Drug Interaction Extraction, 761, 19–26.

    Google Scholar 

  • Chung, W. & Chen, H. J. F. N. Jr. (2005). A visual framework for knowledge discovery on the web: An empirical study of business intelligence exploration. Journal of Management Information Systems, 21(4), 57–84. https://doi.org/10.1080/07421222.2005.11045821.

    Article  Google Scholar 

  • Chung, W., Zhang, Y., Huang, Z., Wang, G., Ong, T.-H., & Chen, H. (2004). Internet searching and browsing in a multilingual world: An experiment on the Chinese business intelligence portal (CBizPort). Journal of the American Society for Information Science and Technology, 55(9), 818–831. https://doi.org/10.1002/asi.20025.

    Article  Google Scholar 

  • Clarke, A., Gatineau, M., Thorogood, M., & Wyn-Roberts, N. (2007). Health promotion research literature in Europe 1995–2005. European Journal of Public Health, 17(suppl\_1), 24–28.

  • Collins, A. M., & Quillian, M. R. (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 8(2), 240–247. https://doi.org/10.1016/S0022-5371(69)80069-1.

    Article  Google Scholar 

  • Craven, M., DiPasquo, D., Freitag, D., McCallum, A., Mitchell, T., Nigam, K., et al. (2000). Learning to construct knowledge bases from the world wide web. Artificial Intelligence, 118(1–2), 69–113.

    MATH  Google Scholar 

  • Cunningham, H., Maynard, D , Bontcheva, K., & Tablan, V. (2002). GATE: An architecture for development of robust HLT applications. In Proceedings of the 40th annual meeting on association for computational linguistics (pp. 168–175).

  • Daud, A. (2012). Using time topic modeling for semantics-based dynamic research interest finding. Knowledge-Based Systems, 26, 154–163.

    Google Scholar 

  • Davenport, T. H., & Prusak, L. (1998). Working knowledge: How organizations manage what they know. Brighton: Harvard Business Press.

    Google Scholar 

  • Ding, Y. (2011). Topic-based PageRank on author cocitation networks. Journal of the Association for Information Science and Technology, 62(3), 449–466.

    Google Scholar 

  • Ding, W., & Chen, C. (2014). Dynamic topic detection and tracking: A comparison of HDP, C-word, and cocitation methods. Journal of the Association for Information Science and Technology, 65(10), 2084–2097. https://doi.org/10.1002/asi.23134.

    Article  Google Scholar 

  • Ding, Y., Chowdhury, G. G., & Foo, S. (2001). Bibliometric cartography of information retrieval research by using co-word analysis. Information Processing and Management, 37(6), 817–842.

    MATH  Google Scholar 

  • Ding, Y., & Cronin, B. (2011). Popular and/or prestigious? Measures of scholarly esteem. Information Processing and Management, 47(1), 80–96.

    Google Scholar 

  • Dong, X. L., & Srivastava, D. (2015). Knowledge curation and knowledge fusion: Challenges, models and applications. In Proceedings of the 2015 acm sigmod international conference on management of data (pp. 2063–2066). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/2723372.2731083.

  • Dong, X., Gabrilovich, E., Heitz, G., Horn, W., Lao, N., Murphy, K. et al. (2014). Knowledge vault: A web-scale approach to probabilistic knowledge fusion. In Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 601–610).

  • Dong, X. L., Gabrilovich, E., Heitz, G., Horn, W., Murphy, K., Sun, S., et al. (2014). From data fusion to knowledge fusion. In Proceedings of the VLDB endowment (Vol. 7, pp. 881–892). https://doi.org/10.14778/2732951.2732962.

  • Eick, S. C., Steffen, J. L., & Sumner, E. E. (1992). Seesoft-a tool for visualizing line oriented software statistics. IEEE Transactions on Software Engineering, 18(11), 957–968. https://doi.org/10.1109/32.177365.

    Article  Google Scholar 

  • Ennas, G., Biggio, B., & Di Guardo, M. C. (2015). Data-driven journal meta-ranking in business and management. Scientometrics, 105(3), 1911–1929.

    Google Scholar 

  • Eppler, M. J. (2006). A comparison between concept maps, mind maps, conceptual diagrams, and visual metaphors as complementary tools for knowledge construction and sharing. Information Visualization, 5(3), 202–210. https://doi.org/10.1057/palgrave.ivs.9500131.

    Article  Google Scholar 

  • Erhardt, R. A.-A., Schneider, R., & Blaschke, C. (2006). Status of text-mining techniques applied to biomedical text. Drug Discovery Today, 11(7), 315–325. https://doi.org/10.1016/j.drudis.2006.02.011.

    Article  Google Scholar 

  • Feinerer, I., & Hornik, K. (2017). wordnet: Wordnet interface [Computer software manua]. https://CRAN.R-project.org/package=wordnet (R package version 0.1-14).

  • Fellbaum, C. (1998). Wordnet: An electronic lexical database. Cambridge: Bradford Books.

    MATH  Google Scholar 

  • Fiala, D. (2012). Time-aware PageRank for bibliographic networks. Journal of Informetrics, 6(3), 370–388.

    Google Scholar 

  • Finkel, J. R., Grenager, T., & Manning, C. (2005). Incorporating non-local information into information extraction systems by Gibbs sampling. In Proceedings of the 43rd annual meeting on association for computational linguistics (pp. 363–370). Stroudsburg, PA: USAAssociation for Computational Linguistics. https://doi.org/10.3115/1219840.1219885.

  • Friedman, C., Kra, P., Yu, H., Krauthammer, M., & Rzhetsky, A. (2001). Genies: A natural-language processing system for the extraction of molecular pathways from journal articles. Bioinformatics, 17, 74–82.

    Google Scholar 

  • Fu, Y., Bauer, T., Mostafa, J., Palakal, M., & Mukhopadhyay, S. (2002). Concept extraction and association from cancer literature. In Proceedings of the 4th international workshop on web information and data management (pp. 100–103). New York, NY: ACM. https://doi.org/10.1145/584931.584953.

  • Ganter, B., & Wille, R. (2012). Formal concept analysis: mathematical foundations. New York: Springer.

    MATH  Google Scholar 

  • Garfield, E. (1965). Can citation indexing be automated. In Proceedings of the statistical association methods for mechanized documentation symposium (Vol. 269, pp. 189–192).

  • Gartner, R. (2016). Metadata: Shaping knowledge from antiquity to the semantic web. Cham: Springer.

    Google Scholar 

  • Gollapalli, S. D., Mitra, P., & Giles, C. L. (2011). Ranking authors in digital libraries. In Proceedings of the 11th annual international ACM/IEEE joint conference on digital libraries (pp. 251–254).

  • Goodall, A. H. (2009). Highly cited leaders and the performance of research universities. Research Policy, 38(7), 1079–1092. https://doi.org/10.1016/j.respol.2009.04.002.

    Article  Google Scholar 

  • Gray, P. M. D., Preece, A., Fiddian, N. J., Gray, W. A., Bench-Capon, T. J. M., Shave, M. J. R., et al. (1997). KRAFT: Knowledge fusion from distributed databases and knowledge bases. In Proceedings of 8th international conference on the database and expert systems applications (pp. 682–691). https://doi.org/10.1109/DEXA.1997.617411.

  • Grebla, H. A., Cenan, C. O., & Stanca, L. (2010). Knowledge fusion in academic networks. BRAIN. Broad Research in Artificial Intelligence and Neuroscience, 1(2), 111–118.

    Google Scholar 

  • Guerrero-Bote, V. P., & Moya-Anegón, F. (2012). A further step forward in measuring journals’ scientific prestige: The SJR2 indicator. Journal of Informetrics, 6(4), 674–688.

    Google Scholar 

  • Guo, C., Chinchankar, R., & Liu, X. (2012). Knowledge retrieval for scientific literatures. Proceedings of the American Society for Information Science and Technology, 49(1), 1–7. https://doi.org/10.1002/meet.14504901152.

    Article  Google Scholar 

  • Guzmán-Arenas, A., & Cuevas, A.-D. (2010). Knowledge accumulation through automatic merging of ontologies. Expert Systems with Applications, 37(3), 1991–2005.

    Google Scholar 

  • Harrington, B. , & Wojtinnek, P. (2011). Creating a standardized markup language for semantic networks. In Proceedings of the 2011 IEEE fifth international conference on semantic computing (pp. 279–282). https://doi.org/10.1109/ICSC.2011.82.

  • He, Q. (1999). Knowledge discovery through co-word analysis. Library Trends, 48, 133–159.

    Google Scholar 

  • Hirohata, K., Okazaki, N., Ananiadou, S. & Ishizuka, M. (2008). Identifying sections in scientific abstracts using conditional random fields. In Proceedings of the third international joint conference on natural language processing (Vol. 1). https://www.aclweb.org/anthology/I08-1050.

  • Huang, S., & Wan, X. (2013). AKMiner: Domain-specific knowledge graph mining from academic literatures. In Proceedings of the web information systems engineering (wise) (pp. 241–255). Berlin: Springer.

  • Huang, M., Zhu, X., & Li, M. (2006). A hybrid method for relation extraction from biomedical literature. International Journal of Medical Informatics, 75(6), 443–455. https://doi.org/10.1016/j.ijmedinf.2005.06.010.

    Article  Google Scholar 

  • Hu, S., & Cao, Y. (2009). Knowledge fusion framework based on web page texts. Frontiers of Computer Science in China, 3(4), 457.

    Google Scholar 

  • Johnson, B., & Shneiderman, B. (1991). Tree-maps: A space-filling approach to the visualization of hierarchical information structures. In Proceedings of the visualization 1991 conferernce (pp. 284–291). https://doi.org/10.1109/VISUAL.1991.175815.

  • Kajikawa, Y., & Takeda, Y. (2009). Citation network analysis of organic LEDs. Technological Forecasting and Social Change, 76(8), 1115–1123.

    Google Scholar 

  • Kajikawa, Y., Yoshikawa, J., Takeda, Y., & Matsushima, K. (2008). Tracking emerging technologies in energy research: Toward a roadmap for sustainable energy. Technological Forecasting and Social Change, 75(6), 771–782.

    Google Scholar 

  • Kampis, G., & Lukowicz, P. (2015). Collaborative knowledge fusion by ad-hoc information distribution in crowds. Procedia Computer Science, 51, 542–551.

    Google Scholar 

  • Kim, S., Suh, E., & Hwang, H. (2003). Building the knowledge map: An industrial case study. Journal of Knowledge Management, 7(2), 34–45.

    Google Scholar 

  • Kohonen, T. (2012). Self-organization and associative memory (Vol. 8). New York: Springer.

    MATH  Google Scholar 

  • Kohonen, T., Kaski, S., Lagus, K., Salojärvi, J., Honkela, J., Paatero, V., et al. (1999). Self organization of a massive text document collection. In E. Oja & S. Kaski (Eds.), Kohonen maps (pp. 171–182). Amsterdam: Elsevier. https://doi.org/10.1016/B978-044450270-4/50013-9.

    Chapter  Google Scholar 

  • Koike, A., Niwa, Y., & Takagi, T. (2004). Automatic extraction of gene/protein biological functions from biomedical text. Bioinformatics, 21(7), 1227–1236. https://doi.org/10.1093/bioinformatics/bti084.

    Article  Google Scholar 

  • Kostoff, R. N., Briggs, M. B., Solka, J. L., & Rushenberg, R. L. (2008). Literature-related discovery (LRD): Methodology. Technological Forecasting and Social Change, 75(2), 186–202.

    Google Scholar 

  • Kroeze, J. H. , Matthee, M. C. & Bothma, T. J. D. (2003). Differentiating data- and text-mining terminology. In Proceedings of the 2003 annual research conference of the south african institute of computer scientists and information technologists on enablement through technology (pp. 93–101). ZAF: South African Institute for Computer Scientists and Information Technologists.

  • Kuo, T. T. , Tseng, S. S. & Lin, Y. T. (2003). Ontology-based knowledge fusion framework using graph partitioning. In Proceedings of the international conference on industrial, engineering and other applications of applied intelligent systems (pp. 11–20).

  • Laskey, K. B. , Costa, P. C. G. & Janssen, T. (2008). Probabilistic ontologies for knowledge fusion. In Proceedings of the 11th international conference on information fusion (pp. 1–8).

  • Li, X. , Dong, X. L. , Lyons, K. , Meng, W. & Srivastava, D. (2012). Truth finding on the deep web: Is the problem solved? In Proceedings of the vldb endowment (Vol. 6, pp. 97–108).

  • Liakata, M. , Teufel, S. , Siddharthan, A. & Batchelor, C. (2010). Corpora for the conceptualisation and zoning of scientific papers. In LREC 2010, 7th international conference on language resources and evaluation. http://oro.open.ac.uk/58880/.

  • Lin, J. , Karakos, D. , Demner-Fushman, D. & Khudanpur, S. (2006). Generative content models for structural analysis of medical abstracts. In Proceedings of the HLT-NAACL BioNLP workshop on linking natural language and biology (pp. 65–72). USA: Association for Computational Linguistics.

  • Lin, F. R., & Hsueh, C. M. (2006). Knowledge map creation and maintenance for virtual communities of practice. Information Processing and Management, 42(2), 551–568. https://doi.org/10.1016/j.ipm.2005.03.026.

    Article  Google Scholar 

  • Li, L., Ping, J., & Huang, D. (2010). Protein-protein interaction extraction from biomedical literatures based on a combined kernel. Journal of Information and Computational Science, 7(5), 1065–1073.

    Google Scholar 

  • Liu, X., Bollen, J., Nelson, M. L., & Van de Sompel, H. (2005). Co-authorship networks in the digital library research community. Information Processing and Management, 41(6), 1462–1480.

    Google Scholar 

  • Liu, X., & Qin, J. (2014a). An interactive metadata model for structural, descriptive, and referential representation of scholarly output. Journal of the Association for Information Science and Technology, 65(5), 964–983. https://doi.org/10.1002/asi.23007.

    Article  Google Scholar 

  • Liu, X., & Qin, J. (2014b). An interactive metadata model for structural, descriptive, and referential representation of scholarly output. Journal of the Association for Information Science and Technology, 65(5), 964–983. https://doi.org/10.1002/asi.2300710.1002/asi.23007.

    Article  Google Scholar 

  • Liu, Z., Yin, Y., Liu, W., & Dunford, M. (2015). Visualizing the intellectual structure and evolution of innovation systems research: A bibliometric analysis. Scientometrics, 103(1), 135–158.

    Google Scholar 

  • Liu, X., Zhang, L., & Hong, S. (2011). Global biodiversity research during 1900–2009: A bibliometric analysis. Biodiversity and Conservation, 20(4), 807–826.

    Google Scholar 

  • Ma, N., Guan, J., & Zhao, Y. (2008). Bringing PageRank to the citation analysis. Information Processing and Management, 44(2), 800–810.

    Google Scholar 

  • Mane, K. K., & Börner, K. (2004). Mapping topics and topic bursts in PNAS. Proceedings of the National Academy of Sciences, 101, 5287–5290. https://doi.org/10.1073/pnas.0307626100.

    Article  Google Scholar 

  • Manning, C., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. & McClosky, D. (2014). The stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd annual meeting of the association for computational linguistics: System demonstrations (pp. 55–60).

  • Marshall, B., McDonald, D., Chen, H., & Chung, W. (2004). EBizPort: Collecting and analyzing business intelligence information. Journal of the American Society for Information Science and Technology, 55(10), 873–891. https://doi.org/10.1002/asi.20037.

    Article  Google Scholar 

  • Masters, J. (2002). Structured knowledge source integration and its applications to information fusion. In Proceedings of the fifth international conference on information fusion (Vol. 2, pp. 1340–1346). https://doi.org/10.1109/ICIF.2002.1020968.

  • Mausam, Schmitz, M., Bart, R., Soderland, S. & Etzioni, O. (2012). Open language learning for information extraction. In Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning (pp. 523–534). Stroudsburg, PA: Association for Computational Linguistics. http://dl.acm.org/citation.cfm?id=2390948.2391009.

  • McKnight, L., & Srinivasan, P. (2003). Categorization of sentence types in medical abstracts. In AMIA annual symposium proceedings (Vol. 2003, pp. 440–444).

  • Mészáros, T., Barczikay, Z., Bodon, F., Dobrowiecki, T. P. & Strausz, G. (2001). Building an information and knowledge fusion system. In Proceedings of the engineering of intelligent systems (pp. 82–91). Berlin: Springer.

  • Miller, G. A., Beckwith, R., Fellbaum, C., Gross, D., & Miller, K. J. (1990). Introduction to WordNet: An on-line lexical database*. International Journal of Lexicography, 3(4), 235–244. https://doi.org/10.1093/ijl/3.4.235.

    Article  Google Scholar 

  • Mizuta, Y., Korhonen, A., Mullen, T., & Collier, N. (2006). Zone analysis in biology articles as a basis for information extraction. International Journal of Medical Informatics, 75(6), 468–487. https://doi.org/10.1016/j.ijmedinf.2005.06.013.

    Article  Google Scholar 

  • Moed, H. F. (2010). Measuring contextual citation impact of scientific journals. Journal of Informetrics, 4(3), 265–277. https://doi.org/10.1016/j.joi.2010.01.002.

    Article  Google Scholar 

  • Mudrak, B. (2016). AJE annual publishing review: Global data report. https://www.aje.com/dist/docs/International-scholarly-publishing-report-2016.pdf.

  • Nengfu, X., Cungen, C., & Guo, H.Y. (2005). A knowledge fusion model for web information. In Proceedings of the 2005 IEEE/WIC/ACM international conference on web intelligence (WI’05) (pp. 67–72). https://doi.org/10.1109/WI.2005.4.

  • Nengfu, X., Wensheng, W., Xiaorong, Y., & Lihua, J. (2012). Rule-based agricultural knowledge fusion in web information integration. Sensor Letters, 10(1–2), 635–638.

    Google Scholar 

  • Niu, F., Zhang, C., Ré, C., & Shavlik, J. (2012). Elementary: Large-scale knowledge-base construction via machine learning and statistical inference. International Journal on Semantic Web and Information Systems (IJSWIS), 8(3), 42–73.

    Google Scholar 

  • Nuzzolese, A. G., Peroni, S. & Recupero, D. R. (2016). ACM: Article content miner for assessing the quality of scientific output. In Semantic web evaluation challenge (pp. 281–292).

  • Nykl, M., Campr, M., & Ježek, K. (2015). Author ranking based on personalized PageRank. Journal of Informetrics, 9(4), 777–799.

    Google Scholar 

  • Page, L., Brin, S., Motwani, R. & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web (Technical Report No. 1999-66). Stanford InfoLab. http://ilpubs.stanford.edu:8090/422/.

  • Perez-Arriaga, M. O., Estrada, T. & Abad-Mota, S. (2016). TAO: System for table detection and extraction from PDF documents. In Proceedings of the twenty-ninth international flairs conference (pp. 591–596).

  • Preece, A., Hui, K., Gray, A., Marti, P., Bench-Capon, T., Cui, Z., et al. (2001). KRAFT: An agent architecture for knowledge fusion. International Journal of Cooperative Information Systems, 10(01n02), 171–195.

    Google Scholar 

  • Preece, A., Hui, K., Gray, A., Marti, P., Bench-Capon, T., Jones, D., et al. (2000). The KRAFT architecture for knowledge fusion and transformation. Knowledge-Based Systems, 13(2), 113–120. https://doi.org/10.1016/S0950-7051(00)00052-6.

    Article  Google Scholar 

  • Priya, M., & Ch, A. K. (2019). A novel method for merging academic social network ontologies using formal concept analysis and hybrid semantic similarity measure. Library Hi Tech.

  • Quan, Thanh Tho, Hui, Siu Cheung, & Fong, A. C. M. (2006). Automatic fuzzy ontology generation for semantic help-desk support. IEEE Transactions on Industrial Informatics, 2(3), 155–164.

    Google Scholar 

  • Rindflesch, T. C., Tanabe, L., Weinstein, J. N. & Hunter, L. (1999). EDGAR: Extraction of drugs, genes and relations from the biomedical literature. In Biocomputing 2000 (pp. 517–528). World Scientific.

  • Ruta, M., Scioscia, F., Gramegna, F., Ieva, S., Di Sciascio, E., & Perez De Vera, R. (2018). A knowledge fusion approach for context awareness in vehicular networks. IEEE Internet of Things Journal, 5(4), 2407–2419. https://doi.org/10.1109/JIOT.2018.2815009.

    Article  Google Scholar 

  • Sah, M. , & Wade, V. (2011). Automatic mining of cognitive metadata using fuzzy inference. In Proceedings of the 22nd acm conference on hypertext and hypermedia (pp. 37–46). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/1995966.1995975.

  • Santos, E., Jr., Wilkinson, J. T. & Santos, E. E. (2009). Bayesian knowledge fusion. In Proceedings of the twenty-second international flairs conference.

  • Santos, E, Jr., Wilkinson, J. T., & Santos, E. E. (2011). Fusing multiple Bayesian knowledge sources. International Journal of Approximate Reasoning, 52(7), 935–947.

    MathSciNet  Google Scholar 

  • Sawaragi, T., Umemura, J., Katai, O., & Iwai, S. (1996). Fusing multiple data and knowledge sources for signal understanding by genetic algorithm. IEEE Transactions on Industrial Electronics, 43(3), 411–421.

    Google Scholar 

  • Sayyadi, H., & Getoor, L. (2009). Futurerank: Ranking scientific articles by predicting their future PageRank. In Proceedings of the 2009 siam international conference on data mining (pp. 533–544).

  • Shahaf, D., Guestrin, C. & Horvitz, E. (2012). Metro maps of science. In Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1122–1130). New York, NY: ACM. https://doi.org/10.1145/2339530.2339706.

  • Shehata, S. , Karray, F. & Kamel, M. (2007). A concept-based model for enhancing text categorization. In Proceedings of the 13th acm sigkdd international conference on knowledge discovery and data mining (pp. 629–637). New York, NY: ACM. https://doi.org/10.1145/1281192.1281260.

  • Shiffrin, R. M., & Börner, K. (2004). Mapping knowledge domains. Proceedings of the National Academy of Sciences, 101(suppl 1), 5183–5185. https://doi.org/10.1073/pnas.0307852100.

    Article  Google Scholar 

  • Shotton, D. (2009). CiTO, the citation typing ontology, and its use for annotation of reference lists and visualization of citation networks. In Bio-ontologies 2009 special interest group meeting at ISMB.

  • Small, H. (1973). Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the Association for Information Science and Technology, 24(4), 265–269.

    Google Scholar 

  • Small, H., Boyack, K. W., & Klavans, R. (2014). Identifying emerging topics in science and technology. Research Policy, 43(8), 1450–1467. https://doi.org/10.1016/j.respol.2014.02.005.

    Article  Google Scholar 

  • Smart, P. R. , Shadbolt, N. R. , Carr, L. A. & Schraefel, M. C. (2005). Knowledge-based information fusion for improved situational awareness. In Proceedings of the 7th international conference on information fusion (Vol. 2, p. 8). https://doi.org/10.1109/ICIF.2005.1591969.

  • Smirnov, A., Pashkin, M., Chilov, N. & Levashova, T. (2001). Multi-agent architecture for knowledge fusion from distributed sources. In Proceedings of the international workshop of central and eastern Europe on multi-agent systems (pp. 293–302).

  • Smirnov, A., Pashkin, M., Chilov, N., Levashova, T. & Haritatos, F. (2002). A KSNet-Approach to knowledge logistics in distributed environment. In Proceedings of the international conference on human-computer interaction in aeronautics (HCI-AERO 2002) (Vol. 88, p. 93).

  • Smirnov, A., & Levashova, T. (2019). Knowledge fusion patterns: A survey. Information Fusion, 52, 31–40. https://doi.org/10.1016/j.inffus.2018.11.007.

    Article  Google Scholar 

  • Song, M., Yu, H., & Han, W.-S. (2011). Combining active learning and semi-supervised learning techniques to extract protein interaction sentences. BMC Bioinformatics, 12(12), S4. https://doi.org/10.1186/1471-2105-12-S12-S4.

    Article  Google Scholar 

  • Stillings, N. A., Chase, C. H., Feinstein, M. H., & Garfield, J. L. (1995). Cognitive science: An introduction. Cambridge: MIT Press.

    Google Scholar 

  • Takeda, I. R., , Hideaki, T. & Shinichi, H. (2001). Rule induction for concept hierarchy alignment. In Proceedings of the 2nd workshop on ontology learning at the 17th international joint conference on AI (IJCAI).

  • Tan, Z., Liu, C., Mao, Y., Guo, Y. , Shen, J. & Wang, X. (2016). AceMap: A novel approach towards displaying relationship among academic literatures. In Proceedings of the 25th international conference companion on world wide web (pp. 437–442). Republic and Canton of Geneva, Switzerland International World Wide Web Conferences Steering Committee. https://doi.org/10.1145/2872518.2890514.

  • Tang, J., Jin, R. & Zhang, J. (2008). A topic modeling approach and its integration into the random walk framework for academic search. In Proceedings of the 2008 eighth ieee international conference on data mining (pp. 1055–1060). https://doi.org/10.1109/ICDM.2008.71.

  • Tang, J., Zhang, J., Yao, L. , Li, J. , Zhang, L. & Su, Z. (2008). ArnetMiner: Extraction and mining of academic social networks. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 990–998). New York, NY: ACM. https://doi.org/10.1145/1401890.1402008.

  • Tang, J., Zhang, J., Jin, R., Yang, Z., Cai, K., Zhang, L., et al. (2011). Topic level expertise search over heterogeneous networks. Machine Learning, 82(2), 211–237.

    MathSciNet  Google Scholar 

  • Tao, S. , Wang, X. , Huang, W. , Chen, W., Wang, T. & Lei, K. (2017). From citation network to study map: A novel model to reorganize academic literatures. In Proceedings of the 26th international conference on world wide web companion (pp. 1225–1232). International World Wide Web Conferences Steering Committee. https://doi.org/10.1145/3041021.3053059.

  • Teufel, S. , Siddharthan, A. & Batchelor, C. (2009). Towards discipline-independent argumentative zoning: Evidence from chemistry and computational linguistics. In Proceedings of the 2009 conference on empirical methods in natural language processing (Vol. 3, pp. 1493–1502). USA: Association for Computational Linguistics.

  • Teufel, S., & Moens, M. (2002). Summarizing scientific articles: Experiments with relevance and rhetorical status. Computational Linguistics, 28(4), 409–445. https://doi.org/10.1162/089120102762671936.

    Article  Google Scholar 

  • Tho, Q. T., Hui, S. C., Fong, A. C. M., & Cao, Tru Hoang. (2006). Automatic fuzzy ontology generation for semantic web. IEEE Transactions on Knowledge and Data Engineering, 18(6), 842–856.

    Google Scholar 

  • Tian, Y., Wen, C., & Hong, S. (2008). Global scientific production on GIS research by bibliometric analysis from 1997 to 2006. Journal of Informetrics, 2(1), 65–74.

    Google Scholar 

  • Tkaczyk, D., Szostek, P., Dendek, P. J., Fedoryszak, M. & Bolikowski, L. (2014). CERMINE—Automatic extraction of metadata and references from scientific literature. In Proceedings of the 11th IAPR international workshop on document analysis systems (pp. 217–221). https://doi.org/10.1109/DAS.2014.63.

  • Tonkin, E., & Muller, H. L. (2008). Semi automated metadata extraction for preprints archives. In Proceedings of the 8th acm/ieee-cs joint conference on digital libraries (pp. 157–166). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/1378889.1378917.

  • Van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523–538. https://doi.org/10.1007/s11192-009-0146-3.

    Article  Google Scholar 

  • Vega-Riveros, J. F., Marciales-Vivas, G. P., & Martínez-Melo, M. (1998). Concept maps in engineering education: A case study. Global Journal of Engineering Education (GJEE), 21, 253–273.

    Google Scholar 

  • Viedma-Del-Jesus, M. I., Perakakis, P., Muñoz, M. Á., López-Herrera, A. G., & Vila, J. (2011). Sketching the first 45 years of the journal psychophysiology (1964–2008): A co-word-based analysis. Psychophysiology, 48(8), 1029–1036.

    Google Scholar 

  • Walker, D., Xie, H., Yan, K.-K., & Maslov, S. (2007). Ranking scientific publications using a model of network traffic. Journal of Statistical Mechanics: Theory and Experiment, 2007(06), P06010.

    Google Scholar 

  • Waltman, L., van Eck, N. J., van Leeuwen, T. N., & Visser, M. S. (2013). Some modifications to the snip journal impact indicator. Journal of informetrics, 7(2), 272–285.

    Google Scholar 

  • Wang, Y., Tong, Y., & Zeng, M. (2013). Ranking scientific articles by exploiting citations, authors, journals, and time information. In Proceedings of the twenty-seventh AAAI conference on artificial, intelligence.

  • Wang, X., Zheng, X., Zhang, Q., Wang, T., & Shen, D. (2016). Crowdsourcing in its: The state of the work and the networking. IEEE Transactions on Intelligent Transportation Systems, 17(6), 1596–1605.

    Google Scholar 

  • White, H. D., Lin, X., Buzydlowski, J. W., & Chen, C. (2004). User-controlled mapping of significant literatures. Proceedings of the National Academy of Sciences, 101(suppl 1), 5297–5302. https://doi.org/10.1073/pnas.0307630100.

    Article  Google Scholar 

  • Willett, P. (1988). Recent trends in hierarchic document clustering: A critical review. Information Processing and Management, 24(5), 577–597.

    Google Scholar 

  • Woon, W. L. , Henschel, A. & Madnick, S. (2009). A framework for technology forecasting and visualization. In Proceeding of the 2009 international conference on innovations in information technology (IIT) (pp. 155–159). https://doi.org/10.1109/IIT.2009.5413768.

  • Wu, W. , Li, H. , Wang, H. & Zhu, K. Q. (2012). Probase: A probabilistic taxonomy for text understanding. In Proceedings of the 2012 ACM SIGMOD international conference on management of data (pp. 481–492).

  • Xu, H., Martin, E., & Mahidadia, A. (2014). Contents and time sensitive document ranking of scientific literature. Journal of Informetrics, 8(3), 546–561.

    Google Scholar 

  • Yan, E., Ding, Y., & Sugimoto, C. R. (2011). P-Rank: An indicator measuring prestige in heterogeneous scholarly networks. Journal of the Association for Information Science and Technology, 62(3), 467–477.

    Google Scholar 

  • Yang, Z., Lin, H., & Li, Y. (2010). BioPPISVMExtractor: A protein-protein interaction extractor for biomedical literature using svm and rich feature sets. Journal of Biomedical Informatics, 43(1), 88–96. https://doi.org/10.1016/j.jbi.2009.08.013.

    Article  Google Scholar 

  • Ye, C., Liu, D., Chen, N. & Lin, L. (2015). Mapping the topic evolution using citation-topic model and social network analysis. In 2015 12th international conference on fuzzy systems and knowledge discovery (FSKD) (pp. 2648–2653).

  • Yoon, J., & Kim, K. (2012). Trendperceptor: A property-function based technology intelligence system for identifying technology trends from patents. Expert Systems with Applications, 39(3), 2927–2938. https://doi.org/10.1016/j.eswa.2011.08.154.

    Article  Google Scholar 

  • Yu, D., Wang, W., Zhang, S., Zhang, W., & Liu, R. (2017). A multiple-link, mutually reinforced journal-ranking model to measure the prestige of journals. Scientometrics, 111(1), 521–542.

    Google Scholar 

  • Zagzebski, L. (2017). What is knowledge? In J. Greco & E. Sosa (Eds.), The Blackwell guide to epistemology (pp. 92–116). Oxford: Wiley.

    Google Scholar 

  • Zhang, Y., Saberi, M., Wang, M., & Chang, E. (2019). K3S: Knowledge-driven solution support system. In Proceedings of the AAAI conference on artificial intelligence (Vol. 33, pp. 9873–9874).

  • Zhang, Y. (2019). Multi-layered knowledge fusion for mapping and ranking big scholarly data. Sydney: University of New South Wales.

    Google Scholar 

  • Zhang, Y., Saberi, M., & Chang, E. (2018). A semantic-based knowledge fusion model for solution-oriented information network development: A case study in intrusion detection field. Scientometrics, 117(2), 857–886. https://doi.org/10.1007/s11192-018-2904-6.

    Article  Google Scholar 

  • Zhang, Y., Wang, M., Gottwalt, F., Saberi, M., & Chang, E. (2019). Ranking scientific articles based on bibliometric networks with a weighting scheme. Journal of Informetrics, 13(2), 616–634. https://doi.org/10.1016/j.joi.2019.03.013.

    Article  Google Scholar 

  • Zhang, Y., Wang, M., Saberi, M., & Chang, E. (2019). From big scholarly data to solution-oriented knowledge repository. Frontiers in Big Data, 2, 38. https://doi.org/10.3389/fdata.2019.00038.

    Article  Google Scholar 

  • Zhao, X., Jia, Y., Li, A., Jiang, R., & Song, Y. (2020). Multi-source knowledge fusion: A survey. World Wide Web, 23(4), 2567–2592.

    Google Scholar 

  • Zhou, D. , Orshanskiy, S. A. , Zha, H. & Giles, C. L. (2007). Co-ranking authors and documents in a heterogeneous network. In Proceedings of the seventh IEEE international conference on data mining (ICDM 2007) (pp. 739–744). https://doi.org/10.1109/ICDM.2007.57.

  • Zhou, D., & He, Y. (2008). Extracting interactions between proteins from the literature. Journal of Biomedical Informatics, 41(2), 393–407. https://doi.org/10.1016/j.jbi.2007.11.008.

    Article  Google Scholar 

  • Zhu, B., & Chen, H. (2005). Information visualization. Annual Review of Information Science and Technology, 39(1), 139–177. https://doi.org/10.1002/aris.1440390111.

    Article  Google Scholar 

Download references

Author information

Corresponding author

Correspondence to Yu Zhang.

About this article

Cite this article

Zhang, Y., Wang, M., Saberi, M. et al. Knowledge fusion through academic articles: a survey of definitions, techniques, applications and challenges. Scientometrics 125, 2637–2666 (2020). https://doi.org/10.1007/s11192-020-03683-3

  • DOI: https://doi.org/10.1007/s11192-020-03683-3
