Knowledge fusion through academic articles: a survey of definitions, techniques, applications and challenges

Zhang, Yu; Wang, Min; Saberi, Morteza; Chang, Elizabeth

doi:10.1007/s11192-020-03683-3

Published: 28 August 2020

Volume 125, pages 2637–2666, (2020)

Scientometrics

Abstract

The ever-growing volume of academic articles stresses the need for a new generation of knowledge management methods that intelligently reuse academic knowledge and facilitate the development of scientific research. Knowledge fusion (KF) serves as a key element of such methods, and breakthrough progress has taken place in the field of KF. This brings a great opportunity for the academic community to expedite the process of literature review and automatically retrieve the required knowledge from academic publications. A survey that reviews KF studies in terms of the related technologies and applications, and that offers insights for reusing academic knowledge, is therefore needed but is missing from the state-of-the-art literature. Motivated to bridge this gap, this paper conducts a systematic survey of the existing studies on KF and discusses the opportunities and challenges of applying KF through academic articles. To this end, we revisit the definitions of knowledge and KF in the context of academic articles, and summarise the fusion patterns and their usage in existing applications. Furthermore, we review the techniques and applications of KF, especially those that use academic articles as sources of knowledge. Finally, we discuss the challenges and future directions, to help researchers and practitioners deepen their understanding of knowledge fusion and develop versatile functions.

References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., et al. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. https://www.tensorflow.org/ (software available from tensorflow.org).
Amjad, T., Daud, A., Akram, A., & Muhammed, F. (2016). Impact of mutual influence while ranking authors in a co-authorship network. Kuwait Journal of Science, 43(3), 101–109.
Google Scholar
Amjad, T., Ding, Y., Daud, A., Xu, J., & Malic, V. (2015). Topic-based heterogeneous rank. Scientometrics, 104(1), 313–334.
Google Scholar
Andrews, K. (1995). Case study. visualising cyberspace: Information visualisation in the harmony internet browser. In Proceedings of visualization 1995 conference (pp. 97–104). https://doi.org/10.1109/INFVIS.1995.528692.
Baldwin, C., Hughes, J., Hope, T., Jacoby, R., & Ziebland, S. (2003). Ethics and dementia: mapping the literature by bibliometric analysis. International Journal of Geriatric Psychiatry, 18(1), 41–54.
Google Scholar
Bellinger, G., Castro, D. & Mills, A. (2004). Data, information, knowledge, and wisdom. Mental model musings. http://www.systems-thinking.org/dikw/dikw.htm.
Bergstrom, C. T., & West, J. D. (2008). Assessing citations with the Eigenfactor Metrics. Neurology, 71(23), 1850–1851.
Google Scholar
Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with python (1st ed.). Sebastopol: O’Reilly Media Inc.
MATH Google Scholar
Bleiholder, J., & Naumann, F. (2009). Data fusion. ACM Computing Surveys (CSUR), 41(1), 1.
Google Scholar
Bollen, J., Rodriquez, M. A., & Van de Sompel, H. (2006). Journal status. Scientometrics, 69(3), 669–687.
Google Scholar
Börner, K., Chen, C., & Boyack, K. W. (2003). Visualizing knowledge domains. Annual Review of Information Science and Technology, 37(1), 179–255.
Google Scholar
Bui, Q.-C., Nualáin, B. Ó., Boucher, C. A., & Sloot, P. M. (2010). Extracting causal relations on HIV drug resistance from literature. BMC Bioinformatics, 11(1), 1–11. https://doi.org/10.1186/1471-2105-11-101.
Article Google Scholar
Callon, M., Courtial, J. P., & Laville, F. (1991). Co-word analysis as a tool for describing the network of interactions between basic and technological research: The case of polymer chemsitry. Scientometrics, 22(1), 155–205.
Google Scholar
Cambria, E., & White, B. (2014). Jumping NLP curves: A review of natural language processing research. IEEE Computational Intelligence Magazine, 9(2), 48–57.
Google Scholar
Carlson, A., Betteridge, J., Kisiel, B., Settles, B., Hruschka, E. R., Jr. & Mitchell, T. M. (2010). Toward an architecture for never-ending language learning. In Proceedings of the twenty-fourth AAAI conference on artificial intelligence (pp. 1306–1313). AAAI Press. http://dl.acm.org/citation.cfm?id=2898607.2898816.
Carvalho, R. N., Matsumoto, S., Laskey, K. B., Costa, P. C. G., Ladeira, M., & Santos, L. L. (2013). Probabilistic ontology and knowledge fusion for procurement fraud detection in Brazil. Uncertainty reasoning for the semantic web II (pp. 19–40). Berlin: Springer.
Google Scholar
Castano, S., & Ferrara, A. (2002). Knowledge representation and transformation in ontology-based data integration. Transformation for the Semantic Web, 21, 51.
Google Scholar
Chen, J., & Diekema, A. R. (2005). Experimenting with the automatic assignment of educational standards to digital library content. In Proceedings of the 5th ACM/IEEE-CS joint conference on digital libraries (JCDL ’05) (pp. 223–224).
Chen, Y. , Liu, F. & Manderick, B. (2011). Extract protein-protein interactions from the literature using support vector machines with feature selection. In Biomedical engineering, trends, research and technologies. IntechOpen.
Chen, H. (2008). Mapping nanotechnology innovations and knowledge: Global and longitudinal patent and literature analysis (Vol. 20). New York: Springer.
Google Scholar
Chen, R.-C., Bau, C.-T., & Yeh, C.-J. (2011). Merging domain ontologies based on the wordnet system and fuzzy formal concept analysis techniques. Applied Soft Computing, 11(2), 1908–1923. https://doi.org/10.1016/j.asoc.2010.06.007.
Article Google Scholar
Chen, H., Schuffels, C., & Orwig, R. (1996). Internet categorization and search: A self-organizing approach. Journal of Visual Communication and Image Representation, 7(1), 88–102. https://doi.org/10.1006/jvci.1996.0008.
Article Google Scholar
Chen, C., & Song, M. (2017). Representing scientific knowledge. New York: Springer.
Google Scholar
Chen, P., Xie, H., Maslov, S., & Redner, S. (2007). Finding scientific gems with Google’s PageRank algorithm. Journal of Informetrics, 1(1), 8–15.
Google Scholar
Chiu, W.-T., Huang, J.-S., & Ho, Y.-S. (2004). Bibliometric analysis of severe acute respiratory syndrome-related research in the beginning stage. Scientometrics, 61(1), 69–77.
Google Scholar
Chowdhury, G. G. (2003). Natural language processing. Annual Review of Information Science and Technology, 37(1), 51–89. https://doi.org/10.1002/aris.1440370103.
Article Google Scholar
Chowdhury, M. F. M., Abacha, A. B., Lavelli, A., & Zweigenbaum, P. (2011). Two different machine learning techniques for drug-drug interaction extraction. Challenge Task on Drug-Drug Interaction Extraction, 761, 19–26.
Google Scholar
Chung, W. & Chen, H. J. F. N. Jr. (2005). A visual framework for knowledge discovery on the web: An empirical study of business intelligence exploration. Journal of Management Information Systems, 21(4), 57–84. https://doi.org/10.1080/07421222.2005.11045821.
Article Google Scholar
Chung, W., Zhang, Y., Huang, Z., Wang, G., Ong, T.-H., & Chen, H. (2004). Internet searching and browsing in a multilingual world: An experiment on the Chinese business intelligence portal (CBizPort). Journal of the American Society for Information Science and Technology, 55(9), 818–831. https://doi.org/10.1002/asi.20025.
Article Google Scholar
Clarke, A., Gatineau, M., Thorogood, M., & Wyn-Roberts, N. (2007). Health promotion research literature in Europe 1995–2005. European Journal of Public Health, 17(suppl\_1), 24–28.
Collins, A. M., & Quillian, M. R. (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 8(2), 240–247. https://doi.org/10.1016/S0022-5371(69)80069-1.
Article Google Scholar
Craven, M., DiPasquo, D., Freitag, D., McCallum, A., Mitchell, T., Nigam, K., et al. (2000). Learning to construct knowledge bases from the world wide web. Artificial Intelligence, 118(1–2), 69–113.
MATH Google Scholar
Cunningham, H., Maynard, D , Bontcheva, K., & Tablan, V. (2002). GATE: An architecture for development of robust HLT applications. In Proceedings of the 40th annual meeting on association for computational linguistics (pp. 168–175).
Daud, A. (2012). Using time topic modeling for semantics-based dynamic research interest finding. Knowledge-Based Systems, 26, 154–163.
Google Scholar
Davenport, T. H., & Prusak, L. (1998). Working knowledge: How organizations manage what they know. Brighton: Harvard Business Press.
Google Scholar
Ding, Y. (2011). Topic-based PageRank on author cocitation networks. Journal of the Association for Information Science and Technology, 62(3), 449–466.
Google Scholar
Ding, W., & Chen, C. (2014). Dynamic topic detection and tracking: A comparison of HDP, C-word, and cocitation methods. Journal of the Association for Information Science and Technology, 65(10), 2084–2097. https://doi.org/10.1002/asi.23134.
Article Google Scholar
Ding, Y., Chowdhury, G. G., & Foo, S. (2001). Bibliometric cartography of information retrieval research by using co-word analysis. Information Processing and Management, 37(6), 817–842.
MATH Google Scholar
Ding, Y., & Cronin, B. (2011). Popular and/or prestigious? Measures of scholarly esteem. Information Processing and Management, 47(1), 80–96.
Google Scholar
Dong, X. L., & Srivastava, D. (2015). Knowledge curation and knowledge fusion: Challenges, models and applications. In Proceedings of the 2015 acm sigmod international conference on management of data (pp. 2063–2066). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/2723372.2731083.
Dong, X., Gabrilovich, E., Heitz, G., Horn, W., Lao, N., Murphy, K. et al. (2014). Knowledge vault: A web-scale approach to probabilistic knowledge fusion. In Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 601–610).
Dong, X. L., Gabrilovich, E., Heitz, G., Horn, W., Murphy, K., Sun, S., et al. (2014). From data fusion to knowledge fusion. In Proceedings of the VLDB endowment (Vol. 7, pp. 881–892). https://doi.org/10.14778/2732951.2732962.
Eick, S. C., Steffen, J. L., & Sumner, E. E. (1992). Seesoft-a tool for visualizing line oriented software statistics. IEEE Transactions on Software Engineering, 18(11), 957–968. https://doi.org/10.1109/32.177365.
Article Google Scholar
Ennas, G., Biggio, B., & Di Guardo, M. C. (2015). Data-driven journal meta-ranking in business and management. Scientometrics, 105(3), 1911–1929.
Google Scholar
Eppler, M. J. (2006). A comparison between concept maps, mind maps, conceptual diagrams, and visual metaphors as complementary tools for knowledge construction and sharing. Information Visualization, 5(3), 202–210. https://doi.org/10.1057/palgrave.ivs.9500131.
Article Google Scholar
Erhardt, R. A.-A., Schneider, R., & Blaschke, C. (2006). Status of text-mining techniques applied to biomedical text. Drug Discovery Today, 11(7), 315–325. https://doi.org/10.1016/j.drudis.2006.02.011.
Article Google Scholar
Feinerer, I., & Hornik, K. (2017). wordnet: Wordnet interface [Computer software manua]. https://CRAN.R-project.org/package=wordnet (R package version 0.1-14).
Fellbaum, C. (1998). Wordnet: An electronic lexical database. Cambridge: Bradford Books.
MATH Google Scholar
Fiala, D. (2012). Time-aware PageRank for bibliographic networks. Journal of Informetrics, 6(3), 370–388.
Google Scholar
Finkel, J. R., Grenager, T., & Manning, C. (2005). Incorporating non-local information into information extraction systems by Gibbs sampling. In Proceedings of the 43rd annual meeting on association for computational linguistics (pp. 363–370). Stroudsburg, PA: USAAssociation for Computational Linguistics. https://doi.org/10.3115/1219840.1219885.
Friedman, C., Kra, P., Yu, H., Krauthammer, M., & Rzhetsky, A. (2001). Genies: A natural-language processing system for the extraction of molecular pathways from journal articles. Bioinformatics, 17, 74–82.
Google Scholar
Fu, Y., Bauer, T., Mostafa, J., Palakal, M., & Mukhopadhyay, S. (2002). Concept extraction and association from cancer literature. In Proceedings of the 4th international workshop on web information and data management (pp. 100–103). New York, NY: ACM. https://doi.org/10.1145/584931.584953.
Ganter, B., & Wille, R. (2012). Formal concept analysis: mathematical foundations. New York: Springer.
MATH Google Scholar
Garfield, E. (1965). Can citation indexing be automated. In Proceedings of the statistical association methods for mechanized documentation symposium (Vol. 269, pp. 189–192).
Gartner, R. (2016). Metadata: Shaping knowledge from antiquity to the semantic web. Cham: Springer.
Google Scholar
Gollapalli, S. D., Mitra, P., & Giles, C. L. (2011). Ranking authors in digital libraries. In Proceedings of the 11th annual international ACM/IEEE joint conference on digital libraries (pp. 251–254).
Goodall, A. H. (2009). Highly cited leaders and the performance of research universities. Research Policy, 38(7), 1079–1092. https://doi.org/10.1016/j.respol.2009.04.002.
Article Google Scholar
Gray, P. M. D., Preece, A., Fiddian, N. J., Gray, W. A., Bench-Capon, T. J. M., Shave, M. J. R., et al. (1997). KRAFT: Knowledge fusion from distributed databases and knowledge bases. In Proceedings of 8th international conference on the database and expert systems applications (pp. 682–691). https://doi.org/10.1109/DEXA.1997.617411.
Grebla, H. A., Cenan, C. O., & Stanca, L. (2010). Knowledge fusion in academic networks. BRAIN. Broad Research in Artificial Intelligence and Neuroscience, 1(2), 111–118.
Google Scholar
Guerrero-Bote, V. P., & Moya-Anegón, F. (2012). A further step forward in measuring journals’ scientific prestige: The SJR2 indicator. Journal of Informetrics, 6(4), 674–688.
Google Scholar
Guo, C., Chinchankar, R., & Liu, X. (2012). Knowledge retrieval for scientific literatures. Proceedings of the American Society for Information Science and Technology, 49(1), 1–7. https://doi.org/10.1002/meet.14504901152.
Article Google Scholar
Guzmán-Arenas, A., & Cuevas, A.-D. (2010). Knowledge accumulation through automatic merging of ontologies. Expert Systems with Applications, 37(3), 1991–2005.
Google Scholar
Harrington, B. , & Wojtinnek, P. (2011). Creating a standardized markup language for semantic networks. In Proceedings of the 2011 IEEE fifth international conference on semantic computing (pp. 279–282). https://doi.org/10.1109/ICSC.2011.82.
He, Q. (1999). Knowledge discovery through co-word analysis. Library Trends, 48, 133–159.
Google Scholar
Hirohata, K., Okazaki, N., Ananiadou, S. & Ishizuka, M. (2008). Identifying sections in scientific abstracts using conditional random fields. In Proceedings of the third international joint conference on natural language processing (Vol. 1). https://www.aclweb.org/anthology/I08-1050.
Huang, S., & Wan, X. (2013). AKMiner: Domain-specific knowledge graph mining from academic literatures. In Proceedings of the web information systems engineering (wise) (pp. 241–255). Berlin: Springer.
Huang, M., Zhu, X., & Li, M. (2006). A hybrid method for relation extraction from biomedical literature. International Journal of Medical Informatics, 75(6), 443–455. https://doi.org/10.1016/j.ijmedinf.2005.06.010.
Article Google Scholar
Hu, S., & Cao, Y. (2009). Knowledge fusion framework based on web page texts. Frontiers of Computer Science in China, 3(4), 457.
Google Scholar
Johnson, B., & Shneiderman, B. (1991). Tree-maps: A space-filling approach to the visualization of hierarchical information structures. In Proceedings of the visualization 1991 conferernce (pp. 284–291). https://doi.org/10.1109/VISUAL.1991.175815.
Kajikawa, Y., & Takeda, Y. (2009). Citation network analysis of organic LEDs. Technological Forecasting and Social Change, 76(8), 1115–1123.
Google Scholar
Kajikawa, Y., Yoshikawa, J., Takeda, Y., & Matsushima, K. (2008). Tracking emerging technologies in energy research: Toward a roadmap for sustainable energy. Technological Forecasting and Social Change, 75(6), 771–782.
Google Scholar
Kampis, G., & Lukowicz, P. (2015). Collaborative knowledge fusion by ad-hoc information distribution in crowds. Procedia Computer Science, 51, 542–551.
Google Scholar
Kim, S., Suh, E., & Hwang, H. (2003). Building the knowledge map: An industrial case study. Journal of Knowledge Management, 7(2), 34–45.
Google Scholar
Kohonen, T. (2012). Self-organization and associative memory (Vol. 8). New York: Springer.
MATH Google Scholar
Kohonen, T., Kaski, S., Lagus, K., Salojärvi, J., Honkela, J., Paatero, V., et al. (1999). Self organization of a massive text document collection. In E. Oja & S. Kaski (Eds.), Kohonen maps (pp. 171–182). Amsterdam: Elsevier. https://doi.org/10.1016/B978-044450270-4/50013-9.
Chapter Google Scholar
Koike, A., Niwa, Y., & Takagi, T. (2004). Automatic extraction of gene/protein biological functions from biomedical text. Bioinformatics, 21(7), 1227–1236. https://doi.org/10.1093/bioinformatics/bti084.
Article Google Scholar
Kostoff, R. N., Briggs, M. B., Solka, J. L., & Rushenberg, R. L. (2008). Literature-related discovery (LRD): Methodology. Technological Forecasting and Social Change, 75(2), 186–202.
Google Scholar
Kroeze, J. H. , Matthee, M. C. & Bothma, T. J. D. (2003). Differentiating data- and text-mining terminology. In Proceedings of the 2003 annual research conference of the south african institute of computer scientists and information technologists on enablement through technology (pp. 93–101). ZAF: South African Institute for Computer Scientists and Information Technologists.
Kuo, T. T. , Tseng, S. S. & Lin, Y. T. (2003). Ontology-based knowledge fusion framework using graph partitioning. In Proceedings of the international conference on industrial, engineering and other applications of applied intelligent systems (pp. 11–20).
Laskey, K. B. , Costa, P. C. G. & Janssen, T. (2008). Probabilistic ontologies for knowledge fusion. In Proceedings of the 11th international conference on information fusion (pp. 1–8).
Li, X. , Dong, X. L. , Lyons, K. , Meng, W. & Srivastava, D. (2012). Truth finding on the deep web: Is the problem solved? In Proceedings of the vldb endowment (Vol. 6, pp. 97–108).
Liakata, M. , Teufel, S. , Siddharthan, A. & Batchelor, C. (2010). Corpora for the conceptualisation and zoning of scientific papers. In LREC 2010, 7th international conference on language resources and evaluation. http://oro.open.ac.uk/58880/.
Lin, J. , Karakos, D. , Demner-Fushman, D. & Khudanpur, S. (2006). Generative content models for structural analysis of medical abstracts. In Proceedings of the HLT-NAACL BioNLP workshop on linking natural language and biology (pp. 65–72). USA: Association for Computational Linguistics.
Lin, F. R., & Hsueh, C. M. (2006). Knowledge map creation and maintenance for virtual communities of practice. Information Processing and Management, 42(2), 551–568. https://doi.org/10.1016/j.ipm.2005.03.026.
Article Google Scholar
Li, L., Ping, J., & Huang, D. (2010). Protein-protein interaction extraction from biomedical literatures based on a combined kernel. Journal of Information and Computational Science, 7(5), 1065–1073.
Google Scholar
Liu, X., Bollen, J., Nelson, M. L., & Van de Sompel, H. (2005). Co-authorship networks in the digital library research community. Information Processing and Management, 41(6), 1462–1480.
Google Scholar
Liu, X., & Qin, J. (2014a). An interactive metadata model for structural, descriptive, and referential representation of scholarly output. Journal of the Association for Information Science and Technology, 65(5), 964–983. https://doi.org/10.1002/asi.23007.
Article Google Scholar
Liu, X., & Qin, J. (2014b). An interactive metadata model for structural, descriptive, and referential representation of scholarly output. Journal of the Association for Information Science and Technology, 65(5), 964–983. https://doi.org/10.1002/asi.2300710.1002/asi.23007.
Article Google Scholar
Liu, Z., Yin, Y., Liu, W., & Dunford, M. (2015). Visualizing the intellectual structure and evolution of innovation systems research: A bibliometric analysis. Scientometrics, 103(1), 135–158.
Google Scholar
Liu, X., Zhang, L., & Hong, S. (2011). Global biodiversity research during 1900–2009: A bibliometric analysis. Biodiversity and Conservation, 20(4), 807–826.
Google Scholar
Ma, N., Guan, J., & Zhao, Y. (2008). Bringing PageRank to the citation analysis. Information Processing and Management, 44(2), 800–810.
Google Scholar
Mane, K. K., & Börner, K. (2004). Mapping topics and topic bursts in PNAS. Proceedings of the National Academy of Sciences, 101, 5287–5290. https://doi.org/10.1073/pnas.0307626100.
Article Google Scholar
Manning, C., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S. & McClosky, D. (2014). The stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd annual meeting of the association for computational linguistics: System demonstrations (pp. 55–60).
Marshall, B., McDonald, D., Chen, H., & Chung, W. (2004). EBizPort: Collecting and analyzing business intelligence information. Journal of the American Society for Information Science and Technology, 55(10), 873–891. https://doi.org/10.1002/asi.20037.
Article Google Scholar
Masters, J. (2002). Structured knowledge source integration and its applications to information fusion. In Proceedings of the fifth international conference on information fusion (Vol. 2, pp. 1340–1346). https://doi.org/10.1109/ICIF.2002.1020968.
Mausam, Schmitz, M., Bart, R., Soderland, S. & Etzioni, O. (2012). Open language learning for information extraction. In Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning (pp. 523–534). Stroudsburg, PA: Association for Computational Linguistics. http://dl.acm.org/citation.cfm?id=2390948.2391009.
McKnight, L., & Srinivasan, P. (2003). Categorization of sentence types in medical abstracts. In AMIA annual symposium proceedings (Vol. 2003, pp. 440–444).
Mészáros, T., Barczikay, Z., Bodon, F., Dobrowiecki, T. P. & Strausz, G. (2001). Building an information and knowledge fusion system. In Proceedings of the engineering of intelligent systems (pp. 82–91). Berlin: Springer.
Miller, G. A., Beckwith, R., Fellbaum, C., Gross, D., & Miller, K. J. (1990). Introduction to WordNet: An on-line lexical database*. International Journal of Lexicography, 3(4), 235–244. https://doi.org/10.1093/ijl/3.4.235.
Article Google Scholar
Mizuta, Y., Korhonen, A., Mullen, T., & Collier, N. (2006). Zone analysis in biology articles as a basis for information extraction. International Journal of Medical Informatics, 75(6), 468–487. https://doi.org/10.1016/j.ijmedinf.2005.06.013.
Article Google Scholar
Moed, H. F. (2010). Measuring contextual citation impact of scientific journals. Journal of Informetrics, 4(3), 265–277. https://doi.org/10.1016/j.joi.2010.01.002.
Article Google Scholar
Mudrak, B. (2016). AJE annual publishing review: Global data report. https://www.aje.com/dist/docs/International-scholarly-publishing-report-2016.pdf.
Nengfu, X., Cungen, C., & Guo, H.Y. (2005). A knowledge fusion model for web information. In Proceedings of the 2005 IEEE/WIC/ACM international conference on web intelligence (WI’05) (pp. 67–72). https://doi.org/10.1109/WI.2005.4.
Nengfu, X., Wensheng, W., Xiaorong, Y., & Lihua, J. (2012). Rule-based agricultural knowledge fusion in web information integration. Sensor Letters, 10(1–2), 635–638.
Google Scholar
Niu, F., Zhang, C., Ré, C., & Shavlik, J. (2012). Elementary: Large-scale knowledge-base construction via machine learning and statistical inference. International Journal on Semantic Web and Information Systems (IJSWIS), 8(3), 42–73.
Google Scholar
Nuzzolese, A. G., Peroni, S. & Recupero, D. R. (2016). ACM: Article content miner for assessing the quality of scientific output. In Semantic web evaluation challenge (pp. 281–292).
Nykl, M., Campr, M., & Ježek, K. (2015). Author ranking based on personalized PageRank. Journal of Informetrics, 9(4), 777–799.
Google Scholar
Page, L., Brin, S., Motwani, R. & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web (Technical Report No. 1999-66). Stanford InfoLab. http://ilpubs.stanford.edu:8090/422/.
Perez-Arriaga, M. O., Estrada, T. & Abad-Mota, S. (2016). TAO: System for table detection and extraction from PDF documents. In Proceedings of the twenty-ninth international flairs conference (pp. 591–596).
Preece, A., Hui, K., Gray, A., Marti, P., Bench-Capon, T., Cui, Z., et al. (2001). KRAFT: An agent architecture for knowledge fusion. International Journal of Cooperative Information Systems, 10(01n02), 171–195.
Google Scholar
Preece, A., Hui, K., Gray, A., Marti, P., Bench-Capon, T., Jones, D., et al. (2000). The KRAFT architecture for knowledge fusion and transformation. Knowledge-Based Systems, 13(2), 113–120. https://doi.org/10.1016/S0950-7051(00)00052-6.
Article Google Scholar
Priya, M., & Ch, A. K. (2019). A novel method for merging academic social network ontologies using formal concept analysis and hybrid semantic similarity measure. Library Hi Tech.
Quan, Thanh Tho, Hui, Siu Cheung, & Fong, A. C. M. (2006). Automatic fuzzy ontology generation for semantic help-desk support. IEEE Transactions on Industrial Informatics, 2(3), 155–164.
Google Scholar
Rindflesch, T. C., Tanabe, L., Weinstein, J. N. & Hunter, L. (1999). EDGAR: Extraction of drugs, genes and relations from the biomedical literature. In Biocomputing 2000 (pp. 517–528). World Scientific.
Ruta, M., Scioscia, F., Gramegna, F., Ieva, S., Di Sciascio, E., & Perez De Vera, R. (2018). A knowledge fusion approach for context awareness in vehicular networks. IEEE Internet of Things Journal, 5(4), 2407–2419. https://doi.org/10.1109/JIOT.2018.2815009.
Article Google Scholar
Sah, M. , & Wade, V. (2011). Automatic mining of cognitive metadata using fuzzy inference. In Proceedings of the 22nd acm conference on hypertext and hypermedia (pp. 37–46). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/1995966.1995975.
Santos, E., Jr., Wilkinson, J. T. & Santos, E. E. (2009). Bayesian knowledge fusion. In Proceedings of the twenty-second international flairs conference.
Santos, E, Jr., Wilkinson, J. T., & Santos, E. E. (2011). Fusing multiple Bayesian knowledge sources. International Journal of Approximate Reasoning, 52(7), 935–947.
MathSciNet Google Scholar
Sawaragi, T., Umemura, J., Katai, O., & Iwai, S. (1996). Fusing multiple data and knowledge sources for signal understanding by genetic algorithm. IEEE Transactions on Industrial Electronics, 43(3), 411–421.
Google Scholar
Sayyadi, H., & Getoor, L. (2009). Futurerank: Ranking scientific articles by predicting their future PageRank. In Proceedings of the 2009 siam international conference on data mining (pp. 533–544).
Shahaf, D., Guestrin, C. & Horvitz, E. (2012). Metro maps of science. In Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1122–1130). New York, NY: ACM. https://doi.org/10.1145/2339530.2339706.
Shehata, S. , Karray, F. & Kamel, M. (2007). A concept-based model for enhancing text categorization. In Proceedings of the 13th acm sigkdd international conference on knowledge discovery and data mining (pp. 629–637). New York, NY: ACM. https://doi.org/10.1145/1281192.1281260.
Shiffrin, R. M., & Börner, K. (2004). Mapping knowledge domains. Proceedings of the National Academy of Sciences, 101(suppl 1), 5183–5185. https://doi.org/10.1073/pnas.0307852100.
Article Google Scholar
Shotton, D. (2009). CiTO, the citation typing ontology, and its use for annotation of reference lists and visualization of citation networks. In Bio-ontologies 2009 special interest group meeting at ISMB.
Small, H. (1973). Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the Association for Information Science and Technology, 24(4), 265–269.
Google Scholar
Small, H., Boyack, K. W., & Klavans, R. (2014). Identifying emerging topics in science and technology. Research Policy, 43(8), 1450–1467. https://doi.org/10.1016/j.respol.2014.02.005.
Article Google Scholar
Smart, P. R. , Shadbolt, N. R. , Carr, L. A. & Schraefel, M. C. (2005). Knowledge-based information fusion for improved situational awareness. In Proceedings of the 7th international conference on information fusion (Vol. 2, p. 8). https://doi.org/10.1109/ICIF.2005.1591969.
Smirnov, A., Pashkin, M., Chilov, N. & Levashova, T. (2001). Multi-agent architecture for knowledge fusion from distributed sources. In Proceedings of the international workshop of central and eastern Europe on multi-agent systems (pp. 293–302).
Smirnov, A., Pashkin, M., Chilov, N., Levashova, T. & Haritatos, F. (2002). A KSNet-Approach to knowledge logistics in distributed environment. In Proceedings of the international conference on human-computer interaction in aeronautics (HCI-AERO 2002) (Vol. 88, p. 93).
Smirnov, A., & Levashova, T. (2019). Knowledge fusion patterns: A survey. Information Fusion, 52, 31–40. https://doi.org/10.1016/j.inffus.2018.11.007.
Article Google Scholar
Song, M., Yu, H., & Han, W.-S. (2011). Combining active learning and semi-supervised learning techniques to extract protein interaction sentences. BMC Bioinformatics, 12(12), S4. https://doi.org/10.1186/1471-2105-12-S12-S4.
Article Google Scholar
Stillings, N. A., Chase, C. H., Feinstein, M. H., & Garfield, J. L. (1995). Cognitive science: An introduction. Cambridge: MIT Press.
Google Scholar
Takeda, I. R., , Hideaki, T. & Shinichi, H. (2001). Rule induction for concept hierarchy alignment. In Proceedings of the 2nd workshop on ontology learning at the 17th international joint conference on AI (IJCAI).
Tan, Z., Liu, C., Mao, Y., Guo, Y. , Shen, J. & Wang, X. (2016). AceMap: A novel approach towards displaying relationship among academic literatures. In Proceedings of the 25th international conference companion on world wide web (pp. 437–442). Republic and Canton of Geneva, Switzerland International World Wide Web Conferences Steering Committee. https://doi.org/10.1145/2872518.2890514.
Tang, J., Jin, R. & Zhang, J. (2008). A topic modeling approach and its integration into the random walk framework for academic search. In Proceedings of the 2008 eighth ieee international conference on data mining (pp. 1055–1060). https://doi.org/10.1109/ICDM.2008.71.
Tang, J., Zhang, J., Yao, L. , Li, J. , Zhang, L. & Su, Z. (2008). ArnetMiner: Extraction and mining of academic social networks. In Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 990–998). New York, NY: ACM. https://doi.org/10.1145/1401890.1402008.
Tang, J., Zhang, J., Jin, R., Yang, Z., Cai, K., Zhang, L., et al. (2011). Topic level expertise search over heterogeneous networks. Machine Learning, 82(2), 211–237.
Tao, S., Wang, X., Huang, W., Chen, W., Wang, T. & Lei, K. (2017). From citation network to study map: A novel model to reorganize academic literatures. In Proceedings of the 26th international conference on world wide web companion (pp. 1225–1232). International World Wide Web Conferences Steering Committee. https://doi.org/10.1145/3041021.3053059.
Teufel, S., Siddharthan, A. & Batchelor, C. (2009). Towards discipline-independent argumentative zoning: Evidence from chemistry and computational linguistics. In Proceedings of the 2009 conference on empirical methods in natural language processing (Vol. 3, pp. 1493–1502). USA: Association for Computational Linguistics.
Teufel, S., & Moens, M. (2002). Summarizing scientific articles: Experiments with relevance and rhetorical status. Computational Linguistics, 28(4), 409–445. https://doi.org/10.1162/089120102762671936.
Tho, Q. T., Hui, S. C., Fong, A. C. M., & Cao, T. H. (2006). Automatic fuzzy ontology generation for semantic web. IEEE Transactions on Knowledge and Data Engineering, 18(6), 842–856.
Tian, Y., Wen, C., & Hong, S. (2008). Global scientific production on GIS research by bibliometric analysis from 1997 to 2006. Journal of Informetrics, 2(1), 65–74.
Tkaczyk, D., Szostek, P., Dendek, P. J., Fedoryszak, M. & Bolikowski, L. (2014). CERMINE—Automatic extraction of metadata and references from scientific literature. In Proceedings of the 11th IAPR international workshop on document analysis systems (pp. 217–221). https://doi.org/10.1109/DAS.2014.63.
Tonkin, E., & Muller, H. L. (2008). Semi automated metadata extraction for preprints archives. In Proceedings of the 8th ACM/IEEE-CS joint conference on digital libraries (pp. 157–166). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/1378889.1378917.
Van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523–538. https://doi.org/10.1007/s11192-009-0146-3.
Vega-Riveros, J. F., Marciales-Vivas, G. P., & Martínez-Melo, M. (1998). Concept maps in engineering education: A case study. Global Journal of Engineering Education (GJEE), 21, 253–273.
Viedma-Del-Jesus, M. I., Perakakis, P., Muñoz, M. Á., López-Herrera, A. G., & Vila, J. (2011). Sketching the first 45 years of the journal Psychophysiology (1964–2008): A co-word-based analysis. Psychophysiology, 48(8), 1029–1036.
Walker, D., Xie, H., Yan, K.-K., & Maslov, S. (2007). Ranking scientific publications using a model of network traffic. Journal of Statistical Mechanics: Theory and Experiment, 2007(06), P06010.
Waltman, L., van Eck, N. J., van Leeuwen, T. N., & Visser, M. S. (2013). Some modifications to the SNIP journal impact indicator. Journal of Informetrics, 7(2), 272–285.
Wang, Y., Tong, Y., & Zeng, M. (2013). Ranking scientific articles by exploiting citations, authors, journals, and time information. In Proceedings of the twenty-seventh AAAI conference on artificial intelligence.
Wang, X., Zheng, X., Zhang, Q., Wang, T., & Shen, D. (2016). Crowdsourcing in ITS: The state of the work and the networking. IEEE Transactions on Intelligent Transportation Systems, 17(6), 1596–1605.
White, H. D., Lin, X., Buzydlowski, J. W., & Chen, C. (2004). User-controlled mapping of significant literatures. Proceedings of the National Academy of Sciences, 101(suppl 1), 5297–5302. https://doi.org/10.1073/pnas.0307630100.
Willett, P. (1988). Recent trends in hierarchic document clustering: A critical review. Information Processing and Management, 24(5), 577–597.
Woon, W. L., Henschel, A. & Madnick, S. (2009). A framework for technology forecasting and visualization. In Proceedings of the 2009 international conference on innovations in information technology (IIT) (pp. 155–159). https://doi.org/10.1109/IIT.2009.5413768.
Wu, W., Li, H., Wang, H. & Zhu, K. Q. (2012). Probase: A probabilistic taxonomy for text understanding. In Proceedings of the 2012 ACM SIGMOD international conference on management of data (pp. 481–492).
Xu, H., Martin, E., & Mahidadia, A. (2014). Contents and time sensitive document ranking of scientific literature. Journal of Informetrics, 8(3), 546–561.
Yan, E., Ding, Y., & Sugimoto, C. R. (2011). P-Rank: An indicator measuring prestige in heterogeneous scholarly networks. Journal of the Association for Information Science and Technology, 62(3), 467–477.
Yang, Z., Lin, H., & Li, Y. (2010). BioPPISVMExtractor: A protein-protein interaction extractor for biomedical literature using SVM and rich feature sets. Journal of Biomedical Informatics, 43(1), 88–96. https://doi.org/10.1016/j.jbi.2009.08.013.
Ye, C., Liu, D., Chen, N. & Lin, L. (2015). Mapping the topic evolution using citation-topic model and social network analysis. In 2015 12th international conference on fuzzy systems and knowledge discovery (FSKD) (pp. 2648–2653).
Yoon, J., & Kim, K. (2012). Trendperceptor: A property-function based technology intelligence system for identifying technology trends from patents. Expert Systems with Applications, 39(3), 2927–2938. https://doi.org/10.1016/j.eswa.2011.08.154.
Yu, D., Wang, W., Zhang, S., Zhang, W., & Liu, R. (2017). A multiple-link, mutually reinforced journal-ranking model to measure the prestige of journals. Scientometrics, 111(1), 521–542.
Zagzebski, L. (2017). What is knowledge? In J. Greco & E. Sosa (Eds.), The Blackwell guide to epistemology (pp. 92–116). Oxford: Wiley.
Zhang, Y., Saberi, M., Wang, M., & Chang, E. (2019). K3S: Knowledge-driven solution support system. In Proceedings of the AAAI conference on artificial intelligence (Vol. 33, pp. 9873–9874).
Zhang, Y. (2019). Multi-layered knowledge fusion for mapping and ranking big scholarly data. Sydney: University of New South Wales.
Zhang, Y., Saberi, M., & Chang, E. (2018). A semantic-based knowledge fusion model for solution-oriented information network development: A case study in intrusion detection field. Scientometrics, 117(2), 857–886. https://doi.org/10.1007/s11192-018-2904-6.
Zhang, Y., Wang, M., Gottwalt, F., Saberi, M., & Chang, E. (2019). Ranking scientific articles based on bibliometric networks with a weighting scheme. Journal of Informetrics, 13(2), 616–634. https://doi.org/10.1016/j.joi.2019.03.013.
Zhang, Y., Wang, M., Saberi, M., & Chang, E. (2019). From big scholarly data to solution-oriented knowledge repository. Frontiers in Big Data, 2, 38. https://doi.org/10.3389/fdata.2019.00038.
Zhao, X., Jia, Y., Li, A., Jiang, R., & Song, Y. (2020). Multi-source knowledge fusion: A survey. World Wide Web, 23(4), 2567–2592.
Zhou, D., Orshanskiy, S. A., Zha, H. & Giles, C. L. (2007). Co-ranking authors and documents in a heterogeneous network. In Proceedings of the seventh IEEE international conference on data mining (ICDM 2007) (pp. 739–744). https://doi.org/10.1109/ICDM.2007.57.
Zhou, D., & He, Y. (2008). Extracting interactions between proteins from the literature. Journal of Biomedical Informatics, 41(2), 393–407. https://doi.org/10.1016/j.jbi.2007.11.008.
Zhu, B., & Chen, H. (2005). Information visualization. Annual Review of Information Science and Technology, 39(1), 139–177. https://doi.org/10.1002/aris.1440390111.

Author information

Authors and Affiliations

School of Business, UNSW Canberra, Canberra, Australia
Yu Zhang & Elizabeth Chang
School of Engineering and Information Technology, UNSW Canberra, Canberra, Australia
Min Wang
Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, Australia
Morteza Saberi

Corresponding author

Correspondence to Yu Zhang.

About this article

Cite this article

Zhang, Y., Wang, M., Saberi, M. et al. Knowledge fusion through academic articles: a survey of definitions, techniques, applications and challenges. Scientometrics 125, 2637–2666 (2020). https://doi.org/10.1007/s11192-020-03683-3

Received: 26 May 2020
Published: 28 August 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11192-020-03683-3
