Abstract
Major countries support new knowledge creation and innovation by opening data so that the results of government-funded research can serve as public goods. The Republic of Korea has likewise made research and development (R&D) reports and papers, the outputs of national R&D programs, available to the general public as electronic files. However, the extraction of meaningful information from this unstructured text data has not met researchers’ expectations. To evolve toward a customized service that reflects researchers’ opinions, we investigated the demand for content and services needed at the planning stage of R&D projects. Based on the results of a questionnaire survey and interviews with researchers, this study proposes a method that addresses their extraction and processing demands for R&D reports by offering significant information in units larger than the objects on which trend information is based. The study aims to provide an integrated service, tentatively named ‘element data service’, that combines key sentences and table/figure images, both of which are in high demand among researchers. The proposed method consists of three main steps: subject classification of the R&D report, extraction of table/figure images, and extraction of main sentences. For the experiment, we used public reports of the same classification published from 2012 to 2016, and we applied an extractive summarization method to protect the copyright of the original report text. After implementing a simple prototype, we examined the feasibility of the service through reviews by the target researchers.
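The main-sentence extraction step can be illustrated with a minimal sketch of extractive summarization, in which sentences are selected verbatim from the report rather than rewritten. The abstract does not state how sentences are scored, so the TF-IDF-style weighting below is an assumption, and the function names (`split_sentences`, `tokenize`, `key_sentences`) are hypothetical helpers introduced only for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphanumeric tokens; a real system would use a proper
    # morphological analyzer, especially for Korean text.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def key_sentences(text, k=2):
    """Return the k highest-scoring sentences, unchanged, in document order.

    Scoring is an assumed TF-IDF-style weighting in which each sentence
    is treated as a "document"; it is not the method of the paper.
    """
    sentences = split_sentences(text)
    token_lists = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: number of sentences that contain each term.
    df = Counter(term for tokens in token_lists for term in set(tokens))

    scores = []
    for tokens in token_lists:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        total = sum(count * math.log(n / df[term]) for term, count in tf.items())
        # Normalize by length so long sentences are not favored automatically.
        scores.append(total / len(tokens))

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Because the selected sentences are copied verbatim and returned in their original order, the output can be traced back to the source report, which fits the copyright-related motivation for choosing an extractive approach.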
Acknowledgements
This research was supported by ‘Maximize the Value of National Science and Technology by Strengthen Sharing/Collaboration of National R&D information’ and ‘A Study on Element Data Service of S&T information’ funded by the Korea Institute of Science and Technology Information (KISTI).
About this article
Cite this article
Kim, Y., Joo, W., Choi, K. et al. Study of Methods for Element Data Service for Electronic Documents Related to National R&D Projects. Wireless Pers Commun 98, 3211–3226 (2018). https://doi.org/10.1007/s11277-017-5074-6
DOI: https://doi.org/10.1007/s11277-017-5074-6