Study of Methods for Element Data Service for Electronic Documents Related to National R&D Projects

Wireless Personal Communications

Abstract

Major countries support knowledge creation and innovation by opening data so that the results of government-funded research can serve as public goods. The Republic of Korea has likewise made research and development (R&D) reports and papers, the outputs of national R&D programs, available to the general public in electronic file format. However, the extraction of meaningful information from unstructured text data has not met researchers’ expectations. To move toward a customized service that reflects researchers’ opinions, we investigated the demand for content and services needed at the planning stage of R&D projects. Based on a questionnaire survey and interviews with researchers, this study proposes a method, focused on R&D reports, for offering significant information in units larger than the objects on which trend information is based, reflecting the extraction and processing demands that researchers expressed. The aim is an integrated service, tentatively named ‘element data service’, that provides the key sentences and table/figure images researchers most want to use. The proposed method consists of three main steps: subject classification of the R&D report, extraction of table/figure images, and extraction of key sentences. For the experiment, we used public reports of the same classification published from 2012 to 2016, and we applied extractive summarization to protect the copyright of the original report text. After building a simple prototype, we assessed the feasibility of the service through reviews by researchers.
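
The abstract does not say how key sentences are scored, so the following is only a minimal sketch of extractive key-sentence selection, assuming a TF-IDF style score computed per sentence. The function names (`split_sentences`, `tokenize`, `extract_key_sentences`) and the parameter `k` are hypothetical and are not taken from the article. What the sketch does illustrate is the property the abstract relies on: an extractive method returns sentences verbatim from the report instead of generating new text.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphanumeric tokens (assumption: English-like text)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def extract_key_sentences(text, k=2):
    """Return the k highest-scoring sentences, kept in original order."""
    sentences = split_sentences(text)
    token_lists = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: number of sentences that contain each term.
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))

    scores = []
    for tokens in token_lists:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Sum of tf * idf over the sentence's terms, divided by its length
        # so that long sentences are not favored automatically.
        total = sum(c * math.log(n / df[t]) for t, c in tf.items())
        scores.append(total / len(tokens))

    # Pick the k best indices, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = (
        "Reports are public. Reports are public. "
        "Extractive summarization selects existing sentences verbatim."
    )
    for sentence in extract_key_sentences(sample, k=1):
        print(sentence)
```

Under these assumptions, sentences whose terms are rare across the document score higher than sentences that repeat common terms. A production system for Korean R&D reports would need a language-appropriate tokenizer and sentence splitter, which this sketch does not attempt.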

Acknowledgements

This research was supported by ‘Maximize the Value of National Science and Technology by Strengthen Sharing/Collaboration of National R&D information’ and ‘A Study on Element Data Service of S&T information’ funded by the Korea Institute of Science and Technology Information (KISTI).

Author information

Corresponding author

Correspondence to MyungSeok Yang.

About this article

Cite this article

Kim, Y., Joo, W., Choi, K. et al. Study of Methods for Element Data Service for Electronic Documents Related to National R&D Projects. Wireless Pers Commun 98, 3211–3226 (2018). https://doi.org/10.1007/s11277-017-5074-6

  • DOI: https://doi.org/10.1007/s11277-017-5074-6
