Abstract
Open innovation is a new paradigm that companies embrace to drive transformation. It assumes that firms can and should use both external and internal ideas to innovate. The number of commercial and research projects has recently grown exponentially, which raises the open challenge of identifying insights on promising aspects to work on. The existing literature has focused on identifying goals, topics, and keywords in a single piece of text. Insights, however, have no clear structure and cannot be validated against a straightforward ground truth, which makes their identification particularly challenging. Beyond extracting insights from existing initiatives, there is also the issue of how to present them to a company as a ranking. To address these two issues, we present an approach that extracts insights from a large number of projects belonging to distinct domains by analyzing their abstracts. Our method then ranks these insights to support project preparation, presenting the most relevant and most recent ones first. An evaluation on real data covering all Horizon 2020 European projects shows the effectiveness of our approach in a concrete case study.
All authors equally contributed to this research.
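The abstract says the method presents the most relevant and most recent insights first, but gives no scoring formula. As a purely illustrative sketch, and not the authors' method, the Python snippet below ranks insights by a weighted mix of a relevance score and a linear recency score. The `Insight` class, the `rank_insights` function, the `alpha` weight, the `horizon` window, and the sample data are all hypothetical assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insight:
    text: str
    relevance: float  # assumed relevance score in [0, 1]; how it is computed is out of scope
    year: int         # assumed start year of the project the insight came from


def rank_insights(insights, current_year, alpha=0.7, horizon=7):
    """Sort insights by alpha * relevance + (1 - alpha) * recency (illustrative only)."""

    def recency(insight):
        # Linear decay: 1.0 for the current year, 0.0 once `horizon` years old.
        age = max(0, current_year - insight.year)
        return max(0.0, 1.0 - age / horizon)

    def score(insight):
        return alpha * insight.relevance + (1.0 - alpha) * recency(insight)

    return sorted(insights, key=score, reverse=True)


# Hypothetical sample data, not taken from the paper.
insights = [
    Insight("battery recycling", relevance=0.9, year=2014),
    Insight("edge AI for farming", relevance=0.8, year=2020),
    Insight("legacy data formats", relevance=0.3, year=2015),
]

ranked = rank_insights(insights, current_year=2020)
for item in ranked:
    print(item.text)
```

With these assumed weights, a slightly less relevant but much newer insight can outrank an older one; setting `alpha=1.0` reduces the ranking to relevance alone.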
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Malloci, F.M., Penadés, L.P., Boratto, L., Fenu, G. (2020). A Text Mining Approach to Extract and Rank Innovation Insights from Research Projects. In: Huang, Z., Beek, W., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2020. WISE 2020. Lecture Notes in Computer Science, vol 12343. Springer, Cham. https://doi.org/10.1007/978-3-030-62008-0_10
DOI: https://doi.org/10.1007/978-3-030-62008-0_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62007-3
Online ISBN: 978-3-030-62008-0
eBook Packages: Computer Science, Computer Science (R0)