Abstract
Analyzing large volumes of complex scientific information such as patents requires new methods and a flexible, highly interactive, easy-to-use platform that supports applications ranging from information search and semantic analysis to specific text and data mining tasks for information professionals in industry and research. In this paper, we present a scalable patent analytics framework built on top of a big-data architecture and a scientific workflow system. The framework seamlessly integrates essential patent analysis services that employ natural language processing and machine learning algorithms to deeply structure and semantically annotate patent texts, making it possible to realize complex scientific workflows. In two case studies, we show how the framework can be used to query, annotate, and analyze large amounts of patent data.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Sofean, M., Aras, H., Alrifai, A. (2018). A Workflow-Based Large-Scale Patent Mining and Analytics Framework. In: Damaševičius, R., Vasiljevienė, G. (eds) Information and Software Technologies. ICIST 2018. Communications in Computer and Information Science, vol 920. Springer, Cham. https://doi.org/10.1007/978-3-319-99972-2_17
DOI: https://doi.org/10.1007/978-3-319-99972-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99971-5
Online ISBN: 978-3-319-99972-2
eBook Packages: Computer Science, Computer Science (R0)