A Workflow-Based Large-Scale Patent Mining and Analytics Framework

Sofean, Mustafa; Aras, Hidir; Alrifai, Ahmad

doi:10.1007/978-3-319-99972-2_17

Conference paper
First Online: 29 August 2018

Part of the book series: Communications in Computer and Information Science (CCIS, volume 920)

Abstract

Analyzing large volumes of complex scientific information such as patents requires new methods and a flexible, highly interactive, easy-to-use platform that supports applications ranging from information search and semantic analysis to specific text and data mining tasks for information professionals in industry and research. In this paper, we present a scalable patent analytics framework built on top of a big-data architecture and a scientific workflow system. The framework seamlessly integrates essential patent analysis services, which employ natural language processing and machine learning algorithms to deeply structure and semantically annotate patent texts, so that complex scientific workflows can be realized. In two case studies, we show how the framework can be used to query, annotate, and analyze large amounts of patent data.
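To make the workflow idea concrete, the following is a minimal Python sketch of chaining query, annotation, and analysis steps over a small in-memory corpus. It is not the authors' implementation: the names `Patent`, `query`, `annotate_terms`, `run_workflow`, and `term_counts` are hypothetical, and simple substring matching stands in for the natural language processing and machine learning services the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Patent:
    """Hypothetical minimal patent record (not the paper's data model)."""
    pub_id: str
    text: str
    annotations: Dict[str, List[str]] = field(default_factory=dict)


# A workflow step maps a list of patents to a list of patents.
Step = Callable[[List[Patent]], List[Patent]]


def query(keyword: str) -> Step:
    """Build a step that keeps patents whose text contains the keyword."""
    def run(patents: List[Patent]) -> List[Patent]:
        return [p for p in patents if keyword.lower() in p.text.lower()]
    return run


def annotate_terms(vocabulary: List[str]) -> Step:
    """Build a step that tags each patent with the vocabulary terms it mentions.

    Substring matching is an assumption made for illustration only.
    """
    def run(patents: List[Patent]) -> List[Patent]:
        for p in patents:
            lowered = p.text.lower()
            p.annotations["terms"] = [t for t in vocabulary if t.lower() in lowered]
        return patents
    return run


def run_workflow(patents: List[Patent], steps: List[Step]) -> List[Patent]:
    """Apply the steps in order, feeding each step's output to the next."""
    for step in steps:
        patents = step(patents)
    return patents


def term_counts(patents: List[Patent]) -> Dict[str, int]:
    """Analysis step: count how many patents carry each annotated term."""
    counts: Dict[str, int] = {}
    for p in patents:
        for term in p.annotations.get("terms", []):
            counts[term] = counts.get(term, 0) + 1
    return counts
```

Modelling each step as a function from a list of patents to a list of patents keeps the steps interchangeable, which is the property a workflow system relies on when services are recombined into new analyses.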

Author information

Authors and Affiliations

FIZ Karlsruhe, Hermann-von-Helmholtz-Platz 1, 76344, Eggenstein-Leopoldshafen, Germany
Mustafa Sofean, Hidir Aras & Ahmad Alrifai

Corresponding authors

Correspondence to Mustafa Sofean, Hidir Aras or Ahmad Alrifai.

Editor information

Editors and Affiliations

Kaunas University of Technology, Kaunas, Lithuania
Robertas Damaševičius
Kaunas University of Technology, Kaunas, Lithuania
Giedrė Vasiljevienė

About this paper

Cite this paper

Sofean, M., Aras, H., Alrifai, A. (2018). A Workflow-Based Large-Scale Patent Mining and Analytics Framework. In: Damaševičius, R., Vasiljevienė, G. (eds) Information and Software Technologies. ICIST 2018. Communications in Computer and Information Science, vol 920. Springer, Cham. https://doi.org/10.1007/978-3-319-99972-2_17

DOI: https://doi.org/10.1007/978-3-319-99972-2_17
Published: 29 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99971-5
Online ISBN: 978-3-319-99972-2
eBook Packages: Computer Science, Computer Science (R0)
