Abstract
Patents are an important source of information for measuring technological advancement in a specific knowledge domain. The volume of patents available in digital databases has grown rapidly, and taking advantage of existing patent knowledge requires organizing that information in a simple, accessible format. The groups defined by the classification systems that patent offices maintain were given names intended to represent them and to make it easier to search for the information associated with their content. The purpose of this paper is to use automatic text summarization techniques to develop a methodology that helps examiners name new patent groups created by the categorization systems. We used three summarization strategies with two different approaches to choose the most representative sentence for each subgroup. The experiments were performed on both the abstracts and the descriptions of patent documents, in order to evaluate how the proposed methodology performs on different sections of a patent document. Validation experiments were conducted using four subgroups from the United States Patent and Trademark Office, which uses the Cooperative Patent Classification system.
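The idea of choosing the most representative sentence for a subgroup can be illustrated with a minimal sketch. The abstract does not name the three summarization strategies or the two selection approaches, so the code below is an assumption for illustration only: it scores each candidate sentence by the cosine similarity between its TF-IDF vector and the centroid of all sentences in the subgroup. The function name `most_representative_sentence`, the tokenizer, the smoothed IDF formula, and the sample sentences are all hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_representative_sentence(sentences):
    """Return (best_sentence, scores) using TF-IDF similarity to the centroid.

    Assumption: this centroid-based rule is an illustrative stand-in, not
    the selection approach used by the authors.
    """
    counts = [Counter(tokenize(s)) for s in sentences]
    n = len(counts)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for c in counts for term in c)
    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    vectors = [{t: tf * idf[t] for t, tf in c.items()} for c in counts]
    # Centroid: mean TF-IDF weight of each term over all sentences.
    centroid = Counter()
    for v in vectors:
        for t, w in v.items():
            centroid[t] += w / n
    scores = [cosine(v, centroid) for v in vectors]
    best = max(range(n), key=scores.__getitem__)
    return sentences[best], scores


# Hypothetical sentences standing in for patent abstracts of one subgroup.
sentences = [
    "A battery cell with a lithium anode.",
    "A lithium battery cell and a charging circuit for the battery.",
    "A method for brewing coffee.",
]
best, scores = most_representative_sentence(sentences)
print(best)
```

In this toy input the off-topic sentence shares few weighted terms with the rest, so it scores lowest against the centroid. A real pipeline would also need sentence splitting, stop-word handling, and the patent-specific preprocessing that the paper itself describes.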
Acknowledgment
The authors thank the Pontifical Catholic University of Minas Gerais (PUC Minas), the Federal Center for Technological Education of Minas Gerais (CEFET-MG), the National Council for Scientific and Technological Development (CNPq, grant 429144/2016-4) and the Foundation for Research Support of the State of Minas Gerais (FAPEMIG, grant APQ 01454-17) for their financial support.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Souza, C.M., Santos, M.E., Meireles, M.R.G., Almeida, P.E.M. (2019). Using Summarization Techniques on Patent Database Through Computational Intelligence. In: Moura Oliveira, P., Novais, P., Reis, L. (eds) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science, vol 11805. Springer, Cham. https://doi.org/10.1007/978-3-030-30244-3_42
DOI: https://doi.org/10.1007/978-3-030-30244-3_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30243-6
Online ISBN: 978-3-030-30244-3
eBook Packages: Computer Science (R0)