An SDN architecture for patent prior art search system based on phrase embedding

Automated Software Engineering

Abstract

Software-defined networking (SDN) has attracted considerable attention in academia because it separates the control plane from the data plane to yield a programmable network. In this study, we propose an SDN architecture for an intelligent patent prior art search system. Unlike current mainstream patent retrieval systems, which perform prior art search by traditional keyword matching under a fixed network topology, the proposed SDN-based system can provide systematic and security analysis of patent text when faced with large data flows of patent applications. We also propose a new phrase-based patent text representation model (PPTR), in which the whole patent text is represented as a bag of phrases and then embedded into a vector for prior art search, preserving the integrity of the semantic units of the patent text. Our experiments show that the proposed PPTR model outperforms the traditional prior art search approaches it is compared with, and we expect the SDN architecture to be a promising platform for other patent mining tasks.
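
The idea of representing a patent as a bag of phrases, embedding it into a vector, and searching by similarity can be illustrated with a minimal Python sketch. This is not the PPTR model itself, whose details are not given in the abstract. The sketch assumes that phrases have already been extracted, that a document vector is the count-weighted average of its phrase vectors, and that candidates are ranked by cosine similarity. The phrase vectors, the names `PHRASE_VECTORS`, `embed_bag_of_phrases` and `rank_prior_art`, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy phrase vectors. A real system would learn these from a
# patent corpus; the values here are made up for illustration only.
PHRASE_VECTORS = {
    "control plane": [1.0, 0.0, 0.0],
    "data plane": [0.9, 0.1, 0.0],
    "phrase embedding": [0.0, 1.0, 0.0],
    "prior art search": [0.0, 0.8, 0.2],
    "lithium battery": [0.0, 0.0, 1.0],
}


def embed_bag_of_phrases(phrases):
    """Average the vectors of the known phrases, weighted by their counts.

    Assumption: averaging is used as the simplest way to turn a bag of
    phrases into one vector; the paper may combine phrases differently.
    """
    counts = Counter(p for p in phrases if p in PHRASE_VECTORS)
    total = sum(counts.values())
    dim = len(next(iter(PHRASE_VECTORS.values())))
    vec = [0.0] * dim
    if total == 0:
        return vec  # no known phrases: return the zero vector
    for phrase, n in counts.items():
        for i, x in enumerate(PHRASE_VECTORS[phrase]):
            vec[i] += n * x / total
    return vec


def cosine(a, b):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_prior_art(query_phrases, corpus):
    """Return (doc_id, similarity) pairs sorted from most to least similar."""
    query_vec = embed_bag_of_phrases(query_phrases)
    scored = [
        (doc_id, cosine(query_vec, embed_bag_of_phrases(phrases)))
        for doc_id, phrases in corpus.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical corpus: document id -> phrases already extracted from its text.
corpus = {
    "A": ["control plane", "data plane"],
    "B": ["lithium battery"],
    "C": ["phrase embedding", "prior art search"],
}
ranking = rank_prior_art(["prior art search", "phrase embedding"], corpus)
```

Keeping multi-word phrases such as "prior art search" as single units, rather than splitting them into individual words, is what the abstract means by maintaining the integrity of semantic units. In this toy setup the query shares its whole bag of phrases with document C, so C ranks first.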

Acknowledgements

This work was supported by the Public Technology Research Project of Zhejiang Province (No. LGF19G010001) and the Technology Research Special Project of the Ministry of Public Security (No. 2019GABJC36).

Author information

Corresponding author

Correspondence to Boting Geng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Geng, B., Wang, F. An SDN architecture for patent prior art search system based on phrase embedding. Autom Softw Eng 29, 58 (2022). https://doi.org/10.1007/s10515-022-00360-y

  • DOI: https://doi.org/10.1007/s10515-022-00360-y
