Abstract
Patents are an important form of intellectual property rights and one of the few ways to protect technological inventions. In recent years, however, the number of patents has grown dramatically, and both patent applicants and patent examiners are finding it harder to carry out the due diligence step of the patent registration process. The lack of a quick, accurate way to measure patent similarity has therefore become a significant obstacle to protecting intellectual property. Currently, there are three main ways to measure patent similarity: International Patent Classification (IPC) code analysis, citation analysis, and keyword analysis. None of these approaches fully reflects the semantics of a patent's content. Subject–action–object (SAO) semantic analysis, an emerging methodology, does reflect semantics, but most SAO approaches treat every identified relationship as equally important, which does not necessarily yield an accurate measure of patent similarity. To address this limitation, this article introduces a new indicator, called DWSAO, that reflects the weight of each SAO semantic structure. We also present a semantic analysis framework that incorporates the DWSAO index to find similar patents based on the weight of each SAO structure in a patent. A case study on the similarity of patents in the field of robotics was used to verify the reliability of the method. The results highlight the detailed meanings derived from the method, the accuracy of the outcomes, and the practical significance of the approach.
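To illustrate why weighting SAO structures can change a similarity score, the following is a minimal sketch and not the DWSAO formula itself, which this abstract does not specify. The weights, the example SAO triples, the token-overlap (Jaccard) phrase similarity, and the best-match aggregation are all assumptions chosen for illustration; a real system would likely use a lexical resource or another semantic measure in place of token overlap.

```python
from typing import List, Tuple

SAO = Tuple[str, str, str]            # (subject, action, object)
WeightedSAO = Tuple[SAO, float]       # SAO plus a hypothetical importance weight


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two phrases (stand-in for a semantic measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def sao_similarity(x: SAO, y: SAO) -> float:
    """Mean of the subject, action, and object phrase similarities."""
    return sum(jaccard(p, q) for p, q in zip(x, y)) / 3.0


def patent_similarity(p1: List[WeightedSAO], p2: List[WeightedSAO]) -> float:
    """Weighted average, over the SAOs in p1, of each SAO's best match in p2.

    Note: this aggregation is asymmetric (it uses only p1's weights).
    """
    total_weight = sum(w for _, w in p1)
    if total_weight == 0 or not p2:
        return 0.0
    score = 0.0
    for sao, w in p1:
        best = max(sao_similarity(sao, other) for other, _ in p2)
        score += w * best
    return score / total_weight


# Hypothetical patents: one core technical SAO and one generic SAO each.
patent_a = [(("robot arm", "grips", "workpiece"), 0.8),
            (("device", "comprises", "housing"), 0.2)]
patent_b = [(("robot arm", "grips", "workpiece"), 0.8),
            (("sensor", "detects", "obstacle"), 0.2)]

# The same patent with every SAO treated as equally important.
patent_a_equal = [(sao, 1.0) for sao, _ in patent_a]

weighted = patent_similarity(patent_a, patent_b)
unweighted = patent_similarity(patent_a_equal, patent_b)
print(f"weighted={weighted:.2f} unweighted={unweighted:.2f}")
```

In this toy case the two patents share their heavily weighted SAO, so the weighted score is higher than the equal-weight score, which is diluted by the unmatched generic SAO.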

Acknowledgements
This work is partly supported by the General Program of the National Natural Science Foundation of China (Grant Nos. 71774012, 71673024, 71373019) and the strategic research project of the Development Planning Bureau of the Chinese Academy of Sciences (Grant No. GHJ-ZLZX-2019-42). The findings and observations presented in this paper are those of the authors and do not necessarily reflect the views of the supporters or the sponsors. The authors would like to thank the anonymous reviewers for their constructive input on this paper.
Appendix
No. | Patent number | No. | Patent number | No. | Patent number | No. | Patent number |
---|---|---|---|---|---|---|---|
1 | FR3046259A1 | 38 | WO2016078517A1 | 75 | GB2513912A | 112 | KR1179592B1 |
2 | US20170159199A1 | 39 | WO2016080615A1 | 76 | KR2014120437A | 113 | WO2012086950A2 |
3 | US9672184B1 | 40 | US20160133491A1 | 77 | KR1437778B1 | 114 | KR1151449B1 |
4 | WO2017091066A1 | 41 | KR2016050285A | 78 | FR3002804A1 | 115 | KR1146907B1 |
5 | US20170105592A1 | 42 | CN105487507A | 79 | US20140222197A1 | 116 | WO2012064009A1 |
6 | CN206115269U | 43 | WO2016045593A1 | 80 | GB2509989A | 117 | US20120060320A1 |
7 | US20170102709A1 | 44 | WO2016038919A1 | 81 | GB2509990A | 118 | KR2012001510A |
8 | CN106551659A | 45 | US20160062362A1 | 82 | GB2509991A | 119 | WO2011136974A2 |
9 | US20170086325A1 | 46 | US20160057925A1 | 83 | GB2510062A | 120 | DE102011010205A1 |
10 | US20170075962A1 | 47 | CN105361817A | 84 | PL402468A1 | 121 | DE102010013297A1 |
11 | US20170072568A1 | 48 | US20160039541A1 | 85 | WO2014105225A1 | 122 | US20110236026A1 |
12 | US20170057760A1 | 49 | KR2016008856A | 86 | KR1411200B1 | 123 | US20110238214A1 |
13 | US20170050311A1 | 50 | US20160011592A1 | 87 | US20140135991A1 | 124 | US20110214030A1 |
14 | CN106444736A | 51 | WO2016000622A1 | 88 | US20140122958A1 | 125 | SE201100582A1 |
15 | US20170037648A1 | 52 | US20150356885A1 | 89 | US20140122654A1 | 126 | KR2011056660A |
16 | US20170020064A1 | 53 | KR1569281B1 | 90 | WO2014058106A1 | 127 | KR2011041721A |
17 | US9527217B1 | 54 | US20150314437A1 | 91 | US20140100693A1 | 128 | US20110092847A1 |
18 | US20160363933A1 | 55 | US20150314453A1 | 92 | KR2014036653A | 129 | US20100324736A1 |
19 | WO2016196622A1 | 56 | WO2015166339A1 | 93 | GB2505550A | 130 | US20100324731A1 |
20 | US20160349756A1 | 57 | WO2015150529A1 | 94 | CN103576678A | 131 | US20100292884A1 |
21 | CN205734931U | 58 | SE201500381A1 | 95 | KR2014002841A | 132 | US20100228421A1 |
22 | WO2016179782A1 | 59 | KR2015105089A | 96 | EP2662742A1 | 133 | KR2010092807A |
23 | US20160327959A1 | 60 | WO2015123732A1 | 97 | FR2987689A1 | 134 | KR2010087820A |
24 | US20160325854A1 | 61 | US9114440B1 | 98 | DE102013101700A1 | 135 | US20100145236A1 |
25 | KR2016129515A | 62 | US20150228419A1 | 99 | WO2013100938A1 | 136 | US20100106298A1 |
26 | CN205612410U | 63 | DE102014201203A1 | 100 | US8373391B1 | 137 | KR2010013362A |
27 | KR1660703B1 | 64 | US20150164599A1 | 101 | US20130035793A1 | 138 | KR2010012351A |
28 | WO2016148327A1 | 65 | EP2879010A1 | 102 | US20120323365A1 | 139 | KR2010007776A |
29 | US20160268823A1 | 66 | KR2015053450A | 103 | US20120303190A1 | 140 | SE200802217A |
30 | US20160237587A1 | 67 | WO2015067225A1 | 104 | CN102789232A | 141 | US20090245930A1 |
31 | US20160240405A1 | 68 | EP2870852A1 | 105 | US20120277908A1 | 142 | US20090240370A1 |
32 | US20160236344A1 | 69 | WO2015052588A2 | 106 | KR2012117421A | 143 | PT104217A |
33 | US9411337B1 | 70 | CN104416568A | 107 | US20120265391A1 | 144 | WO2009092166A1 |
34 | JP2016134081A | 71 | US20150063959A1 | 108 | KR2012113188A | 145 | KR2009061461A |
35 | KR2016067351A | 72 | US20140379129A1 | 109 | CN102692922A | 146 | KR2009053263A |
36 | SE201451644A1 | 73 | WO2014201578A2 | 110 | CN102687620A | 147 | KR2009051319A |
37 | US20160143500A1 | 74 | KR1467887B1 | 111 | US20120229433A1 | 148 | US20090125174A1 |
No. | Patent number | No. | Patent number | No. | Patent number | No. | Patent number |
---|---|---|---|---|---|---|---|
149 | US20090117011A1 | 167 | KR2007103248A | 185 | EP1518784A2 | 203 | US6228168B1 |
150 | US20090049640A1 | 168 | JP2007272301A | 186 | US20050010330A1 | 204 | JP2001033357A |
151 | US20080275590A1 | 169 | US20070226949A1 | 187 | US20040210346A1 | 205 | US6178361B1 |
152 | WO2008106088A2 | 170 | KR2007095558A | 188 | US20040204804A1 | 206 | CA2300686A1 |
153 | EP1961358A2 | 171 | KR2007094288A | 189 | EP1435555A2 | 207 | WO2000033355A2 |
154 | KR2008073626A | 172 | US20070205215A1 | 190 | US20040055746A1 | 208 | EP997176A2 |
155 | KR2008073628A | 173 | WO2007089269A2 | 191 | US20040048550A1 | 209 | WO1999065803A1 |
156 | KR2008050278A | 174 | EP1806086A2 | 192 | US6606784B1 | 210 | US5993132A |
157 | US20080071417A1 | 175 | EP1806085A2 | 193 | US20020187024A1 | 211 | WO1999059400A1 |
158 | KR814784B1 | 176 | US20070142972A1 | 194 | EP1264935A2 | 212 | WO1999038237A1 |
159 | US20080062558A1 | 177 | KR702147B1 | 195 | US6443543B1 | 213 | WO1999017263A1 |
160 | US20080056933A1 | 178 | US20060277423A1 | 196 | WO2002055271A1 | 214 | DE19738163A1 |
161 | US20080038152A1 | 179 | US20060232236A1 | 197 | US6402846B1 | 215 | WO1998033103A1 |
162 | WO2008001275A2 | 180 | US20060090320A1 | 198 | WO2002044703A2 | 216 | RD374022A |
163 | KR782863B1 | 181 | US20060013646A1 | 199 | US20020051700A1 | 217 | US5324948A |
164 | KR2007111628A | 182 | GB2415252A | 200 | DE10033680A1 | 218 | CA2054150A1 |
165 | KR2007105477A | 183 | US20050235076A1 | 201 | WO2002005313A2 | 219 | US4792995A |
166 | US20070245511A1 | 184 | WO2005074362A2 | 202 | US6325808B1 | 220 | RD246001A |
Cite this article
Wang, X., Ren, H., Chen, Y. et al. Measuring patent similarity with SAO semantic analysis. Scientometrics 121, 1–23 (2019). https://doi.org/10.1007/s11192-019-03191-z
DOI: https://doi.org/10.1007/s11192-019-03191-z