Abstract
Quantifying knowledge and technology has long been an obstacle to measuring cognitive proximity. This paper proposes a method that measures cognitive proximity by mining patent description text with the latent Dirichlet allocation (LDA) topic model. Using the patent-topic distribution obtained from the LDA model, we measure cognitive proximity between enterprises or within cities, which compensates for the shortcomings of existing measures that rely on the rigid IPC scheme, industry classification systems, or non-standardized interview data. Our empirical study of the ICT industry indicates that the 20 topics obtained from the topic model correspond well to the technologies involved in the industry's leading products and services. We also extract knowledge and technology information from the patent text to depict the technology landscape, including changes in technology topics over time, differences in topic distribution across cities, and the development trend of the urban innovation network. The method's effectiveness is further supported by a model that compares different measurement methods in revealing the relationship between cognitive proximity and patent productivity. Finally, researchers can use this approach to examine urban innovation issues in greater depth, and policymakers can use it to identify directions for further innovation.
Data availability
The patent data and Statistical Yearbook data used in this paper are public and can be obtained from the official websites. Individual data from China's Economic Census are not public.
Code availability
The semantic analysis code contains technical processing details and cannot be disclosed at this time.
Funding
The research leading to these results received funding from the National Natural Science Foundation of China under Grant Agreement No. 41971157.
Author information
Contributions
All authors contributed to the study conception and design. Yawen Qin performed data collection and analysis. Yawen Qin wrote the first draft of the manuscript, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of Interests
The authors declare that they have no commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Appendix
Topics and featured words
The output of the topic model includes the mapping between topics and feature words. Because each topic generally contains many feature words, the weight of each individual feature word is relatively low. Table 3 lists the five feature words with the highest weight for each topic. A topic can be named according to its combination of feature words. For example, the feature words in topic0 are all related to battery technology, so we label topic0 as battery.
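The ranking step described above can be sketched as follows. This is a minimal illustration, not the authors' code (which is not disclosed): the function name top_feature_words, the toy vocabulary, and the weight values are all hypothetical, and the sketch assumes that the topic model exposes a topic-by-word weight matrix whose rows can be normalized to sum to 1.

```python
def top_feature_words(topic_word, vocab, k=5):
    """Return the k highest-weight (word, weight) pairs for each topic.

    topic_word: list of rows, one per topic, holding non-negative word weights.
    vocab: list of words aligned with the columns of topic_word.
    Weights are normalized within each topic so that they sum to 1.
    """
    result = []
    for row in topic_word:
        total = sum(row)
        # Sort words by weight, highest first (ties keep vocabulary order).
        ranked = sorted(zip(vocab, row), key=lambda pair: pair[1], reverse=True)
        result.append([(word, weight / total) for word, weight in ranked[:k]])
    return result


# Hypothetical toy data: two topics over a six-word vocabulary.
vocab = ["battery", "electrode", "lithium", "antenna", "signal", "frequency"]
topic_word = [
    [9.0, 6.0, 4.0, 0.5, 0.3, 0.2],  # weights concentrated on battery terms
    [0.2, 0.3, 0.5, 8.0, 7.0, 4.0],  # weights concentrated on radio terms
]

top_words = top_feature_words(topic_word, vocab, k=3)
for topic_id, words in enumerate(top_words):
    print(f"topic{topic_id}:", ", ".join(f"{w} ({p:.2f})" for w, p in words))
```

In this toy case the first topic would be labeled battery because its top-ranked words all concern battery technology; with a real vocabulary of many thousands of words, each normalized weight would be far smaller, as noted above.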
Diversity and specialization of cities' innovation topics
The normalized information entropy of each city over the 20 topics is calculated from the city-topic distribution. Entropy reflects how diverse a city's innovation is: the higher the entropy, the more diverse the city's innovation; the lower the entropy, the more specialized it is. Figure 10 shows the relationship between entropy and patent output for all cities with patents in the ICT industry. However, most cities have not formed ICT industry clusters, so this paper focuses on 26 cities with relatively continuous and high patent output; the results are shown in Fig. 11.
Among the 26 cities, Nantong, Ningbo, and other cities in the Yangtze River Delta are the most specialized. In contrast, Shenzhen, Beijing, and other cities have the most diversified ICT industry innovation.
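The entropy calculation used in this section can be illustrated with a short sketch. The paper does not spell out the normalization, so this example assumes the common convention of dividing Shannon entropy by the logarithm of the number of topics, which bounds the value between 0 (one topic only) and 1 (all topics equally represented). The function name normalized_entropy and the two city profiles are hypothetical.

```python
import math


def normalized_entropy(topic_shares):
    """Shannon entropy of a city-topic distribution, scaled to [0, 1].

    topic_shares: non-negative weights of one city over K topics.
    Assumption: normalization divides by ln(K), the maximum possible entropy.
    """
    total = sum(topic_shares)
    if total <= 0:
        raise ValueError("topic_shares must contain a positive weight")
    probs = [s / total for s in topic_shares]
    k = len(probs)
    if k < 2:
        return 0.0
    # Zero-probability topics contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(k)


# Hypothetical city profiles over 20 topics.
diversified_city = [1.0] * 20             # patents spread evenly over topics
specialized_city = [0.81] + [0.01] * 19   # patents concentrated in one topic

print(round(normalized_entropy(diversified_city), 3))
print(round(normalized_entropy(specialized_city), 3))
```

Under this convention a city such as the evenly spread profile scores close to 1 and would be read as diversified, while the concentrated profile scores much lower and would be read as specialized.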
About this article
Cite this article
Qin, Y., Qin, X., Chen, H. et al. Measuring cognitive proximity using semantic analysis: A case study of China's ICT industry. Scientometrics 126, 6059–6084 (2021). https://doi.org/10.1007/s11192-021-04021-x
DOI: https://doi.org/10.1007/s11192-021-04021-x