ABSTRACT
Patenting is one of the most important ways to protect a company's core business concepts and proprietary technologies. Analyzing large volumes of patent data can uncover potential competitive or collaborative relations among companies in particular areas, which provides valuable information for developing intellectual property (IP), R&D, and marketing strategies. In this paper, we present a novel topic-driven patent analysis and mining system. Instead of merely searching over patent content, we study the heterogeneous patent network derived from the patent database, which comprises several types of objects (companies, inventors, and technical content) that jointly evolve over time. We design and implement a general topic-driven framework for analyzing and mining this heterogeneous patent network. Specifically, we propose a dynamic probabilistic model to characterize the topical evolution of these objects within the patent network. Based on this modeling framework, we derive several patent analytics tools that can be directly used for IP and R&D strategy planning, including a heterogeneous network co-ranking method, a topic-level competitor evolution analysis algorithm, and a method for summarizing search results. We evaluate the proposed methods on a real-world patent database. The experimental results show that the proposed techniques clearly outperform the corresponding baseline methods.
Index Terms
- PatentMiner: topic-driven patent analysis and mining