ABSTRACT
Detecting and monitoring competitors is fundamental for a company to stay ahead in the global market. Existing studies mainly focus on mining competitive relationships within a single data source, while competitive information is usually distributed across multiple networks. Discovering the underlying patterns and using this heterogeneous knowledge to avoid a biased view is a challenging problem. In this paper, we study the problem of mining competitive relationships by learning across heterogeneous networks. We use Twitter and patent records as our data sources and statistically study the patterns behind competitive relationships. We find that the two networks exhibit different but complementary patterns of competition. Our proposed model, the Topical Factor Graph Model (TFGM), defines a latent topic layer to bridge the two networks and learns a semi-supervised model to classify the relationships between entities (e.g., companies or products). We test the proposed model on two real data sets, and the experimental results validate its effectiveness, with an average improvement of 46% over alternative methods.
Index Terms
- Mining competitive relationships by learning across heterogeneous networks