Bootstrapping Yahoo! Finance by Wikipedia for Competitor Mining

Ruan, Tong; Xue, Lijuan; Wang, Haofen; Pan, Jeff Z.

doi:10.1007/978-3-319-31676-5_8

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9544)

Included in the following conference series:

Joint International Semantic Technology Conference

Abstract

Competitive intelligence, a key factor in enterprise risk management and decision support, depends on knowledge bases that contain a large amount of competitive information. A variety of finance websites have collected competitive information manually and can serve as such knowledge bases; Yahoo! Finance is one of the largest and most successful among them. However, these sites suffer from incompleteness, missing competitive domains, and untimely updates. Wikipedia, which was built with collective wisdom and contains plenty of useful information in various forms, can address these problems and thus help build a more comprehensive knowledge base. In this paper, we propose a novel semi-supervised approach, based on a multi-strategy learning algorithm, to identify competitor information and competitive domains from Wikipedia. More precisely, we use seeds of competition between companies and competition between products to distantly supervise the learning of text patterns in free text. Because competitive information can also be inferred from events, we design a learning-based method to identify event description sentences. The whole process is performed iteratively. The experimental results show the effectiveness of our approach. Moreover, the results extracted from Wikipedia supplement Yahoo! Finance with 14,000 competitor pairs and 8,000 competitive domains between rival companies.
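The iterative, seed-driven idea in the abstract can be illustrated with a small sketch. This is not the authors' multi-strategy algorithm: it is a toy bootstrapping loop that assumes exact-string patterns (the text between two company mentions), a fixed list of company names, and a hand-written corpus. The names `SENTENCES`, `COMPANIES`, `learn_patterns`, `apply_patterns`, and `bootstrap` are hypothetical and chosen only for illustration; the event-sentence classifier and competitive-domain extraction are not modelled.

```python
import re
from collections import Counter

# Toy corpus and company list: illustrative assumptions, not data from the paper.
SENTENCES = [
    "Pepsi competes with Coca-Cola in the soft drink market.",
    "Boeing competes with Airbus in the commercial aircraft market.",
    "Boeing is a major rival of Airbus.",
    "Intel is a major rival of AMD.",
    "Google acquired YouTube in 2006.",
]
COMPANIES = ["Pepsi", "Coca-Cola", "Boeing", "Airbus",
             "Intel", "AMD", "Google", "YouTube"]


def adjacent_mentions(sentence, companies):
    """Yield (left, middle_text, right) for consecutive company mentions."""
    names = sorted(companies, key=len, reverse=True)  # prefer longer names
    regex = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    found = [(m.group(1), m.start(), m.end()) for m in regex.finditer(sentence)]
    for (a, _, a_end), (b, b_start, _) in zip(found, found[1:]):
        yield a, sentence[a_end:b_start].strip(), b


def learn_patterns(sentences, known_pairs, companies):
    """Count the text between mentions of pairs already known to compete."""
    counts = Counter()
    for sentence in sentences:
        for a, middle, b in adjacent_mentions(sentence, companies):
            if middle and frozenset((a, b)) in known_pairs:
                counts[middle] += 1
    return counts


def apply_patterns(sentences, patterns, companies):
    """Return every company pair connected by one of the learned patterns."""
    pairs = set()
    for sentence in sentences:
        for a, middle, b in adjacent_mentions(sentence, companies):
            if a != b and middle in patterns:
                pairs.add(frozenset((a, b)))
    return pairs


def bootstrap(sentences, seeds, companies, min_support=1, max_rounds=10):
    """Alternate pattern learning and pair extraction until nothing new appears."""
    known = {frozenset(pair) for pair in seeds}
    for _ in range(max_rounds):
        counts = learn_patterns(sentences, known, companies)
        patterns = {p for p, c in counts.items() if c >= min_support}
        new_pairs = apply_patterns(sentences, patterns, companies) - known
        if not new_pairs:
            break
        known |= new_pairs
    return known


if __name__ == "__main__":
    for pair in bootstrap(SENTENCES, [("Pepsi", "Coca-Cola")], COMPANIES):
        print(sorted(pair))
```

Starting from the single seed pair, the first round learns the pattern "competes with" and finds Boeing and Airbus; that new pair then yields the pattern "is a major rival of" in the second round, which in turn finds Intel and AMD. A realistic system would need generalized patterns, confidence scoring, and safeguards against semantic drift, none of which this sketch attempts.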

This work was partially supported by the Fundamental Research Funds for the Central Universities (Grant No: 22A201514045) and the project funded by the China Postdoctoral Science Foundation (Project No: 137763).

Author information

Authors and Affiliations

East China University of Science and Technology, Shanghai, 200237, China
Tong Ruan, Lijuan Xue & Haofen Wang
The University of Aberdeen, Aberdeen, Scotland
Jeff Z. Pan

Corresponding author

Correspondence to Tong Ruan.

Editor information

Editors and Affiliations

Southeast University, Nanjing, China
Guilin Qi
Osaka University, Ibaraki, Japan
Kouji Kozaki
The University of Aberdeen, Aberdeen, United Kingdom
Jeff Z. Pan
Zhongnan Hospital of Wuhan University, Wuhan, China
Siwei Yu

About this paper

Cite this paper

Ruan, T., Xue, L., Wang, H., Pan, J.Z. (2016). Bootstrapping Yahoo! Finance by Wikipedia for Competitor Mining. In: Qi, G., Kozaki, K., Pan, J., Yu, S. (eds) Semantic Technology. JIST 2015. Lecture Notes in Computer Science, vol 9544. Springer, Cham. https://doi.org/10.1007/978-3-319-31676-5_8

DOI: https://doi.org/10.1007/978-3-319-31676-5_8
Published: 20 March 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31675-8
Online ISBN: 978-3-319-31676-5
eBook Packages: Computer Science, Computer Science (R0)