Abstract
With the development of cloud computing technologies, e-business systems and applications pay increasing attention to customer reviews, which cover topics such as commodities and customers' emotions. These review data contain a vast amount of valuable information. Extracting knowledge from them in a cloud environment is challenging because they are massive, usually distributed, and constantly changing. This paper proposes a novel framework for extracting knowledge from Chinese review data; it consists of three main stages: building a knowledge space, retrieving knowledge, and optimizing results. For Chinese reviews, a skip-gram-based model is trained on the review data to generate the knowledge space. To build the knowledge space quickly, an algorithm based on hierarchical softmax is proposed that requires neither feature extraction nor explicit modeling. This algorithm is applicable to massive data and can be conveniently extended in a cloud environment. When retrieving knowledge and optimizing results, the framework uses Euclidean distance to find the knowledge most closely linked to the query and uses a 2-gram algorithm to optimize the results. Experimental results show that the framework is practical and efficient.
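The retrieval step described above, finding the entries of the knowledge space that lie closest to a query under Euclidean distance, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `euclidean` and `nearest_terms` are hypothetical, and the three-dimensional vectors in `toy_space` are hand-made placeholders standing in for embeddings that a skip-gram model would learn from real review data.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_terms(query, space, k=3):
    """Return the k terms in `space` closest to `query`, nearest first.

    `space` maps each term to its embedding vector. The query term
    itself is excluded from the results.
    """
    q = space[query]
    scored = [
        (term, euclidean(q, vec))
        for term, vec in space.items()
        if term != query
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hand-made toy vectors (assumption): real embeddings would come from
# a skip-gram model trained on segmented Chinese reviews.
toy_space = {
    "手机": [0.9, 0.1, 0.0],  # phone
    "屏幕": [0.8, 0.2, 0.1],  # screen
    "电池": [0.7, 0.3, 0.0],  # battery
    "快递": [0.0, 0.9, 0.4],  # delivery
    "满意": [0.1, 0.2, 0.9],  # satisfied
}

if __name__ == "__main__":
    for term, dist in nearest_terms("手机", toy_space, k=2):
        print(term, round(dist, 3))
```

With these placeholder vectors, the terms for "screen" and "battery" rank nearest to "phone". A linear scan like this is adequate only for small vocabularies; a large knowledge space would likely need an index or a distributed search, which the sketch does not attempt.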





Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant No. 61272411.
Cite this article
Zhao, F., Zhu, H., Jin, H. et al. A Skip-gram-based Framework to Extract Knowledge from Chinese Reviews in Cloud Environment. Mobile Netw Appl 20, 363–369 (2015). https://doi.org/10.1007/s11036-015-0612-5
DOI: https://doi.org/10.1007/s11036-015-0612-5