Abstract
Web-based named-entity relationship extraction is an emerging research field with tremendous potential. Its goal is to discover the relationships among a set of real-world entities. It is a challenging problem with wide application value in text mining and related fields. In this paper, we propose a new framework called Snowball++, which builds on traditional entity relationship extraction frameworks. Snowball++ focuses more on many-to-many relations than on one-to-one relations, and the system is implemented in a many-to-many manner, which improves precision and recall. Notably, Snowball++ assigns a specific relation type to each entity-relationship pair, and the whole training process requires only a small amount of manual labor. To build an efficient and scalable system, we implement the Snowball++ framework on the Hadoop platform, a distributed computing system. Experiments show that our framework and implementation are efficient and effective.
Acknowledgement
This work is supported by the National Natural Science Foundation of China (Grant No. 61300137), the Guangdong Natural Science Foundation, China (No. S2013010013836), the Science and Technology Planning Project of Guangdong Province, China (No. 2013B010406004), and the Fundamental Research Funds for the Central Universities, SCUT (No. 2014ZZ0035).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Li, J., Cai, Y., Wang, Q., Hu, S., Wang, T., Min, H. (2015). Entity Relation Mining in Large-Scale Data. In: Liu, A., Ishikawa, Y., Qian, T., Nutanong, S., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9052. Springer, Cham. https://doi.org/10.1007/978-3-319-22324-7_10
DOI: https://doi.org/10.1007/978-3-319-22324-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22323-0
Online ISBN: 978-3-319-22324-7
eBook Packages: Computer Science, Computer Science (R0)