Abstract
In the age of Big Data, input determines output. The internet holds a large amount of data but little knowledge, so researchers have developed various methods to extract knowledge automatically from different data platforms. Traditional supervised learning methods cost considerable time and labor, and they are gradually being replaced by semi-supervised and unsupervised learning methods. In this paper we propose a low-cost semi-supervised method for this task based on the TSVM (Transductive Support Vector Machine). To improve the accuracy and the level of intelligence, we also add word embeddings to the semi-supervised method, and the AP (Affinity Propagation) algorithm is used to cluster words automatically. Experimental results show that the method extracts attribute information in the military transportation domain from Wikipedia better than the traditional supervised learning method.
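As a rough illustration of the word-clustering step mentioned in the abstract, the sketch below runs affinity propagation (Frey and Dueck 2007) on toy 2-D vectors that stand in for word embeddings. The vectors, the damping factor, the iteration count, and the median preference are all assumptions chosen for illustration; they are not the settings or data used in the paper.

```python
from statistics import median


def affinity_propagation(points, damping=0.5, iterations=200):
    """Return, for each point, the index of the exemplar it is assigned to."""
    n = len(points)
    # Similarity: negative squared Euclidean distance between two vectors.
    s = [[-sum((x - y) ** 2 for x, y in zip(p, q)) for q in points]
         for p in points]
    # Assumption: shared preference = median off-diagonal similarity.
    pref = median(s[i][k] for i in range(n) for k in range(n) if i != k)
    for k in range(n):
        s[k][k] = pref

    r = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    a = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            for k in range(n):
                rival = max(a[i][j] + s[i][j] for j in range(n) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - rival)
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k), i' != k
        for k in range(n):
            pos = [max(0.0, r[j][k]) for j in range(n)]
            for i in range(n):
                if i == k:
                    new = sum(pos[j] for j in range(n) if j != k)
                else:
                    others = sum(pos[j] for j in range(n) if j not in (i, k))
                    new = min(0.0, r[k][k] + others)
                a[i][k] = damping * a[i][k] + (1 - damping) * new

    # Each point picks the candidate exemplar with the largest a + r.
    return [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]


# Hypothetical toy "embeddings": two well-separated groups of three vectors.
toy_vectors = [
    (0.0, 0.0), (1.0, 0.2), (2.0, 0.0),
    (10.0, 10.0), (11.0, 10.2), (12.0, 10.0),
]
labels = affinity_propagation(toy_vectors)
print(labels)
```

The number of clusters is not fixed in advance; it emerges from the preference value, which is one reason the algorithm suits automatic clustering. With real word embeddings, cosine similarity is a common alternative to the negative squared distance used here.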
Copyright information
© 2017 Springer International Publishing Switzerland
About this paper
Cite this paper
Su, F. et al. (2017). Attribute Extracting from Wikipedia Pages in Domain Automatically. In: Balas, V., Jain, L., Zhao, X. (eds) Information Technology and Intelligent Transportation Systems. Advances in Intelligent Systems and Computing, vol 455. Springer, Cham. https://doi.org/10.1007/978-3-319-38771-0_42
DOI: https://doi.org/10.1007/978-3-319-38771-0_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-38769-7
Online ISBN: 978-3-319-38771-0
eBook Packages: Engineering, Engineering (R0)