Abstract
Chinese text on web pages often contains new words and erroneous words. In this paper, we define a semantic similarity between sememes and state theorems about it. After proving the theorems, we analyze the influence of the parameter. Building on the sememe similarity, we then present a novel definition of word similarity, which can be used to match new Chinese words with existing Chinese words and erroneous Chinese words with their correct forms. Based on this word similarity, we present a method for matching information segments that recognizes the category of Chinese web information segments in which new words and erroneous words occur. Experiments with the matching method are also presented. Both the theoretical analysis and the experimental results indicate that the proposed matching method is efficient.
This work was partially supported by National Natural Science Foundation of China under Grant 60403027.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, M., Zou, C., Lu, Z., Wang, Z. (2006). A Semantic Matching of Information Segments for Tolerating Error Chinese Words. In: Aberer, K., Peng, Z., Rundensteiner, E.A., Zhang, Y., Li, X. (eds) Web Information Systems – WISE 2006. WISE 2006. Lecture Notes in Computer Science, vol 4255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11912873_8
DOI: https://doi.org/10.1007/11912873_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-48105-8
Online ISBN: 978-3-540-48107-2
eBook Packages: Computer Science, Computer Science (R0)