Semantic separator learning and its applications in unsupervised Chinese text parsing

Wu, Yuming; Luo, Xiaodong; Yang, Zhen

doi:10.1007/s11704-013-2072-z

Research Article
Published: 09 January 2013

Volume 7, pages 55–68, (2013)

Frontiers of Computer Science

Yuming Wu¹,², Xiaodong Luo³ & Zhen Yang⁴

Abstract

Grammar learning has long been a bottleneck problem. In this paper, we propose a method for semantic separator learning, a special case of grammar learning. The method is based on the hypothesis that certain classes of words, called semantic separators, split a sentence into several constituents. Semantic separators are represented by words together with their part-of-speech tags and other information, so that rich semantic information can be incorporated. We first identify the semantic separators with the help of noun phrase boundaries, called subseparators. Next, the argument classes of the separators are learned from a corpus by generalizing argument instances in a hypernym space. Finally, to evaluate the learned semantic separators, we use them in unsupervised Chinese text parsing. Experiments on a manually labeled test set show that the proposed method outperforms previous methods of unsupervised text parsing.
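The two core ideas in the abstract, splitting a sentence at separators and generalizing argument instances in a hypernym space, can be illustrated with a small Python sketch. This is not the authors' algorithm: the separator set, the toy hypernym links, and the choice of "most specific common hypernym" as the generalization rule are all assumptions made for illustration, and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# A token is a (word, part-of-speech tag) pair, mirroring the abstract's
# "words together with their part-of-speech tags".
Token = Tuple[str, str]

# Hypothetical toy data: none of these entries come from the paper.
SEPARATORS: Set[Token] = {("is", "v"), ("of", "p")}
HYPERNYM: Dict[str, str] = {  # child -> parent
    "violin": "instrument",
    "piano": "instrument",
    "instrument": "artifact",
    "artifact": "entity",
    "composer": "person",
    "person": "entity",
}


def split_by_separators(
    tokens: List[Token], separators: Set[Token]
) -> Tuple[List[List[Token]], List[Token]]:
    """Split a token sequence into constituents at separator tokens.

    Returns the constituents and the separators found, in order.
    """
    constituents: List[List[Token]] = [[]]
    found: List[Token] = []
    for tok in tokens:
        if tok in separators:
            found.append(tok)
            constituents.append([])  # start the next constituent
        else:
            constituents[-1].append(tok)
    return constituents, found


def hypernym_chain(word: str, hypernym: Dict[str, str]) -> List[str]:
    """Return the word followed by its ancestors, most specific first."""
    chain = [word]
    while chain[-1] in hypernym:
        chain.append(hypernym[chain[-1]])
    return chain


def generalize(instances: List[str], hypernym: Dict[str, str]) -> Optional[str]:
    """Assumed rule: the most specific hypernym shared by all instances."""
    if not instances:
        return None
    chains = [hypernym_chain(w, hypernym) for w in instances]
    for candidate in chains[0]:
        if all(candidate in chain for chain in chains[1:]):
            return candidate
    return None


if __name__ == "__main__":
    sentence = [("violin", "n"), ("is", "v"), ("instrument", "n")]
    print(split_by_separators(sentence, SEPARATORS))
    print(generalize(["violin", "piano"], HYPERNYM))
```

In this sketch a separator's argument class would be obtained by collecting the words that appear in a given argument position across many sentences and passing them to `generalize`. A real system would need a much larger hypernym resource and some way to tolerate noisy instances, which this toy version does not attempt.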

Author information

Authors and Affiliations

Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
Yuming Wu
Graduate University of the Chinese Academy of Sciences, Beijing, 100049, China
Yuming Wu
China Telecom Corporation Limited Shanghai Branch, Shanghai, 200120, China
Xiaodong Luo
Shanghai Research Institute of China Telecom Corporation Limited, Shanghai, 200120, China
Zhen Yang

Corresponding author

Correspondence to Yuming Wu.

Additional information

Yuming Wu received his BS from Dalian Jiaotong University in 2002 and his MS from Capital Normal University in 2008. He is now a PhD candidate at the Institute of Computing Technology, Chinese Academy of Sciences. His research interests include grammar learning, natural language processing, large-scale knowledge processing, and topic modeling.

Xiaodong Luo received his MS in Telecom Engineering from the University of South Australia in 2003. He has more than 15 years of experience in telecommunications operations, maintenance, and engineering. He has served as a technical support engineer for network planning and platform establishment at Shanghai Telecom, especially in the area of next generation call centers (NGCC).

Zhen Yang is a senior engineer at Shanghai Research Institute of China Telecom Corporation Limited. He received his BS from Harbin Institute of Technology, his MS from the Chinese Academy of Sciences, and his PhD from Dalian University of Technology. His research interests include the characteristics, conception, methods, and algorithms of individuality or personal information retrieval, the theory of personal data mining and personal information pattern recognition, and the application and development of search engine technologies.

About this article

Cite this article

Wu, Y., Luo, X. & Yang, Z. Semantic separator learning and its applications in unsupervised Chinese text parsing. Front. Comput. Sci. 7, 55–68 (2013). https://doi.org/10.1007/s11704-013-2072-z

Received: 08 March 2012
Accepted: 08 April 2012
Published: 09 January 2013
Issue Date: February 2013
DOI: https://doi.org/10.1007/s11704-013-2072-z
