
Scaling Conditional Random Fields by One-Against-the-Other Decomposition

  • Short Paper
  • Published in: Journal of Computer Science and Technology

Abstract

As a powerful sequence labeling model, conditional random fields (CRFs) have been applied successfully to many natural language processing (NLP) tasks. However, the high computational complexity of CRF training restricts it to very small tag (or label) sets, because training becomes intractable as the tag set grows. This paper proposes an improved decomposed training and joint decoding algorithm for CRF learning. Instead of training a single CRF model over all tags, it trains a binary sub-CRF independently for each tag. An optimal tag sequence is then produced by a joint decoding algorithm that combines the probabilistic outputs of all the sub-CRFs involved. To test its effectiveness, we apply this approach to Chinese word segmentation (CWS) cast as a sequence labeling problem. Our evaluation shows that it reduces the computational cost of this task by 40–50% without any significant performance loss on various large-scale data sets.
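The decomposition described above can be illustrated with a minimal sketch. This is not the authors' algorithm: the paper's joint decoder presumably searches for a globally optimal sequence over the sub-CRFs' outputs, whereas the toy `joint_decode` below simply picks, at each position, the tag whose binary sub-model assigns it the highest probability. The 4-tag B/M/E/S scheme (word-Begin, -Middle, -End, Single-character word) and the probability values are hypothetical, chosen only to show the shape of the data flow.

```python
def joint_decode(sub_probs):
    """Combine the per-position outputs of independently trained binary
    sub-models, one per tag.

    sub_probs: dict mapping each tag to a list of probabilities, where
    sub_probs[t][i] is that tag's sub-model estimate P(tag t at position i).
    Returns one tag per position, taking the most confident sub-model.
    """
    tags = list(sub_probs)
    length = len(next(iter(sub_probs.values())))
    return [max(tags, key=lambda t: sub_probs[t][i]) for i in range(length)]

# Toy marginals for a 5-character sentence under the B/M/E/S tagging scheme,
# as if produced by four binary sub-CRFs (one per tag):
probs = {
    "B": [0.90, 0.10, 0.20, 0.80, 0.10],
    "M": [0.05, 0.70, 0.10, 0.10, 0.10],
    "E": [0.03, 0.15, 0.60, 0.05, 0.70],
    "S": [0.02, 0.05, 0.10, 0.05, 0.10],
}
print(joint_decode(probs))  # ['B', 'M', 'E', 'B', 'E']
```

A real joint decoder would also enforce sequence-level constraints (e.g. `M` may only follow `B` or `M`), typically via a Viterbi-style search over the combined scores; the per-position maximum here is only the simplest possible combination rule.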



Author information

Corresponding author

Correspondence to Chunyu Kit.

Additional information

The research in this paper was supported by the Research Grants Council of Hong Kong S.A.R., China, through CERG Grant No. 9040861 (CityU 1318/03H), and by City University of Hong Kong through Strategic Research Grant No. 7002037.

*Dr. Hai Zhao was supported by a Postdoctoral Research Fellowship in the Department of Chinese, Translation and Linguistics, City University of Hong Kong.

Electronic supplementary material

Electronic supplementary material is available for this article (PDF, 65.1 kB).


About this article

Cite this article

Zhao, H., Kit, C. Scaling Conditional Random Fields by One-Against-the-Other Decomposition. J. Comput. Sci. Technol. 23, 612–619 (2008). https://doi.org/10.1007/s11390-008-9157-4

