DOI: 10.1145/1557019.1557046
research-article

Constrained optimization for validation-guided conditional random field learning

Published: 28 June 2009

ABSTRACT

Conditional random fields (CRFs) are a class of undirected graphical models that have been widely used for classifying and labeling sequence data. The training of CRFs is typically formulated as an unconstrained optimization problem that maximizes the conditional likelihood. However, maximum likelihood training is prone to overfitting. To address this issue, we propose a novel constrained nonlinear optimization formulation in which the prediction accuracies on cross-validation sets are included as constraints. Instead of requiring multiple passes of training, the constrained formulation allows cross-validation to be handled in one pass of constrained optimization.
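The abstract does not write the formulation out. One plausible reading is sketched below; every symbol here is an assumption for illustration and not the paper's notation: $\theta$ is the CRF parameter vector, $D$ the training data, $V_1, \dots, V_K$ the validation sets, $\mathrm{Acc}(\theta; V_k)$ the prediction accuracy on $V_k$, and $\tau$ a required accuracy level.

```latex
% Standard (unconstrained) maximum conditional likelihood training
\[
  \max_{\theta} \; \sum_{(x, y) \in D} \log p_{\theta}(y \mid x)
\]

% Assumed constrained variant: same objective, one accuracy constraint per validation set
\[
  \max_{\theta} \; \sum_{(x, y) \in D} \log p_{\theta}(y \mid x)
  \quad \text{subject to} \quad
  \mathrm{Acc}(\theta; V_k) \ge \tau, \qquad k = 1, \dots, K
\]
```

Under this reading, $\mathrm{Acc}(\theta; V_k)$ counts correct predictions, so it is piecewise constant in $\theta$ and jumps wherever a predicted label flips.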

The new formulation is discontinuous, and classical Lagrangian-based constraint-handling methods are not applicable. A new constrained optimization algorithm based on the recently proposed extended saddle point theory is developed to learn the constrained CRF model. Experimental results on gene and stock-price prediction tasks show that the constrained formulation significantly improves the generalization ability of CRF training.
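To make the difficulty concrete, here is a minimal toy sketch of handling a discontinuous accuracy constraint with a simple penalty term and a derivative-free grid search. It is not the paper's algorithm and does not implement extended saddle point theory: a two-parameter logistic model stands in for the CRF, and the names `nll`, `accuracy`, `penalized`, `solve`, `TRAIN`, `VAL`, the threshold `tau`, and the penalty weight `alpha` are all hypothetical.

```python
import math

def nll(theta, data):
    """Negative log-likelihood of a 1-D logistic model (stand-in for the CRF objective)."""
    w, b = theta
    total = 0.0
    for x, y in data:
        s = (w * x + b) * (1 if y == 1 else -1)
        total += math.log1p(math.exp(-s))
    return total

def accuracy(theta, data):
    """Fraction of correct predictions; piecewise constant in theta, hence discontinuous."""
    w, b = theta
    correct = sum((1 if w * x + b > 0 else 0) == y for x, y in data)
    return correct / len(data)

def penalized(theta, train, val, tau, alpha):
    """Training loss plus alpha times the violation of the constraint accuracy >= tau."""
    violation = max(0.0, tau - accuracy(theta, val))
    return nll(theta, train) + alpha * violation

def solve(train, val, tau, max_rounds=6):
    """Grid search on the penalized objective, raising alpha until the constraint holds."""
    grid = [i * 0.5 for i in range(-6, 7)]          # candidate values -3.0 .. 3.0
    candidates = [(w, b) for w in grid for b in grid]
    alpha = 1.0
    best = candidates[0]
    for _ in range(max_rounds):
        best = min(candidates, key=lambda t: penalized(t, train, val, tau, alpha))
        if accuracy(best, val) >= tau:
            break
        alpha *= 10.0                                # constraint violated: penalize harder
    return best, alpha

# Toy data (assumed): pairs of (feature, label).
TRAIN = [(-2.0, 0), (-1.0, 0), (0.5, 0), (1.0, 1), (2.0, 1)]
VAL = [(-1.5, 0), (0.3, 0), (1.5, 1)]

theta, alpha = solve(TRAIN, VAL, tau=0.9)
print("theta:", theta, "alpha:", alpha, "validation accuracy:", accuracy(theta, VAL))
```

Gradient-based methods get no useful signal from `accuracy`, since it is flat almost everywhere, which is why this sketch falls back on exhaustive search over a small grid. That choice only works for a toy model with two parameters.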

Supplemental Material

p189-chen.mp4 (mp4, 78.1 MB)


Published in

KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
June 2009
1426 pages
ISBN: 9781605584959
DOI: 10.1145/1557019

Copyright © 2009 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions, 13%
