ABSTRACT
Conditional random fields (CRFs) are a class of undirected graphical models that have been widely used for classifying and labeling sequence data. The training of CRFs is typically formulated as an unconstrained optimization problem that maximizes the conditional likelihood. However, maximum likelihood training is prone to overfitting. To address this issue, we propose a novel constrained nonlinear optimization formulation in which the prediction accuracies on cross-validation sets are included as constraints. Instead of requiring multiple passes of training, the constrained formulation allows cross-validation to be handled in one pass of constrained optimization.
The new formulation is discontinuous, so classical Lagrangian-based constraint-handling methods are not applicable. A new constrained optimization algorithm, based on the recently proposed extended saddle point theory, is developed to learn the constrained CRF model. Experimental results on gene and stock-price prediction tasks show that the constrained formulation is able to significantly improve the generalization ability of CRF training.
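One way to picture the constrained formulation described above is sketched below. This is an illustrative reading only: the abstract does not give the notation, so the weight vector w, the training set T, the validation sets V_k, the accuracy function Acc_k, and the thresholds tau_k are all assumed symbols, and the exact form of the constraints used in the paper may differ.

```latex
% Illustrative sketch; all symbols are assumptions, not the paper's notation.
% T: training set, V_1..V_K: cross-validation sets, w: CRF weights,
% tau_k: a required accuracy level on validation set V_k.
\begin{align}
  \max_{w} \quad & \sum_{(x, y) \in T} \log p(y \mid x; w) \\
  \text{s.t.} \quad & \mathrm{Acc}_k(w) \ge \tau_k, \qquad k = 1, \dots, K, \\
  \text{where} \quad & \mathrm{Acc}_k(w) =
    \frac{1}{|V_k|} \sum_{(x, y) \in V_k}
    \mathbf{1}\!\left[ \arg\max_{y'} p(y' \mid x; w) = y \right].
\end{align}
```

Under this reading, each Acc_k(w) is piecewise constant in w, because the indicator changes value only when the most probable labeling changes. That is one plausible source of the discontinuity mentioned above, and it explains why gradient-based Lagrangian methods, which assume differentiable constraints, would not apply directly.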
Index Terms
- Constrained optimization for validation-guided conditional random field learning