ABSTRACT
In real sequence labeling tasks, the statistics of many higher order features are insufficient because the training data are sparse, so very few of these features are useful. We describe Sparse Higher Order Conditional Random Fields (SHO-CRFs), which handle local features and sparse higher order features together using a novel, tractable, exact inference algorithm. Our main insight is that states and transitions with the same potential functions can be grouped together, so that inference is performed on the grouped states and transitions. Although the complexity is not polynomial, SHO-CRFs are still efficient in practice because the features are sparse. Experimental results on optical character recognition and Chinese organization name recognition show that, with the same higher order feature set, SHO-CRFs significantly outperform previous approaches.
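The grouping idea can be illustrated with a small sketch. This is not the SHO-CRF algorithm itself, which handles higher order features and is not reproduced here. It is a first-order toy that assumes a fixed partition of the labels into groups, where every label in a group shares the same node potential and every pair of labels drawn from two groups shares the same transition potential. All names (`grouped_partition`, `brute_force_partition`, `sizes`, `node`, `trans`) and all numeric values are hypothetical and chosen only for illustration.

```python
from itertools import product


def grouped_partition(sizes, node, trans):
    """Forward pass over label groups instead of individual labels.

    sizes[g]    : number of labels in group g (assumed fixed across positions)
    node[t][g]  : node potential shared by every label in group g at position t
    trans[g][h] : transition potential shared by every (label in g, label in h) pair
    """
    num_groups = len(sizes)
    # a[g] is the forward score of any single label in group g.
    a = list(node[0])
    for t in range(1, len(node)):
        a = [
            node[t][h] * sum(sizes[g] * a[g] * trans[g][h] for g in range(num_groups))
            for h in range(num_groups)
        ]
    # Each group contributes its per-label score once per member label.
    return sum(sizes[g] * a[g] for g in range(num_groups))


def brute_force_partition(sizes, node, trans):
    """Reference value: enumerate every label sequence explicitly."""
    group_of = [g for g, n in enumerate(sizes) for _ in range(n)]
    total = 0.0
    for seq in product(range(len(group_of)), repeat=len(node)):
        score = 1.0
        for t, y in enumerate(seq):
            score *= node[t][group_of[y]]
            if t > 0:
                score *= trans[group_of[seq[t - 1]]][group_of[y]]
        total += score
    return total


# Six labels: two with distinctive potentials, four sharing a default group.
sizes = [1, 1, 4]
node = [[2.0, 0.5, 1.0], [1.0, 3.0, 1.0], [0.5, 1.0, 1.0]]
trans = [[1.0, 2.0, 1.0], [0.5, 1.0, 1.0], [1.0, 1.0, 1.0]]

z_grouped = grouped_partition(sizes, node, trans)
z_brute = brute_force_partition(sizes, node, trans)
print(z_grouped, z_brute)
```

In this toy, the grouped pass touches three groups per position rather than six labels, while the brute-force check enumerates all 216 label sequences. The saving grows as more labels fall into a shared default group, which is the situation sparse features are expected to produce.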
Index Terms
Sparse higher order conditional random fields for improved sequence labeling