Abstract
The sequence prediction problem has traditionally been identified in the literature with sequence labeling, in which a label sequence is assigned to an observed input sequence. However, sequence prediction can also be interpreted differently: a label sequence (sequential output) is classified based only on an independent set of attributes. This paper presents a new boosting-based ensemble approach that performs such sequential output prediction. The sequential nature of the classified structure is reflected in the applied cost function. The experimental results reported in the paper show that the proposed approach is valid and competitive.
About this article
Cite this article
Kajdanowicz, T., Kazienko, P. Boosting-based Sequential Output Prediction. New Gener. Comput. 29, 293–307 (2011). https://doi.org/10.1007/s00354-010-0304-4
DOI: https://doi.org/10.1007/s00354-010-0304-4