Abstract
Classifier design often suffers from a shortage of labeled data: class labels must be assigned by experienced analysts, so collecting labeled data is often costly. To mitigate this problem, several learning methods have been proposed that make effective use of unlabeled data, which can be collected inexpensively. These methods, however, consider only static data and cannot handle unlabeled time series data. Focusing on Hidden Markov Models (HMMs), in this paper we first present an extension of HMMs, named Extended Tied-Mixture HMMs (ETM-HMMs), in which labeled and unlabeled time series data can be utilized simultaneously. We also formally derive a learning algorithm for ETM-HMMs within the maximum likelihood framework. Experimental results on synthetic and real time series data show that adding unlabeled time series data to the labeled training data yields clearly better classification accuracy than using labeled data alone.
Cite this article
Ueda, N., Inoue, M. Extended Tied-Mixture HMMs for Both Labeled and Unlabeled Time Series Data. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 37, 189–197 (2004). https://doi.org/10.1023/B:VLSI.0000027484.54541.bd
DOI: https://doi.org/10.1023/B:VLSI.0000027484.54541.bd