Ensemble Based Positive Unlabeled Learning for Time Series Classification

Nguyen, Minh Nhut; Li, Xiao-Li; Ng, See-Kiong

doi:10.1007/978-3-642-29038-1_19

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7238)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Many real-world applications in time series classification fall into the class of positive and unlabeled (PU) learning. Furthermore, in many of these applications, not only are negative examples absent, but the positive examples available for learning can also be rather limited. As such, several PU learning algorithms for time series classification have recently been developed to learn from a small set P of labeled seed positive examples augmented with a set U of unlabeled examples. The key to these algorithms is to accurately identify the likely positive and negative examples in U, but this has remained a challenge, especially for uncertain examples located near the class boundary. This paper presents a novel ensemble-based approach that restarts the detection phase several times to probabilistically label these uncertain examples more robustly, so that a reliable classifier can be built from the limited positive training examples. Experimental results on time series data from different domains demonstrate that the new method significantly outperforms existing state-of-the-art methods.
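The abstract does not spell out the detection procedure, so the following is only a minimal sketch of the general idea of restarting a labeling step several times and averaging the votes. It is not the authors' algorithm: it swaps in a bagging-style stand-in in which each restart treats a random subset of U as presumed negatives and labels the remaining unlabeled series by 1-nearest-neighbour. The function names, the parameters n_restarts and sample_size, and the use of Euclidean distance are all assumptions made for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length series (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ensemble_pu_scores(P, U, n_restarts=50, sample_size=None, seed=0):
    """Return, for each series in U, the fraction of restarts that voted it positive.

    Hypothetical stand-in, not the paper's method: each restart draws a random
    subset of U as presumed negatives, then labels every other unlabeled series
    by its nearest neighbour among the positive seeds P and those presumed
    negatives.
    """
    rng = random.Random(seed)
    if sample_size is None:
        sample_size = max(1, len(U) // 2)
    pos_votes = [0] * len(U)
    n_votes = [0] * len(U)
    for _ in range(n_restarts):
        presumed_neg = set(rng.sample(range(len(U)), sample_size))
        for i, series in enumerate(U):
            if i in presumed_neg:
                continue  # only series left out of the sample are voted on
            d_pos = min(euclidean(series, p) for p in P)
            d_neg = min(euclidean(series, U[j]) for j in presumed_neg)
            pos_votes[i] += int(d_pos < d_neg)
            n_votes[i] += 1
    # A series that was never left out gets None rather than a guessed score.
    return [v / n if n else None for v, n in zip(pos_votes, n_votes)]
```

Calling ensemble_pu_scores(P, U) yields one score per unlabeled series. Scores near 1 or 0 suggest likely positives or likely negatives, while intermediate scores flag the uncertain examples near the class boundary that the abstract describes.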

Author information

Authors and Affiliations

Institute for Infocomm Research, Singapore
Minh Nhut Nguyen, Xiao-Li Li & See-Kiong Ng

Editor information

Editors and Affiliations

School of Computer Science and Engineering, Seoul National University, Gwanak-ro, Gwanak-gu, 151747, Seoul, South Korea
Sang-goo Lee
Computer School, Wuhan University, Luo-jia-shan, Wuchang, 430081, Wuhan, Hubei Province, China
Zhiyong Peng
School of Information Technology and Electrical Engineering, University of Queensland, QLD 4072, Brisbane, Australia
Xiaofang Zhou
Department of Computer Science, Kangwon National University, 192-1, Hyoja2-Dong, Chuncheon, 200701, Kangwon, South Korea
Yang-Sae Moon
Institute for Computer Science and Business Information, University of Duisburg-Essen, Schützenbahn 70, 45117, Essen, Germany
Rainer Unland
School of Information and Communication Engineering, Chungbuk National University, 52 Naesudong-ro, Heungdeok-gu, Cheongju, 4072, Chungbuk, South Korea
Jaesoo Yoo

About this paper

Cite this paper

Nguyen, M.N., Li, XL., Ng, SK. (2012). Ensemble Based Positive Unlabeled Learning for Time Series Classification. In: Lee, Sg., Peng, Z., Zhou, X., Moon, YS., Unland, R., Yoo, J. (eds) Database Systems for Advanced Applications. DASFAA 2012. Lecture Notes in Computer Science, vol 7238. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29038-1_19

DOI: https://doi.org/10.1007/978-3-642-29038-1_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29037-4
Online ISBN: 978-3-642-29038-1
eBook Packages: Computer Science, Computer Science (R0)