ABSTRACT
In many real-world applications, data are collected in a streaming fashion and accurate labels are hard to obtain. In environmental monitoring, for instance, sensors collect data continuously, yet labels are scarce because annotation requires human effort and may itself introduce errors. This paper investigates the problem of learning with weakly labeled data streams, in which data arrive continuously and only a limited subset of the stream is labeled, potentially with noise. This setting is of great practical importance but has rarely been studied, and it is difficult to design algorithms that yield a well-generalized classifier when data are gathered continuously under unknown label noise. To address this difficulty, we propose a novel noise transition matrix estimation approach for data streams with scarce noisy labels based on online anchor point identification. Building on this estimator, we develop an adaptive learning algorithm for weakly labeled data streams via model reuse, which effectively alleviates the negative influence of label noise by exploiting unlabeled data. Both theoretical analysis and extensive experiments justify and validate the effectiveness of the proposed approach.
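The ingredients named above (anchor points, noise transition matrix estimation, and correcting the learner with the estimated matrix) follow a standard recipe in noisy-label learning. The sketch below is a minimal, offline illustration of that recipe, not the authors' online algorithm: the function names, the argmax anchor heuristic, and the backward loss correction are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def estimate_transition_matrix(noisy_posteriors):
    """Estimate T[i, j] = P(noisy label = j | true label = i).

    noisy_posteriors: (n_samples, n_classes) posteriors predicted by a
    classifier fitted on the noisily labeled examples. For each class i, the
    sample with the largest posterior for class i is treated as an approximate
    anchor point (P(Y = i | x) close to 1), so its posterior vector estimates
    the i-th row of T.
    """
    n_classes = noisy_posteriors.shape[1]
    T = np.zeros((n_classes, n_classes))
    for i in range(n_classes):
        anchor = np.argmax(noisy_posteriors[:, i])  # heuristic anchor for class i
        T[i] = noisy_posteriors[anchor]
    return T / T.sum(axis=1, keepdims=True)         # each row is a distribution

def backward_corrected_loss(pred_probs, noisy_labels, T):
    """Backward (unbiased) loss correction with the estimated matrix T.

    Weights the per-class cross-entropy losses by T^{-1}, so that minimizing
    the corrected loss on noisy labels matches, in expectation, minimizing the
    loss on clean labels.
    """
    T_inv = np.linalg.inv(T)
    per_class_ce = -np.log(np.clip(pred_probs, 1e-12, 1.0))  # (n, k) losses
    corrected = per_class_ce @ T_inv.T                        # mix losses by T^{-1}
    return corrected[np.arange(len(noisy_labels)), noisy_labels].mean()

# Toy usage with synthetic posteriors (shapes only; not real stream data).
rng = np.random.default_rng(0)
posteriors = rng.dirichlet(np.ones(3), size=200)   # stand-in noisy posteriors
T_hat = estimate_transition_matrix(posteriors)
labels = rng.integers(0, 3, size=200)
loss = backward_corrected_loss(posteriors, labels, T_hat)
```

In the streaming setting the paper targets, the anchor search and the posterior estimates would instead be maintained online as labeled examples trickle in, and the corrected loss would drive the model-reuse update; the sketch fixes everything in a single batch purely for readability.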