ABSTRACT
In the multi-instance learning (MIL) setting, instances are grouped into bags, and labels are provided only at the bag level, not for individual instances. A positive bag label means that at least one instance in the bag is positive, while a negative bag label means that every instance in the bag is negative. MIL data arises naturally in many contexts, such as anomaly detection, where labels are scarce and costly to obtain, and annotations often end up covering sets of instances rather than single instances. Moreover, in many real-world anomaly detection problems, only positive labels are collected because they usually correspond to critical events. Such a setting, where only positive labels are provided alongside unlabeled data, is called Positive and Unlabeled (PU) learning. Despite its practical relevance, no prior work addresses learning from positive and unlabeled data in a multi-instance setting for anomaly detection. We therefore propose the first method that learns from PU bags in anomaly detection. Our method uses an autoencoder as the underlying anomaly detector. We alter the autoencoder's objective function, proposing a new loss that allows it to learn from positive and unlabeled bags of instances, and we analyze this loss theoretically. Experimentally, we evaluate our method on 30 datasets and show that it outperforms multiple baselines adapted to our setting.
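The abstract's combination of ideas can be illustrated with a minimal sketch. This is not the paper's actual loss (which is not reproduced here); it only shows the general shape of a bag-level PU objective for an autoencoder-based detector. The function names (`reconstruction_errors`, `pu_bag_loss`) and the hinge-style positive-bag term with a `margin` parameter are illustrative assumptions: unlabeled bags are pushed toward low reconstruction error (treated as mostly normal), while each positive bag is encouraged to contain at least one high-error instance, matching the MIL semantics that a positive bag holds at least one positive instance.

```python
import numpy as np

def reconstruction_errors(bag, encode, decode):
    """Per-instance squared reconstruction error for a bag of shape (n, d)."""
    recon = decode(encode(bag))
    return np.mean((bag - recon) ** 2, axis=1)

def pu_bag_loss(bags, labels, encode, decode, margin=1.0):
    """Sketch of a bag-level PU loss. labels[i] = 1 for a positive bag,
    0 for an unlabeled bag. Illustrative only, not the paper's loss."""
    loss = 0.0
    for bag, y in zip(bags, labels):
        errs = reconstruction_errors(bag, encode, decode)
        if y == 1:
            # Positive bag: at least one instance should look anomalous,
            # so penalize the bag only if even its worst-reconstructed
            # instance still reconstructs well (error below the margin).
            loss += max(0.0, margin - errs.max())
        else:
            # Unlabeled bag: standard autoencoder objective, pushing
            # reconstruction error down under the assumption that
            # unlabeled data is mostly normal.
            loss += errs.mean()
    return loss / len(bags)
```

In practice the `encode`/`decode` maps would be trained neural networks and the loss would be minimized by gradient descent; the numpy version above only makes the bag-level structure of the objective concrete.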
Index Terms
- Learning from Positive and Unlabeled Multi-Instance Bags in Anomaly Detection