ABSTRACT
StreamToxWatch, or ToxWatch for short, is an early-stage ensemble architecture for detecting and monitoring data poisoning in online learning systems over data streams. Detecting data poisoning is difficult, especially in distributed streaming systems, where statistical baselines shift on the fly and vary across the system. ToxWatch therefore employs a combination of input, (adversarial) concept drift, and model performance monitors that observe anomalous behavior across the system and supply targeted detection signals to downstream applications.
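The ensemble-of-monitors idea can be sketched in a few lines. This is a minimal illustration, not the ToxWatch implementation: the monitor classes, thresholds, and the any-vote combiner below are all assumptions chosen for clarity; a real input monitor would track full feature distributions rather than a scalar, and the drift monitor is omitted.

```python
from collections import deque
import statistics

class InputMonitor:
    """Illustrative input monitor: flags values that deviate sharply
    from a rolling baseline (z-score over a sliding window)."""
    def __init__(self, window=100, z_thresh=3.0):
        self.window = deque(maxlen=window)
        self.z_thresh = z_thresh

    def observe(self, x):
        alarm = False
        if len(self.window) >= 10:  # wait for a minimal baseline
            mu = statistics.fmean(self.window)
            sigma = statistics.pstdev(self.window) or 1e-9
            alarm = abs(x - mu) / sigma > self.z_thresh
        self.window.append(x)
        return alarm

class PerformanceMonitor:
    """Illustrative model-performance monitor: flags a sustained drop
    in online accuracy below a floor."""
    def __init__(self, window=50, floor=0.7):
        self.window = deque(maxlen=window)
        self.floor = floor

    def observe(self, correct):
        self.window.append(1 if correct else 0)
        full = len(self.window) == self.window.maxlen
        return full and sum(self.window) / len(self.window) < self.floor

def ensemble_signal(alarms):
    """Combine per-monitor alarms into one detection signal.
    Here a simple any-vote; weighted or learned fusion is equally possible."""
    return any(alarms)
```

A downstream application would call `ensemble_signal([input_mon.observe(x), perf_mon.observe(ok)])` per event and act on the combined alarm; the point is that no single monitor is trusted alone.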
Poster: StreamToxWatch – Data Poisoning Detector in Distributed, Event-based Environments