DOI:10.1145/3581754.3584128
poster

PORDE: Explaining Data Poisoning Attacks Through Visual Analytics with Food Delivery App Reviews

Published: 27 March 2023

Abstract

Artificial intelligence (AI) brings many benefits to our lives. However, biased AI models trained on data that has been subjected to poisoning attacks may cause social problems. Developers must therefore check carefully whether the training data has been poisoned when developing an AI model. Data visualization is one way to facilitate the analysis needed for that check. However, prior studies did not visualize real-world AI training data. Restaurant reviews in delivery apps are one example of a poisoned dataset: restaurants hold review events on delivery apps that encourage customers to write positive reviews in return for rewards, which produces biased reviews. In this study, we propose the POisoned Real-world Data Explainer (PORDE), which explains data poisoning attacks through visual analytics of food delivery app reviews. The findings of this study suggest implications for securing safe training data and developing less biased AI models.

      Published In

      IUI '23 Companion: Companion Proceedings of the 28th International Conference on Intelligent User Interfaces
      March 2023
      266 pages
      ISBN:9798400701078
      DOI:10.1145/3581754
      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Data Poisoning Attack
      2. Data Visualization
      3. Food Delivery App Reviews

      Qualifiers

      • Poster
      • Research
      • Refereed limited

      Conference

      IUI '23
      Acceptance Rates

      Overall Acceptance Rate 746 of 2,811 submissions, 27%
