Detecting Illicit Food Factories from Chemical Declaration Data via Graph-aware Self-supervised Contrastive Anomaly Ranking

Published: 13 May 2024

Abstract

In the global food industry, where the line between legitimate and illicit manufacturing is increasingly blurred by the scale and complexity of the supply chain, safeguarding consumer health and trust requires innovative detection methods. To this end, this paper presents Graph-aware Self-supervised Contrastive Anomaly Ranking (GraphCAR), a novel unsupervised learning model devised to identify illicit food factories by scrutinizing chemical declaration data. GraphCAR tackles the scarcity of labeled data and the intricacies inherent in the vast array of declared chemicals by fusing a Graph Autoencoder with a self-supervised contrastive learning mechanism. This fusion not only simplifies the feature space by embedding chemical declarations within a bipartite graph but also flags subtle, potentially illicit patterns by contrastively inspecting the learned factory representations. In evaluations on real-world factory chemical declaration data, GraphCAR outperformed conventional methods on unsupervised outlier detection and one-class classification tasks, demonstrating its accuracy, robustness and reliability in flagging potential malpractice. With its successful application in food safety, GraphCAR illustrates the potential of AI-driven solutions to address multifaceted challenges for the greater good.
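
The abstract states that chemical declarations are embedded within a bipartite graph but does not give the construction. The following is a minimal sketch of one plausible reading, in which factories and chemicals are the two node sets and each declaration is an edge. The record layout, the sample data, and the function names `build_bipartite` and `incidence_matrix` are hypothetical and are not taken from the paper; the Graph Autoencoder and contrastive ranking steps are not shown.

```python
from collections import defaultdict

# Hypothetical declaration records: (factory_id, chemical_name).
# The real data schema is not described in the abstract.
declarations = [
    ("F1", "citric acid"),
    ("F1", "sodium benzoate"),
    ("F2", "citric acid"),
    ("F3", "industrial dye"),
]


def build_bipartite(records):
    """Return factory->chemicals and chemical->factories adjacency maps."""
    factory_to_chem = defaultdict(set)
    chem_to_factory = defaultdict(set)
    for factory, chemical in records:
        factory_to_chem[factory].add(chemical)
        chem_to_factory[chemical].add(factory)
    return dict(factory_to_chem), dict(chem_to_factory)


def incidence_matrix(factory_to_chem):
    """Build a binary factory-by-chemical matrix with sorted row/column labels."""
    factories = sorted(factory_to_chem)
    chemicals = sorted({c for cs in factory_to_chem.values() for c in cs})
    column = {c: j for j, c in enumerate(chemicals)}
    matrix = [[0] * len(chemicals) for _ in factories]
    for i, factory in enumerate(factories):
        for chemical in factory_to_chem[factory]:
            matrix[i][column[chemical]] = 1
    return factories, chemicals, matrix


f2c, c2f = build_bipartite(declarations)
factories, chemicals, matrix = incidence_matrix(f2c)
for name, row in zip(factories, matrix):
    print(name, row)
```

A binary matrix of this kind is a common input to graph encoders, since each row is a sparse description of one factory; whether GraphCAR uses binary or weighted edges is not stated in the abstract.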

    Published In

    WWW '24: Proceedings of the ACM Web Conference 2024
    May 2024
    4826 pages
    ISBN:9798400701719
    DOI:10.1145/3589334
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. bipartite graphs
    2. chemical declaration data
    3. food technology
    4. fraud detection
    5. illicit food factories
    6. social good

    Qualifiers

    • Research-article

    Funding Sources

    • National Science and Technology Council, Taiwan

    Conference

    WWW '24
    WWW '24: The ACM Web Conference 2024
    May 13 - 17, 2024
    Singapore, Singapore

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
