Harmful ingredient detection from cosmetic products using optical character recognition and bi-LSTM model

Sayallar, Çağrı; Sayar, Ahmet

doi:10.1007/s11760-025-03923-0

Original Paper
Published: 24 February 2025

Volume 19, article number 338, (2025)
Signal, Image and Video Processing

Çağrı Sayallar¹ &
Ahmet Sayar²

Abstract

As the variety of cosmetic products grows, it becomes harder to keep track of what they contain, and harder for users to learn about the potential risks of the products they purchase. There are two main reasons: ingredients that can cause health problems, such as allergens and harmful chemicals, appear on labels under complicated names, and the label text itself is difficult to read. Together these prevent consumers from accurately assessing potential health risks. The aim of this project is to identify harmful ingredients in cosmetic products and to inform the user about the risks they pose. To that end, we developed a mobile application that supports informed choices in the cosmetics market. Users photograph the label of a cosmetic product with their smartphone. In the background, an OCR engine detects the text on the label and converts it into digital text. This text is sent to a pre-trained LSTM-based information extraction model, which picks out the ingredients from all the text on the label. The detected ingredients are then compared against a database of harmful ingredients using a word similarity algorithm. The database consists of harmful ingredients identified from reports and articles previously published by various organizations. Finally, the user is shown the names of all ingredients found in the product, the harmful ingredients among them, and the potential risks associated with those substances. The resulting smartphone application lets users learn the ingredients of a cosmetic product immediately by taking a photo. It identifies ingredients with high accuracy and detects the harmful ones. In this way, the application aims to protect consumers from potential health risks in cosmetic products by helping them make informed choices.
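The comparison step described in the abstract (matching extracted ingredient names against a database of harmful ingredients by word similarity) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the similarity algorithm, so Python's standard-library difflib.SequenceMatcher is used here as a stand-in. The names HARMFUL_DB, similarity, flag_harmful, and the 0.85 threshold are all assumptions, and the database entries and risk notes are placeholders rather than content from the paper's database.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for the harmful-ingredient database. The entries and
# risk notes are illustrative placeholders, not taken from the paper.
HARMFUL_DB = {
    "formaldehyde": "placeholder risk note A",
    "triclosan": "placeholder risk note B",
    "methylisothiazolinone": "placeholder risk note C",
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] between two ingredient names."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def flag_harmful(ingredients, db=HARMFUL_DB, threshold=0.85):
    """Return (extracted name, matched database entry, risk note) tuples.

    An ingredient is flagged when its best-matching database entry reaches
    the threshold. The threshold value is an assumption for this sketch.
    """
    flagged = []
    for name in ingredients:
        # Pick the database entry most similar to the extracted name.
        best = max(db, key=lambda entry: similarity(name, entry))
        if similarity(name, best) >= threshold:
            flagged.append((name, best, db[best]))
    return flagged


# Example input, standing in for the output of the extraction model.
# "Tric1osan" imitates an OCR misread of the letter "l" as the digit "1".
extracted = ["Aqua", "Glycerin", "Tric1osan", "Formaldehyde"]
for found, matched, note in flag_harmful(extracted):
    print(f"{found} -> {matched}: {note}")
```

Fuzzy matching rather than exact lookup is what lets a step like this tolerate small OCR errors in long chemical names. The threshold trades missed matches against false alarms and would need tuning on real label data.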

Data Availability

Due to the nature of the investigation, no supporting data is available for ethical reasons.

Funding

This work has been supported by the Turkish Scientific and Technological Research Council (TÜBITAK) under Grant Number 1919B012322132.

Author information

Authors and Affiliations

Department of Computer Engineering, Çukurova University, 01330, Balcalı, Adana, Sarıçam, Turkey
Çağrı Sayallar
Department of Computer Engineering, Kocaeli University, 41001, Umuttepe, Kocaeli, İzmit, Turkey
Ahmet Sayar

Corresponding author

Correspondence to Çağrı Sayallar.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sayallar, Ç., Sayar, A. Harmful ingredient detection from cosmetic products using optical character recognition and bi-LSTM model. SIViP 19, 338 (2025). https://doi.org/10.1007/s11760-025-03923-0

Received: 18 October 2024
Revised: 26 November 2024
Accepted: 05 February 2025
Published: 24 February 2025
DOI: https://doi.org/10.1007/s11760-025-03923-0
