Abstract
Shopping is an indispensable part of daily life. It is an easy task for sighted people, but it remains a major problem for visually impaired individuals, who today must be accompanied by family or guided by a store escort when shopping; shopping alone is difficult for them. This research develops an artificial intelligence image recognition assistive device based on the Convolutional Neural Network (CNN), providing a smart image recognition module to assist visually impaired individuals while shopping. CNN is the most effective deep learning algorithm in the field of machine vision; its ability to compare detailed exterior features of products makes product recognition more efficient once the model is accurately trained.
This study experiments with task-oriented shopping in three shopping models: (1) self-shopping, (2) accompanied shopping, and (3) device-assisted shopping. It measures three indicators: shopping time, accuracy in choosing the correct product, and device satisfaction. The research subjects are 18 college students, 8 male and 10 female. The subjects are blindfolded to simulate visually impaired individuals performing the experiments without any vision. One-way repeated-measures ANOVA is used to explore the differences among the three shopping models, and surveys are collected at the end of the experiment to analyze satisfaction with the AIoT device. The results of this study are: (1) the task operation times of the three shopping models are significantly different (p < .001), and gender has no significant impact; (2) the task operation accuracy rates of the three shopping models are significantly different (p < .001), and gender also has a significant impact (p < .001). The accuracy rate for self-shopping is 39% (12.5% for men and 60% for women), for accompanied shopping 97.25% (93.75% for men and 100% for women), and for device-assisted shopping 90.25% (87.5% for men and 92.5% for women); (3) the highest satisfaction rating is for the extensiveness of the product information audio, averaging 4.5 points. Satisfaction with the accuracy of the product information audio averages 4.44 points, the effectiveness of the device for shopping assistance averages 4.17 points, and the usability of device operation averages 4.00 points.
1 Introduction
According to statistics from the World Health Organization (WHO) in 2019, at least 2.2 billion people worldwide have a vision impairment [1]. Most people rely on vision in their daily lives, whereas visually impaired people face various difficulties. Shopping is a basic human behavior that meets physiological needs. It is a breeze for sighted people, but visually impaired individuals face many frustrations and restrictions in daily shopping because of their visual impairment. They cannot obtain information through visual observation or by reading text and images; they can only rely on touch and hearing, which are not sufficient to recognize a product correctly. Shopping thus poses a major problem for visually impaired individuals.
Visually impaired individuals usually have to rely on friends, family members, or shopping guides in the store in order to go shopping, and the availability of a companion is a major issue. They must ask about product information, and such repeated requests may cause impatience in the accompanying person while still failing to meet their needs [2], inevitably causing a physical and mental burden. If the store's shopping guides are short of manpower, visually impaired individuals must wait patiently, and the store may even decline to provide a shopping guide service. Visually impaired individuals also risk receiving misguided information, as the wide variety of products makes it difficult for them to learn the contents of the goods. As a result, they may be unable to gather important nutrition facts from the products, or they may end up purchasing the wrong product [3]. Research on the daily shopping behaviors of severely visually impaired people also points out that what worries them most while shopping is being unable to compare product information in detail; they need to rely on the assistance of others [4]. After shopping, visually impaired individuals must use memory to recall information about the products they purchased, such as by touching the material and packaging size, shaking the product to hear its sound, and smelling the package [5]. They must also remember the placement of various items at home, which family members then cannot move; this increases the burden on their lives and on communication between visually impaired individuals and their families.
With the advent of Artificial Intelligence (AI), the development and application of AI technology have taken root in all areas of human society, contributing to technological advancement in transportation, finance, medical treatment, entertainment, education, and so on. Image recognition has notably flourished in recent years, resulting in unmanned stores, face recognition, etc. AI is also used in image recognition for mobile devices, such as Aipoly Vision, VocalEyes AI, and Seeing AI, which use mobile phone cameras to capture the people, events, and objects around the user and describe the captured contents through audio to visually impaired individuals. These innovations increase visually impaired individuals' efficiency with daily chores. By providing product image recognition through an AIoT device, this study aims to assist visually impaired people when shopping and to improve their independent and autonomous lives.
Due to the limitations of impaired vision, visually impaired people are unable to obtain product information while shopping. They worry about asking others for assistance, which may cause psychological pressure and real-life difficulties. To help visually impaired individuals become more independent, this research assists them where they cannot differentiate products or read product information while shopping. The study develops a device combining artificial intelligence technology with AIoT hardware and software, so that visually impaired users can identify a product by name, composition, price, and expiration date. The purpose of the study is to explore how visually impaired people of different genders use the artificial intelligence device combined with the Internet of Things while making purchases. The actual measurement is conducted with three indicators: shopping time, accuracy in choosing the correct product, and device satisfaction.
2 Literature Review
2.1 Convolutional Neural Network
A convolutional neural network (CNN) is a kind of feedforward neural network. It is the most common and most suitable architecture for image recognition. A convolutional neural network is composed of three kinds of neural layers: the convolution layer, the pooling layer, and the fully connected layer. The convolution layer uses convolution kernels to capture picture features, computing weighted sums over pixel values to generate feature maps. Because of the huge number of feature parameters after convolution, only the most important features are kept by the pooling layer, which reduces inference complexity. After multiple layers of convolution and pooling operations, the features finally enter the fully connected layer to be combined and classified, and the result is output. Convolutional neural networks have developed vigorously in the vision field, for example in image classification [7] and super-resolution imaging [8], all moving towards high accuracy and high efficiency.
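The convolution-and-pooling pipeline described above can be sketched in a few lines of NumPy; this is a toy illustration of the two operations, not the network trained in this study:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation): slide the kernel over
    the image and take weighted sums to produce a feature map."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keep only the strongest response
    in each size x size window, reducing the feature map."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size
    return fmap[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)    # toy 6x6 "image"
edge_kernel = np.array([[1.0, -1.0], [1.0, -1.0]])  # simple edge detector
fmap = conv2d(image, edge_kernel)   # 5x5 feature map
pooled = max_pool(fmap)             # 2x2 map after pooling
```

A real CNN stacks many such layers with learned kernels and a fully connected classifier on top, but the mechanics of each layer are exactly these two operations.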
2.2 AIoT (The Artificial Intelligence of Things)
The Internet of Things (IoT) was proposed by Kevin Ashton of the Automatic Identification Center at the Massachusetts Institute of Technology in 1999 [9]. It is the idea of connecting objects to the Internet so that an item can sense, detect, respond, and communicate data with other objects. By using the collected information for big data analysis and inference, intelligent identification and management can be fulfilled through IoT technology. The three-level infrastructure of the Internet of Things comprises the Perception Layer (sensors and actuators), the Network Layer (interconnection), and the Application Layer (models and analytics). The perception layer senses and recognizes; the network layer passes the data collected by the perception layer to the network and data storage; the application layer integrates the technology for real-world applications. The diverse nature of the Internet of Things enables it to be used in many areas, person to object and object to object, facilitating different aspects of human society. The combination of AI and IoT is AIoT (the Artificial Intelligence of Things). Through the Internet of Everything, IoT sensors collect and accumulate large amounts of data stored in the cloud, and AI then performs big data analysis. This combination of the strengths of AI and IoT has become one of the hottest topics and trends today, as in unmanned storage, drone delivery, and unmanned stores. In 2018, the Amazon Go smart retail store officially opened in Seattle, USA. Through Computer Vision and Just Walk Out technology, a dense camera system installed on the ceiling tracks each customer and identifies whether the consumer takes or replaces goods before checking out with the customer's linked credit card. AIoT applications will greatly affect human shopping behavior and bring convenience to human life.
This study uses Google AutoML Vision machine learning for product identification: a product image dataset is generated from a large number of product images to achieve accurate product predictions, a model specific to visually impaired shopping is built, and the model is integrated with AIoT hardware devices in a prototype design.
2.3 The Shopping Model for the Visually Impaired Individuals
Visually impaired individuals need the ability to take care of themselves independently, including shopping skills. Due to limited vision, shopping alone is a difficult challenge. Facing a wide range of products in the store, visually impaired individuals have no way of knowing the product information unless they rely on the assistance of others to complete shopping. They may be able to obtain necessary supplies, such as food or daily necessities, with the help of family or friends; however, they encounter inevitable psychological burden and distress when purchasing more personal products. Visually impaired individuals may also be afraid of bothering others, so they usually buy the same products over and over, depriving themselves of the option to find alternatives. When visually impaired people shop in physical stores, they need to draw up a shopping list, memorize it or record it with recording equipment or braille, reserve the date and time of store service in advance, and go to a store where a store assistant is available. The assistant may fetch the designated goods directly, or accompany the visually impaired individual through the entire shopping process, potentially causing many inconveniences.
2.4 The Current Shopping Assistance for the Visually Impaired Individual
Stores today do not provide visually impaired individuals with an optimized shopping environment, nor can they offer appropriate assistance. The NextVPU company launched smart eyewear, Angel Eye, in 2017, using computer vision and artificial intelligence technology and offering visual recognition, navigation, and positioning functions. In 2004, RoboCart, a robot shopping cart, was equipped with a computer and rangefinder to guide visually impaired individuals to product locations, obtaining product information with a handheld RFID barcode scanner [10]. The ShopTalk wearable device uses a barcode scanner to scan product barcodes and gives voice feedback through headphones to inform the visually impaired of product information [11]. Lin's (2019) research on three shopping models for visually impaired individuals with smart wearable devices indicates that the differences in average accuracy rates among the three modes reach a significant level: the accuracy rate of device shopping is 95%, significantly higher than the 75% for shopping alone and 75% for accompanied shopping, while task completion times show no significant differences [12]. This study likewise explores the effectiveness of the device by implementing the three shopping models. Handheld mobile devices offer an ever-increasing number of apps that can help the visually impaired identify products, including Be My Eyes, Aipoly Vision, Envision, Visualize, VocalEyes AI, Seeing AI, etc., all of which can assist blind people in recognizing their surroundings.
Based on the current situation of the visually impaired individuals, this research develops AIoT shopping devices to assist the visually impaired individuals with a goal of improving their shopping experience.
3 Research Method
This study conducts research on assisting visually impaired individuals with artificial intelligence image recognition by having test subjects perform task-oriented shopping in three different models: self-shopping, accompanied shopping, and device-assisted shopping. IBM SPSS Statistics Version 26 is used for data analysis of operation time, product acquisition accuracy, and device satisfaction. The research subjects, research tools, and research design are described as follows:
3.1 Research Subjects
The research subjects of this study are 18 college students, aged 18 to 21, 8 male and 10 female. The purpose and procedures of the experiment are explained to the students, who sign a consent form before participating. During the test they are blindfolded to simulate visually impaired individuals, and the designated tasks are verified in the simulated store.
3.2 Research Tools
This study uses an AIoT-assisted shopping device to conduct the device shopping experiment, and simulates a store by displaying 17 commercial products from the market.
Establish the AIoT Product Training Dataset:
The machine learning training dataset requires a large number of images to train product models. This study uses an open-source Python tool from GitHub, google_images_download, to download Google images in batches; the tool is cross-platform and shortens the time needed to collect product images. ChromeDriver is used when the number of downloaded pictures exceeds 100. After collecting the pictures, the program checks whether they match the products. Where the pictures are insufficient for effective recognition, photos are taken manually from various angles to increase the recognition rate. The image file list is converted to CSV format and stored in the Google AutoML cloud dataset.
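As an illustration of this dataset-preparation step, the sketch below builds the kind of CSV index that Google AutoML Vision imports, where each row pairs a Cloud Storage image URI with its label. The folder layout (one folder per product label) and the bucket name are illustrative assumptions, not the paper's actual paths:

```python
import csv
from pathlib import Path

def build_automl_csv(image_root, bucket, out_csv):
    """Walk a local tree of label-named folders of .jpg files and write
    one `gs://bucket/label/file,label` row per image, the row format
    AutoML Vision expects when importing a dataset.
    (Bucket name and layout are hypothetical, for illustration only.)"""
    rows = []
    for img in sorted(Path(image_root).rglob("*.jpg")):
        label = img.parent.name  # the folder name doubles as the label
        rows.append([f"gs://{bucket}/{label}/{img.name}", label])
    with open(out_csv, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return len(rows)
```

The resulting CSV is then uploaded alongside the images so that the AutoML dataset knows which label each training picture belongs to.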
Google AutoML Vision is used to develop the product image training: the image search and download tool collects training images to a computer, a product dataset is built, a machine learning model is trained, and images are classified according to the product labels defined for the simulated store. To improve model accuracy, the trained model is iterated continuously to optimize precision and recall, and the training model with the highest accuracy rate is selected to connect with the AutoML API. The confusion matrix visualizes the performance of the AI device's product module: by observing the error rates for product images, the table helps identify which classification errors occur, so that the affected labels can be trained further to enhance accuracy. The threshold tool affects the model's precision and average recall; the closer the score approaches 1.0, the better the model performs during the test. Finally, a Python program connects the trained model with the camera lens of the IoT device, the Acer CloudProfessor.
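Per-class precision and recall of the kind AutoML reports can be derived directly from the confusion matrix. The sketch below shows the computation on a hypothetical two-class example; the numbers are illustrative, not the trained model's:

```python
import numpy as np

def per_class_metrics(cm):
    """cm[i, j] = number of images of true class i predicted as class j.
    Returns per-class precision (column-wise) and recall (row-wise)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                   # correctly classified counts
    precision = tp / cm.sum(axis=0)    # of everything predicted as class j
    recall = tp / cm.sum(axis=1)       # of everything truly class i
    return precision, recall

# Toy example: 9 black-tea images, 2 misread as milk tea;
# 9 milk-tea images, 1 misread as black tea.
cm = [[7, 2],
      [1, 8]]
p, r = per_class_metrics(cm)
```

Averaging these per-class values gives the single precision/recall figures that the model evaluation page summarizes.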
Shopping Device with AIoT Assistance:
Pushing the button triggers the autofocus camera lens; the captured product image is sent to the product dataset for identification, and the Bluetooth headset then plays the recognition result, the product's name, price, composition, and expiration date, by voice to the visually impaired individual, allowing them to confirm the products purchased.
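The recognition-to-voice step can be sketched as follows. The field names, score threshold, and message wording are illustrative assumptions, not the device's actual API:

```python
def compose_announcement(prediction, min_score=0.6):
    """Turn a recognition result into the sentence played over the
    Bluetooth headset. The dict fields and the 0.6 confidence
    threshold are hypothetical, for illustration only."""
    if prediction["score"] < min_score:
        # Low-confidence result: ask the user to re-aim the camera.
        return "Product not recognized, please try again."
    return (f"{prediction['name']}, price {prediction['price']} dollars, "
            f"contains {prediction['composition']}, "
            f"expires {prediction['expiry']}.")

msg = compose_announcement({
    "name": "Uni-President Black Tea", "score": 0.93,
    "price": 20, "composition": "black tea, sugar", "expiry": "2020-12-31",
})
```

In the actual device this string would be matched to the pre-recorded WAV files in the product label voice dataset rather than synthesized on the fly.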
AIoT hardware: Acer CloudProfessor (Windows 10), Arduino Leonardo, expansion board module, Logitech autofocus camera, USB hub, trigger button, and wireless Bluetooth headset, as in Fig. 1.
(1) Product Label Voice Dataset:
Seventeen items on the market are selected; the product name, price, composition, and expiration date of each are recorded as audio, adjusted for speed and volume, and saved in WAV format.
(2) Product Shelf Installment:
The experiment simulates a real store, with shelving based on research into relevant stores. The best viewing and display range is from 60 cm to 150 cm [13]; this study sets the display height of the products at 80 cm to 150 cm, with one item on each rack for a total of 17 items, as in Fig. 2.
3.3 Test Design
This experiment selects 17 commercial products, mainly snacks, beverages, and daily commodities. The experimental tasks are carried out in three shopping modes: (A) shopping alone, (B) accompanied shopping, and (C) device-assisted shopping. The locations of the goods are changed between tasks to avoid learning effects. In each task the subject must find four designated commodities in the simulated store. The experiment records the time required for the task, the product identification rate, and the device satisfaction rate; a five-point Likert-scale questionnaire is used to measure satisfaction.
(1) Task A. Shopping Alone
The subject shops for the products specified by the designator, entering the simulated store alone.
(2) Task B. Accompanied Shopping
The subject shops for the products specified by the designator, entering the simulated store together with an accompanying person.
(3) Task C. Device-Assisted Shopping
The subject shops for the products specified by the designator, entering the simulated store wearing the device. The subject must use the device to identify products, listen to the product information, and find the products.
4 Results
4.1 Research Subjects Results of Establishing AIoT Product Training Dataset
This research uses Google AutoML Vision to build a dataset of 1,329 product images. After four optimization iterations, the final model has an average precision of 0.976 and a recall of 0.920, as in Fig. 3.
The confusion matrix shows that the accuracy rate for Uni-President Black Tea is 67%, with 22% of its images misrecognized as Uni-President Milk Tea and 11% as Apple Cidra. The error rates for Kinder Chocolate Maxi, toothpaste, and fruit candy each reach 14%.
4.2 Analysis Result for Time Operation Effectiveness Record
A total of 18 college students wear blindfolds: 8 males (44% of the sample) and 10 females (56%).
Task one is shopping alone: the average operation time is 204.709 s (SD 159.909); males 159.941 s (SD 113.303), females 240.524 s (SD 187.375). Task two is accompanied shopping: the average operation time is 208.946 s (SD 69.500); males 225.963 s (SD 76.104), females 195.332 s (SD 64.465). Task three is device-assisted shopping: the average operation time is 377.858 s (SD 203.964); males 456.771 s (SD 191.260), females 377.858 s (SD 200.412). As shown in Table 1.
As Table 2 shows, the one-way repeated-measures ANOVA meets the assumption of homogeneity of variance and the sphericity assumption: Mauchly's W = .697 (χ² = 5.415, p = .067 > .05), so no correction is required.
Since the data comply with the sphericity assumption, the sum of squares for the shopping-mode effect is 386285.378, the mean square is 193142.689, and F(2, 32) = 11.344, p < .001. This indicates that the subjects' task operation times differ significantly under the different shopping modes.
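The F statistics reported here were computed in SPSS. As a sketch of the same computation, a one-way repeated-measures ANOVA can be written directly in NumPy (toy data below, not the study's measurements):

```python
import numpy as np

def rm_anova(data):
    """One-way repeated-measures ANOVA.
    data: (subjects x conditions) array, e.g. task times per shopping mode.
    Returns the F statistic and its degrees of freedom."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    grand = data.mean()
    # Variation between conditions (the effect of interest).
    ss_cond = n * ((data.mean(axis=0) - grand) ** 2).sum()
    # Variation between subjects (removed, since subjects repeat all modes).
    ss_subj = k * ((data.mean(axis=1) - grand) ** 2).sum()
    ss_total = ((data - grand) ** 2).sum()
    ss_error = ss_total - ss_cond - ss_subj
    df_cond, df_error = k - 1, (n - 1) * (k - 1)
    F = (ss_cond / df_cond) / (ss_error / df_error)
    return F, df_cond, df_error

# Toy data: 3 subjects measured under 2 conditions.
F, df1, df2 = rm_anova([[1, 2], [3, 4], [2, 5]])
```

Removing the between-subject variation from the error term is what distinguishes this design from an independent-groups ANOVA and gives the F(2, 32) statistics in Tables 2 and 6.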
Taking gender as the between-subjects factor, the sum of squares for the interaction effect is 110138.553, the mean square is 55069.276, and F(2, 32) = 3.234, p > .05, indicating that gender makes no significant difference in the subjects' task operation time across shopping modes. As in Table 3.
Pairwise comparisons of the repeated-measures ANOVA show that the subjects' task operation times in the three shopping modes follow C > B > A, as in Table 4.
4.3 Analysis Result for Task Accuracy Rate
Task one is shopping alone: the average accuracy rate is 39% (SD 28.725); males 12.5% (SD 18.9), females 60% (SD 12.9). Task two is accompanied shopping: the average accuracy rate is 97.25% (SD 8.075); males 93.75% (SD .000), females 100% (SD .000). Task three is device-assisted shopping: the average accuracy rate is 90.25% (SD 12.55); males 87.5% (SD 13.375), females 92.5% (SD 12.075), as in Table 5.
As shown in Table 6, the one-way repeated-measures ANOVA meets the assumption of homogeneity of variance and the sphericity assumption: Mauchly's W = .891 (χ² = 1.733, p = .420 > .05), so no correction is required.
Since the data comply with the sphericity assumption, the sum of squares for the shopping-mode effect is 62.689, the mean square is 31.344, and F(2, 32) = 127.503, p < .001, indicating that the subjects' task accuracy rates differ significantly among the shopping modes.
Taking gender as the between-subjects factor, the sum of squares for the interaction effect is 8.319, the mean square is 4.159, and F(2, 32) = 16.919, p < .001, indicating that gender makes a significant difference in task accuracy across shopping modes, as in Table 7.
Pairwise comparisons of the repeated-measures ANOVA show that the subjects' task accuracy rates in the three shopping modes follow B > C > A, as in Table 8.
4.4 Analysis Result for Device Satisfaction
The highest satisfaction rating is for the extensiveness of the product information audio, averaging 4.5 points. The accuracy of the product information audio averages 4.44 points, the effectiveness of the device for shopping assistance averages 4.17 points, and the usability of device operation averages 4.00 points.
5 Conclusion and Discussion
The experiment indicates that the task operation times of the three models differ significantly. The device-assisted shopping model took longer to complete because the input and output of the device take time: with a handheld camera lens, the subject may not be able to confirm that the product has been captured, and lighting or angles may also limit the device's recognition accuracy. When the device fails to recognize a product, the audible notification takes time, and repeated recognition attempts lengthen the operation time. Time in the accompanied shopping model varies with the personality and lifestyle of the test subject: some subjects rely entirely on the companion, while others tend to judge on their own by touch and smell before turning to the companion. When subjects shop alone without any help, they resort to past shopping experience to identify packages, recognize products by smell, and guess by intuition. The test time for this model is short, but the accuracy rate is lower than in the other shopping models.
The task accuracy rates differ significantly among the three shopping modes, and gender also contributes to a prominent disparity. In device-assisted shopping, products that the model confuses in the confusion matrix, such as Uni-President black tea and milk tea, are easy to pick wrongly because of likely recognition mistakes. Accompanied shopping is highly accurate because the companion assists. In shopping alone, the accuracy rate for women is higher than for men; the research notices that women are more familiar with the products. Sunscreen lotion, for instance, is a product frequently purchased by women and seldom by men, and for this particular item the accuracy rate is lower for men.
This study probes whether the device efficiently assists visually impaired individuals in shopping. The experimental results show that device-assisted shopping, which broadcasts voiced product information, can effectively enhance the shopping ability of a visually impaired individual shopping alone. System recognition is expected to speed up in the future; easing lighting restrictions and enlarging the field of view are expected to raise device-assisted shopping to the level of accompanied shopping and to solve the problem of insufficient accompanying manpower. The subjects in this experiment are blindfolded, and because of discrepancies between the daily habits of sighted and visually impaired individuals, this research is only a preliminary attempt; future experiments will invite visually impaired individuals to test the actual device. As more and more handheld mobile devices are used by visually impaired individuals, our future plans include developing a multi-platform version of the software for handheld mobile devices.
References
World Health Organization: World report on vision. https://www.who.int/publications-detail/world-report-on-vision. Accessed 17 Jan 2019
Kulyukin, V., Kutiyanawala, A.: Accessible shopping systems for blind and visually impaired individuals: design requirements and the state of the art. Open Rehabil. J. 3, 158–168 (2010)
Kostyra, E., Zakowska-Biemans, S., Sniegocka, K., Piotrowska, A.: Food shopping, sensory determinants of food choice and meal preparation by visually impaired people. Obstacles and expectations in daily food experiences. Appetite 113, 14–22 (2017)
Chuang, Y.: A Study of People with Severe Visual Impairments regarding their daily shopping behaviors (2013)
Lin, C.: A study of the effect of wearable assistive shopping device on visual impaired shoppers (2019)
Li, K.F., Wang, Y.K.: Artificial Intelligence is Coming. Commonwealth (2018)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)
Dong, C., Loy, C.C., He, K., Tang, X.: Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 295–307 (2016)
Ashton, K.: That ‘Internet of Things’ thing. RFID J. 22, 97–114 (2009)
Gharpure, C.P., Kulyukin, V.A.: Robot-assisted shopping for the blind: issues in spatial cognition and product selection. Intel. Serv. Robot. 1(3), 237–251 (2008)
Nicholson, J., Kulyukin, V., Coster, D.: ShopTalk: independent blind shopping through verbal route directions and barcode Scans. Open Rehabil. J. 2, 11–23 (2009)
Hung, Y.-H., Feng, C.-H., Lin, C.-T., Chen, C.-J.: Research on wearable shopping aid device for visually impaired people. In: Antona, M., Stephanidis, C. (eds.) HCII 2019. LNCS, vol. 11572, pp. 244–257. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-23560-4_18
Lin, C.S., Huang, Y. J.: Success Operation of Shopping Mall. Tiaoho Culture Co. (1999)
© 2020 Springer Nature Switzerland AG
Feng, CH., Hsieh, JY., Hung, YH., Chen, CJ., Chen, CH. (2020). Research on the Visually Impaired Individuals Shopping with Artificial Intelligence Image Recognition Assistance. In: Antona, M., Stephanidis, C. (eds) Universal Access in Human-Computer Interaction. Applications and Practice. HCII 2020. Lecture Notes in Computer Science(), vol 12189. Springer, Cham. https://doi.org/10.1007/978-3-030-49108-6_37