DOI: 10.1145/3686081.3686108
research-article
Open access

Mining Novel Customer Needs from Online Product Review

Published: 18 November 2024

Abstract

Identifying new customer needs is essential for companies to take advantage of evolving technology and social trends. However, the traditional methods used to discover emerging needs are time-consuming, expensive, and labor-intensive, often leading to delays in product development. Recent years have witnessed the emergence of online product reviews as a promising alternative for uncovering fresh customer requirements. In this study, we propose using online reviews to identify new customer needs by treating the task as a text classification problem. We exploit the BERT language model, which is pre-trained on a general text corpus, to build a classifier that detects reviews containing innovative content. Our experiments validate the effectiveness of this structured approach, even when dealing with reviews of varying lengths. By implementing this methodology, companies can automate the process of identifying new customer needs quickly and efficiently, reducing the need for extensive expert resources. This advancement in research could have implications for product development and other related fields.

1 Introduction

Identifying customer needs is crucial for companies to discover new product opportunities, improve existing products, and adapt to evolving socio-economic and technological changes [1]. This can lead to profitable outcomes, as seen with the success of SUVs and innovative detergent products [2]. However, traditional methods of understanding customer needs, such as surveys and focus groups, are expensive, time-consuming, and labor-intensive [3]. With increasing global competition and the need for faster time-to-market, there is a demand for a more efficient technique to detect novel needs during the early stages of product design.
Online customer reviews, which are abundant and easily accessible on the internet, provide a promising alternative for identifying novel customer needs. Researchers have recognized the value of mining user-generated content (UGC), particularly online reviews, to understand consumer preferences. Previous studies have shown that analyzing UGC can be as effective as traditional methods in identifying customer needs [4]. However, there is a lack of efficient techniques for identifying novel needs from UGC. Many companies still rely on manual screening of UGC data to generate new product ideas.
Efficient methods for mining novel needs from UGC must overcome several challenges. First, UGC data is unstructured and requires advanced natural language processing technologies to extract meaningful information [4]. Second, novel needs are often unexpected and not expressed in existing UGC, so a sensitive needs-detection method is required to identify emerging needs. Finally, UGC contains irrelevant information, such as complaints about sales service, so irrelevant content must be filtered out in order to focus on the informative parts.
This paper addresses the above challenges and proposes an efficient method for mining UGC to identify novel customer needs. We tackle the problem from a text classification perspective by classifying reviews as positive (containing novel needs) or negative (lacking novel content). Pre-trained large language models have become the mainstream approach in NLP research owing to their superior performance. In this paper, we exploit the BERT language model to extract semantic features and capture the complex language used in unstructured reviews. The extracted features are then fed to a softmax classifier that maps them to the class labels. We also carried out extensive experiments to demonstrate the performance of the developed method.

2 Relevant Work

Identifying customer needs in product design and marketing research has been a research focus for a long time. Traditional methods, like focus groups [5], the lead user method [6], quality function deployment (QFD) [7], and conjoint analysis [8], are commonly used at the early stages of design. But they have limitations in terms of effectiveness or efficiency when it comes to identifying novel customer needs.
Recently, online product reviews have been accepted as an important resource for researchers and practitioners to understand customer needs. Several studies have explored different approaches to analyzing these reviews and extracting relevant insights. For example, Zhai et al. summarized review text to identify and rank customer needs, with a focus on those that are critical for design purposes [9]. Liu et al. integrated domain knowledge with online reviews using a knowledge management system for product development, aiming to improve the design process [10]. Decker and Trusov aggregated customer needs from review text and examined how the extracted product information influences customer satisfaction [11]. Zhou et al. combined product reviews with customers' purchasing records to determine customer preferences for products through multi-view LDA, a technique that connects product-related topics with customers' purchasing motivations [12]. Wang et al. analyzed review text and proposed a multitask learning model [13] to map customer needs expressed in lay language to technical engineering characteristics. They further incorporated a domain knowledge base into the deep learning network structure to improve classification effectiveness [14].
Recent research has also explored the use of online product reviews to elicit useful information and study business-related issues. Chen et al. applied online product reviews to extract semantic features and improve the effectiveness of product recommendation [15]. Liu et al. developed a dual attention machine learning method to elicit textual and semantic information from online user-generated content, aiming to improve the interpretability of recommendations [16]. Timoshenko and Hauser applied a supervised deep network structure, i.e., a CNN, to identify and cluster sentences from product review text, which were then manually reviewed by analysts to find information related to customer needs [4]. However, this work was not specifically focused on identifying novel customer needs and required human expert involvement. Some studies have focused on extracting defect information from online reviews, but defects are different from novel customer needs. To date, there is limited research on automatically identifying novel needs from user-generated content, and this research direction is still at an early phase. More efficient methods that apply machine learning and natural language processing are needed to elicit ideas for product development from online user-generated content.

3 Method

3.1 BERT as the Feature Extractor

Given BERT's strong ability to automatically extract semantic information from text, we employ it to encode the language in reviews. BERT is a pre-trained language model built on the transformer architecture [17]. Through bidirectional training, BERT captures the context and meaning of a word from both the words that precede it and the words that follow it, in contrast to typical models that process words sequentially in one direction. This bidirectionality allows BERT to capture the subtleties and relationships within a sentence. A main characteristic of BERT is its two-stage process of pre-training and fine-tuning. During pre-training, BERT is trained on a large corpus of unlabeled text, including books and Wikipedia articles, and learns general language representations from this unsupervised learning. During fine-tuning, BERT is further trained with labeled data on specific downstream tasks, such as text classification or sentiment analysis.
BERT's network contains multiple transformer layers, as shown in Figure 1 [18]. The self-attention mechanism in each transformer layer enables the model to assign relative weights to the words in a sentence according to how relevant they are to one another, as illustrated in Figure 2. Owing to this attention mechanism, BERT can effectively capture contextual information and long-range dependencies. Specifically, let \({\bf x} = (x_1, x_2, \ldots, x_n)\) be a review containing \(n\) words, associated with a label \(y\). Then \({\bf x}\) is fed into BERT and transformed into a feature vector \({\bf H} = {\rm BERT}({\bf x})\).
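To make the self-attention weighting concrete, the following is a minimal sketch of single-head scaled dot-product attention as described in [18]. It is an illustration only, not the configuration used in our experiments: the token embeddings are hypothetical, and the learned query, key, and value projections of a real transformer layer are replaced by the identity to keep the example short.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(X):
    """Single-head scaled dot-product self-attention.

    X is a list of n token vectors of dimension d. Assumption: queries,
    keys, and values are the raw vectors themselves (identity projections).
    Returns (outputs, weights); weights[i][j] is how strongly token i
    attends to token j.
    """
    d = len(X[0])
    outputs, weights = [], []
    for q in X:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X]
        w = softmax(scores)
        weights.append(w)
        # Each output is the attention-weighted average of all token vectors.
        outputs.append([sum(w[j] * X[j][c] for j in range(len(X)))
                        for c in range(d)])
    return outputs, weights


# Hypothetical 2-dimensional embeddings for a three-token review fragment.
tokens = ["battery", "lasts", "long"]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(X)
for tok, w in zip(tokens, weights):
    print(tok, [round(v, 3) for v in w])
```

Each row of `weights` sums to one, so every output vector is a convex combination of the token vectors in the review; this is the sense in which a token's representation depends on its whole context.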

3.2 Model Training

The BERT encoding \({\bf H}\) is fed into a softmax layer to generate a probability distribution for the label \(y\) over the label set: \(P(y = y_k \mid {\bf x}) = \frac{\exp({\bf W}_k \cdot {\bf H})}{\sum_j \exp({\bf W}_j \cdot {\bf H})}\) [19]. Specifically, we use a binary label, denoted \(y_k\), to indicate whether the needs are new or not. The parameter \({\bf W}_k\) is estimated during the model training stage by minimizing the cross-entropy loss \(L_{CE} = -\sum_{i=1}^{|{\bf T}|} \sum_{j=1}^{|{\bf C}|} y_j^i \log P_j^i\), where \(|{\bf T}|\) and \(|{\bf C}|\) denote the size of the training dataset and the number of class labels, respectively.
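The softmax mapping and the cross-entropy loss above can be sketched in a few lines. The weight matrix, feature vectors, and labels below are hypothetical toy values chosen only for illustration; in the actual model, H would be the BERT encoding and W would be learned by gradient descent.

```python
import math


def softmax_probs(W, H):
    """P(y = y_k | x) = exp(W_k . H) / sum_j exp(W_j . H)."""
    logits = [sum(w * h for w, h in zip(W_k, H)) for W_k in W]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(W, features, labels):
    """L_CE = -sum_i sum_j y_j^i log P_j^i for one-hot labels.

    With one-hot labels only the term of the true class survives,
    so the inner sum reduces to log P of the true class.
    """
    loss = 0.0
    for H, y in zip(features, labels):
        P = softmax_probs(W, H)
        loss -= math.log(P[y])
    return loss


# Hypothetical values: class 0 = no novel need, class 1 = novel need.
W = [[0.5, -0.2, 0.1],
     [-0.3, 0.8, 0.4]]
features = [[1.0, 0.0, 0.5],   # stand-in for H of a negative review
            [0.2, 1.5, 1.0]]   # stand-in for H of a positive review
labels = [0, 1]

print(softmax_probs(W, features[0]))
print(cross_entropy(W, features, labels))
```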
Figure 1: The structure of BERT in the task
Figure 2: The details of the transformer layer [18]

4 Experiment

4.1 Dataset

We collected laptop reviews from amazon.com between 2017 and 2022, totaling 16,667 reviews. Two annotators with IT backgrounds independently labeled the reviews as either positive (containing novel needs content) or negative (lacking novel needs content). Out of the total reviews, only 187 were labeled as positive.
The dataset was randomly divided into a training set (40%), a validation set (20%), and a testing set (40%). These sets were used for model training, parameter selection, and performance evaluation, respectively.
We evaluate the developed method using precision@k, recall@k, and F1@k, which are commonly used performance metrics in text classification [20]. These metrics indicate how well the model locates relevant samples within the top-k ranked results. Precision@k emphasizes the relevance of the selected samples: it is the proportion of correctly identified positive samples among the top-k ranked testing samples, so a higher value means that more of the selected samples are truly positive. Recall@k emphasizes the completeness of the selected samples: it is the ratio of correctly identified positive samples within the top-k ranked samples to all positive samples that should have been selected, so a higher value means that more of the positive samples are retrieved. F1@k is the harmonic mean of precision@k and recall@k and combines relevance and completeness into a single balanced metric. Higher F1@k values indicate better overall performance in selecting relevant and complete samples within the top-k ranked results.
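A random 40/20/40 split of this kind can be sketched as follows. The function name, the fixed seed, and the use of integer review identifiers are assumptions made for illustration; the source states only that the split was random and does not say whether it was stratified by class.

```python
import random


def split_dataset(items, train=0.4, val=0.2, seed=0):
    """Shuffle items and cut them into train / validation / test parts.

    The test part receives whatever remains after the first two cuts,
    so no item is lost to rounding.
    """
    items = list(items)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(items)
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


# 16,667 reviews, represented here by hypothetical integer identifiers.
review_ids = range(16667)
train_set, val_set, test_set = split_dataset(review_ids)
print(len(train_set), len(val_set), len(test_set))
```

With only 187 positive reviews, a purely random split may distribute the positives unevenly across the three sets, which is one reason a stratified variant might be considered in practice.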
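As an illustration of these definitions, the sketch below computes precision@k, recall@k, and F1@k from classifier scores. It assumes the common convention that recall@k divides by the total number of positive samples in the test set; the scores and labels are hypothetical.

```python
def metrics_at_k(scores, labels, k):
    """Return (precision@k, recall@k, F1@k).

    scores: predicted probability of the positive class for each sample.
    labels: 1 for a positive (novel-need) sample, 0 otherwise.
    """
    # Rank samples from the highest to the lowest score and keep the top k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in ranked[:k])
    total_pos = sum(labels)

    precision = hits / k
    recall = hits / total_pos if total_pos else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical scores for six test reviews, three of which are positive.
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1]
labels = [1, 0, 1, 0, 0, 1]
p, r, f = metrics_at_k(scores, labels, k=3)
print(round(p, 4), round(r, 4), round(f, 4))
```

In this toy example the top three reviews contain two of the three positives, so precision@3 and recall@3 are both 2/3.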

4.2 Experiment Results

We compare the developed approach with popular text classification approaches, namely fastText [21], CNN [22], and BiLSTM [23], which are commonly used in natural language processing tasks.
For CNN, we used 64-dimensional GloVe embeddings, a batch size of 32, and kernel sizes of 2, 3, and 4. The learning rate was set to 5e-3 through experimentation, and the dropout rate was set to 0.5. For BiLSTM, we used the same settings for embedding dimension, batch size, dropout rate, and learning rate because they gave the best performance. The results are reported in Table 1.
Table 1: Overall performance in identifying positive samples from the testing dataset
Model       Precision (overall)   Recall (overall)   F1 (overall)
fastText    0.9394                0.5533             0.5927
CNN         0.9951                0.5667             0.6152
BiLSTM      0.8750                0.6661             0.7293
BERT        0.9060                0.8524             0.8773
According to Table 1, BERT achieved the highest recall and F1 score among all methods, although CNN and fastText obtained higher overall precision. The overall precision scores for all methods were high because of the imbalanced class distribution, with the negative testing data dominating the overall performance. However, when focusing on the positive sample set, the proposed method showed substantially better recall than the other methods. In contrast, fastText, a non-deep learning approach, struggled to extract useful features from the text, resulting in the worst performance. CNN, LSTM, and transformer-based structures like BERT are commonly used for encoding both syntactic and textual features from text [24]. The experimental results indicate that BERT was the most effective in extracting features and identifying novel customer needs.
BERT's transformer-based architecture is a key factor in its effective capture of contextual information [25]. Unlike models that only consider local dependencies, transformers take into account the entire context of a word or sentence. This ability is especially advantageous for text classification tasks, allowing BERT to comprehend the relationships between words and phrases within the text, even when faced with class imbalance. BERT also benefits from its pre-training on large corpora, which helps it capture broad language features and subtleties [17]. By fine-tuning BERT on annotated reviews specifically for identifying novel customer needs, the model can adapt its existing knowledge to better handle imbalanced datasets and improve its proficiency in handling both positive and negative reviews [26].
Table 2: Per-class and overall performance of the BERT-based classifier
Testing class            Precision   Recall   F1
Positive class           0.8154      0.7067   0.7571
Negative class           0.9967      0.9982   0.9974
Overall macro average    0.9060      0.8524   0.8773
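The macro-average row of Table 2 appears to be the unweighted mean of the two per-class rows. The short check below reproduces it from the reported per-class values to within rounding; the dictionary layout and function name are ours and serve only as an illustration.

```python
# Per-class results of the BERT-based classifier, copied from Table 2.
per_class = {
    "positive": {"precision": 0.8154, "recall": 0.7067, "f1": 0.7571},
    "negative": {"precision": 0.9967, "recall": 0.9982, "f1": 0.9974},
}


def macro_average(per_class, metric):
    """Unweighted mean of a metric over the classes."""
    values = [row[metric] for row in per_class.values()]
    return sum(values) / len(values)


for metric in ("precision", "recall", "f1"):
    print(metric, f"{macro_average(per_class, metric):.5f}")
```

Because each class counts equally regardless of its size, the macro average is pulled down by the minority positive class far more than a sample-weighted average would be, which makes it a reasonable summary under class imbalance. Note also that the macro F1 matches the mean of the per-class F1 values rather than the harmonic mean of the macro precision and macro recall.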
Table 1 reports the overall text classification performance; to gain further insight, we also analyze the performance of the BERT-based classifier on each class. The overall performance and the results for each of the two classes are shown in Table 2. The results show that the BERT-based classifier obtains good precision, recall, and F1 scores on both classes. In our task, the pre-trained BERT model is fine-tuned with task-specific labeled data. This fine-tuning process allows BERT to adapt its prior knowledge to the unbalanced dataset, which is especially pertinent for text classification problems that involve class imbalance, as this work explores [27]. By training on annotated reviews that specifically target the identification of novel customer needs, BERT becomes proficient in handling both positive and negative reviews. Furthermore, BERT employs subword units to represent words, enabling it to handle out-of-vocabulary words. This capability proves advantageous in scenarios with class imbalance, as the minority class may contain unique or unseen phrases that are not adequately represented in the dataset. By breaking words into subword units, BERT can still capture the meaning and context of these infrequent words, thereby enhancing its performance on the minority class.
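The subword mechanism can be illustrated with a simplified greedy longest-match-first tokenizer in the style of WordPiece, the scheme used by BERT [17]. This is a sketch only: the vocabulary below is a small hypothetical one, whereas the real BERT vocabulary contains roughly thirty thousand entries, and details such as text normalization are omitted.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split a word into subword units by greedy longest-match-first.

    Pieces that continue a word carry the '##' prefix. If some part of
    the word cannot be matched, the whole word maps to the unknown token.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]
        pieces.append(piece)
        start = end
    return pieces


# Hypothetical toy vocabulary; "touchscreen" itself is not an entry.
vocab = {"touch", "##pad", "##screen", "key", "##board", "un", "##lock"}
print(wordpiece_tokenize("touchscreen", vocab))
print(wordpiece_tokenize("touchpad", vocab))
print(wordpiece_tokenize("stylus", vocab))
```

A word that never appears as a whole in the vocabulary can still be represented through pieces it shares with known words, which is how rare terms in the minority class retain a usable representation.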

5 Conclusion

To address the dynamic nature of consumer demands and give companies a competitive edge in responding to market changes, it is essential to recognize newly arising demands quickly and accurately [28, 29]. To tackle this challenge, we have developed a BERT-based method that efficiently identifies new customer needs from online product reviews [30]. Through the application of natural language processing methods and deep neural networks, our approach offers a cost-effective and automated solution for extracting valuable insights from the vast amount of online product reviews. The experimental findings show that our approach performs satisfactorily, indicating its potential to transform the idea generation process in new product development. Additionally, our proposed methodology simplifies the identification process and reduces the reliance on extensive expert resources. This research aligns with the evolving landscape of technology-driven innovation, highlighting its capacity to reshape industries by empowering designers to extract more creative ideas and convert insights into tangible product opportunities [31]. Ultimately, the outcomes of this project have the potential to advance the field of product design research and foster a culture of innovation that is adaptable, responsive, and highly attuned to the ever-changing needs of consumers.
Although our approach has worked well, it is important to understand its limitations. Our model's sensitivity to class imbalance indicates that further study is needed on adaptive methods that can dynamically adjust the model according to how the data are distributed [32]. To further increase the model's adaptability, future work should investigate the effects of hyperparameter changes, including dropout rates and weight modifications [33]. Since this paper uses English-language data, we would also like to evaluate the method's efficacy in other languages. Finally, we are looking into ways to adapt the learned model from one product to another using transfer learning approaches [34].

Acknowledgments

The research conducted in this paper was supported by the Hong Kong Research Grants Council under Faculty Development Scheme UGC/FDS14/E08/23 and Research Matching Grant SDSC-SRG016.

References

[1] Eric von Hippel. 2001. User toolkits for innovation. Journal of Product Innovation Management, 18, 4, 247-257.
[2] Karl T. Ulrich and Steve D. Eppinger. 2016. Product Design and Development, 6th ed. McGraw-Hill, New York.
[3] Yue Wang, Daniel Mo and Helen Ma. 2023. Perception of time in the online product customization process. Industrial Management & Data Systems, 123, 2, 369-385.
[4] Artem Timoshenko and John Hauser. 2019. Identifying customer needs from user generated content. Marketing Science, 38, 1, 1-20.
[5] David Morgan. 1997. The Focus Group Guidebook (1st ed.). SAGE Publications.
[6] Robert G. Cooper and Scott Edgett. 2009. Generating Breakthrough New Product Ideas: Feeding the Innovation Funnel. Product Development Institute Inc.
[7] M. R. Hunter. 1994. Listening to the customer using QFD. Quality Progress, 27, 4, 55-59.
[8] Alieksiei Martins and Elaine Aspinwall. 2001. Quality Function Deployment: An Empirical Study in the UK. Total Quality Management, 12, 5, 575-588.
[9] Zhongwu Zhai, Bing Liu, Hua Xu and Peifa Jia. 2011. Clustering Product Features for Opinion Mining. In Proceedings of the Fourth ACM International Conference on Web Search and Data Mining (WSDM'11) (pp. 347–354). Hong Kong, China.
[10] Jia Liu and Olivier Toubia. 2018. A semantic approach for estimating consumer content preferences from online search queries. Marketing Science, 37, 930–952.
[11] Reinhold Decker and Michael Trusov. 2010. Estimating aggregate consumer preferences from online product reviews. International Journal of Research in Marketing, 27, 4, 293–307.
[12] Fan Zhou, Yuanchun Jiang, Yang Qian, Yezheng Liu and Yidong Chai. 2024. Product consumptions meet reviews: Inferring consumer preferences by an explainable machine learning approach. Decision Support Systems, 177, 114088.
[13] Yue Wang, Xiang Li and Fugee Tsung. 2020. Configuration-based smart customization service: A multitask learning approach. IEEE Transactions on Automation Science and Engineering, 17, 4, 2038-2047.
[14] Yue Wang, Xiang Li and Daniel Mo. 2021. Knowledge-empowered multi-task learning to address the semantic gap between customer needs and design specifications. IEEE Transactions on Industrial Informatics, 17, 12, 8397-8405.
[15] Xu Chen, Hanxiong Chen, Hongteng Xu, Yongfeng Zhang, Yixin Cao, Zheng Qin and Hongyuan Zha. 2019. Personalized fashion recommendation with visual explanations based on multimodal attention network: towards visually explainable recommendation. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 765–774).
[16] Donghua Liu, Jing Li, Bo Du, Jun Chang and Rong Gao. 2019. DAML: Dual attention mutual learning between ratings and reviews for item recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 344–352).
[17] Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Vol. 1 (pp. 4171–4186).
[18] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser and Illia Polosukhin. 2017. Attention Is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (pp. 6000-6010).
[19] Linkai Luo, Yue Wang and Hai Liu. 2022. COVID-19 personal health mention detection from Tweets using dual convolutional neural network. Expert Systems with Applications, 200, 117139.
[20] Wei Zong, Feng Wu, Lap-Keung Chu and Domenic Sculli. 2015. A discriminative and semantic feature selection method for text categorization. International Journal of Production Economics, 165, 215-222.
[21] Armand Joulin, Edouard Grave, Piotr Bojanowski and Tomas Mikolov. 2016. Bag of Tricks for Efficient Text Classification. arXiv:1607.01759.
[22] Yoon Kim. 2014. Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1746–1751). Doha, Qatar. Association for Computational Linguistics.
[23] Peng Zhou, Zhenyu Qi, Suncong Zheng, Jiaming Xu, Hongyun Bao and Bo Xu. 2016. Text classification improved by integrating bidirectional LSTM with two-dimensional max pooling. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers (pp. 3485–3495). Osaka, Japan.
[24] Zifan Peng, Mingchen Li, Yue Wang and George T.S. Ho. 2023. Combating the COVID-19 infodemic using prompt-based curriculum learning. Expert Systems with Applications, 229, Part A, 120501.
[25] Jinyang Guo, Jiaheng Liu, Zining Wang, Yuqing Ma, Ruihao Gong, Ke Xu and Xianglong Liu. 2023. Adaptive contrastive knowledge distillation for BERT compression. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 8941–8953). Toronto, Canada. Association for Computational Linguistics.
[26] Xinran Zhao, Esin Durmus and Dit-Yan Yeung. 2023. Towards reference-free text simplification evaluation with a BERT siamese network architecture. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 13250–13264). Toronto, Canada. Association for Computational Linguistics.
[27] Suchin Gururangan, Ana Marasović, Swabha Swayamdipta, Kyle Lo, Iz Beltagy, Doug Downey and Noah A. Smith. 2020. Don't stop pretraining: adapt language models to domains and tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 8342–8360). Online.
[28] Chun Ho Wu, Yue Wang and Jie Ma. 2021. Maximal Marginal Relevance-Based Recommendation for Product Customization. Enterprise Information Systems, 17, 5, 1992018.
[29] Yue Wang, Xiang Li, Linda L. Zhang and Daniel Mo. 2022. Configuring products with natural language: a simple yet effective approach based on text embeddings and multilayer perceptron. International Journal of Production Research, 60, 17, 5394–5406.
[30] Yue Wang and Mitchell Tseng. 2014. Incorporating tolerances of customers’ requirements for customized products. CIRP Annals, 63, 1, 129-132.
[31] Andrea S. Patrucco, Tobias Schoenherr and Antonella Moretto. 2024. Sustaining commitment in preferred buyer-supplier relationships: How to retain the ‘customer of choice’ status? International Journal of Production Economics, 270, 109165.
[32] Qi Dai, Jian-wei Liu and Yong-hui Shi. 2023. Class-overlap undersampling based on Schur decomposition for Class-imbalance problems. Expert Systems with Applications, 221, 119735.
[33] Lizhi Liao, Heng Li, Weiyi Shang and Lei Ma. 2022. An empirical study of the impact of hyperparameter tuning and model optimization on the performance properties of deep neural networks. ACM Transactions on Software Engineering and Methodology, 31, 3, Article 53.
[34] Xiaohan Chen, Rui Yang, Yihao Xue, Mengjie Huang and Roberto Ferrero. 2023. Deep transfer learning for bearing fault diagnosis: A systematic review since 2016. IEEE Transactions on Instrumentation and Measurement, 72, 1-21, Article 3508221.

Published In

ICDSM '24: Proceedings of the International Conference on Decision Science & Management
April 2024
356 pages
ISBN:9798400718151
DOI:10.1145/3686081
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Language model
  2. Natural language processing
  3. Novel customer needs
  4. Product design
  5. Product review

Qualifiers

  • Research-article

Funding Sources

  • Hong Kong RGC FDS

Conference

ICDSM 2024
