Abstract
Manufacturers competing in an increasingly fierce market must decide which products to produce in order to increase their market share. To make this decision, they need to analyze consumers' requirements and how consumers make purchase decisions, so that the new products will be competitive. In this paper, we first present a general distance-based product adoption model that captures consumers' purchase behavior. Within this model, various distance metrics can be used to describe different real-life purchase behaviors. We then provide a learning algorithm that decides which set of distance metrics to use when historical purchase data are available. Based on the product adoption model, we formalize the k most marketable products (k-MMP) selection problem and formally prove that it is NP-hard. To tackle this problem, we propose an efficient greedy approximation algorithm with a provable solution guarantee. Using submodularity analysis, we prove that our approximation algorithm achieves at least 63% of the optimal solution. We apply our algorithm to both synthetic datasets and a real-world dataset (TripAdvisor.com), and show that it can easily achieve a speedup of five or more orders of magnitude over exhaustive search while attaining about 96% of the optimal solution on average. Our experiments also demonstrate the robustness of our distance metric learning method and illustrate how one can adopt it to improve the accuracy of product selection.
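The greedy selection described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the adoption rule below (each consumer picks uniformly at random among all products that meet every one of their requirements) is a simplifying assumption standing in for the distance-based model, and the names `satisfies`, `market_share`, `greedy_select`, and `exhaustive_select`, as well as the toy data, are hypothetical. Under this assumed rule the objective is monotone and submodular, so the standard greedy guarantee of at least 1 - 1/e (about 63%) of the optimum applies.

```python
from itertools import combinations


def satisfies(product, requirement):
    """True if the product meets or exceeds every attribute requirement."""
    return all(p >= r for p, r in zip(product, requirement))


def market_share(selected, consumers, existing):
    """Expected number of consumers who adopt one of the selected products.

    Assumption (not from the paper): a consumer chooses uniformly at random
    among all satisfying products, ours and the competitors' alike.
    """
    total = 0.0
    for req in consumers:
        ours = sum(satisfies(p, req) for p in selected)
        theirs = sum(satisfies(p, req) for p in existing)
        if ours:
            total += ours / (ours + theirs)
    return total


def greedy_select(candidates, consumers, existing, k):
    """Pick k products, each time adding the one with the largest marginal gain."""
    selected = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        base = market_share(selected, consumers, existing)
        best = max(
            remaining,
            key=lambda p: market_share(selected + [p], consumers, existing) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


def exhaustive_select(candidates, consumers, existing, k):
    """Brute-force baseline: evaluate every size-k subset of the candidates."""
    return max(
        combinations(candidates, k),
        key=lambda s: market_share(list(s), consumers, existing),
    )


# Toy data (hypothetical): each tuple is a vector of attribute levels.
consumers = [(1, 1), (2, 1), (1, 3), (3, 3)]   # consumer requirements
existing = [(2, 2)]                            # competitors' products
candidates = [(3, 3), (1, 3), (2, 1), (1, 1)]  # products we could launch

chosen = greedy_select(candidates, consumers, existing, k=2)
best = exhaustive_select(candidates, consumers, existing, k=2)
print(chosen, market_share(chosen, consumers, existing))
print(list(best), market_share(list(best), consumers, existing))
```

The greedy routine evaluates on the order of k times the number of candidates, whereas the exhaustive baseline enumerates every size-k subset, which is where a large speedup would come from as the candidate set grows.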
Index Terms
- Product Selection Problem: Improve Market Share by Learning Consumer Behavior