A new heuristic for learning Bayesian networks from limited datasets: a real-time recommendation system application with RFID systems in grocery stores

Cinicioglu, Esma Nur; Shenoy, Prakash P.

doi:10.1007/s10479-012-1171-9

Published: 21 June 2012

Volume 244, pages 385–405, (2016)

Annals of Operations Research

Abstract

Bayesian networks (BNs) are a useful tool for applications that involve dynamic decision-making. However, it is not easy to learn the structure and conditional probability tables of BNs from small datasets. There are many algorithms and heuristics for learning BNs from sparse datasets, but most of these are not concerned with the quality of the learned network in the context of a specific application. In this research, we develop a new heuristic for building BNs from sparse datasets in the context of their performance in a real-time recommendation system. This new heuristic is demonstrated using a market basket dataset and a real-time recommendation model in which all items in the grocery store are RFID-tagged and the carts are equipped with an RFID scanner. With this recommendation model, retailers can make real-time recommendations to customers based on the products placed in the cart during a shopping event.
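
To make the idea of recommending from the current cart contents concrete, the following is a minimal Python sketch. It is not the heuristic or the Bayesian network model developed in the article: it swaps in a much simpler stand-in that scores each candidate product by its smoothed conditional frequency given each item already in the cart and averages those scores. The function names `fit` and `recommend`, the smoothing constant `alpha`, and the toy baskets are all assumptions made for illustration.

```python
from collections import Counter


def fit(baskets):
    """Count single items and ordered item pairs across baskets."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # each product counts once per basket
        item_counts.update(items)
        pair_counts.update((i, j) for i in items for j in items if i != j)
    return item_counts, pair_counts


def recommend(model, cart, k=1, alpha=1.0):
    """Return the k products with the highest average smoothed
    estimate of P(candidate in basket | cart item in basket).

    This is a pairwise co-occurrence score, not Bayesian network inference.
    """
    item_counts, pair_counts = model
    cart = set(cart)
    scores = {}
    for candidate in item_counts:
        if candidate in cart:
            continue
        # Laplace-smoothed conditional frequency for each cart item.
        probs = [
            (pair_counts[(i, candidate)] + alpha) / (item_counts[i] + 2 * alpha)
            for i in cart
        ]
        scores[candidate] = sum(probs) / len(probs) if probs else 0.0
    # Break ties alphabetically so the result is deterministic.
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


# Hypothetical toy data, invented for this sketch.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["beer", "chips"],
    ["beer", "chips", "bread"],
    ["milk", "eggs"],
]
model = fit(baskets)
print(recommend(model, {"beer"}, k=1))
print(recommend(model, {"bread"}, k=2))
```

In a setting like the one described, the cart would be updated each time the scanner registers a product, and `recommend` would be called again on the new cart contents. A Bayesian network would replace the pairwise averaging with posterior inference over all modeled products.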

Notes

The Netflix prize competition seeks to substantially improve the accuracy of predictions about how much someone is going to love a movie based on the ratings of the movies they have already seen.
Although most of the products are identified by a unique barcode, some article numbers in the dataset represent a group of products rather than an individual product item (Brijs et al. 1999).
This number is the sum of the counts of each of these products across the baskets in which they appear.
Rounded to two decimal places 8897/2562=3.47.
In this dataset, 3.47 indicates the point where a big drop in the sales figures of the most purchased products happens.
Throughout the paper, the phrase “final BN” refers to the BN that is created in the last step of the application of the heuristic, after the variables to be included in the BN are determined. This final BN constitutes the basis for the performance evaluation of the proposed heuristic.
The average probability of correct prediction for the marginal model is 78.2288 % in this case.
BNs created without using the heuristic.
While some products appear in each of the four lists of the 30 most purchased products (mostly the five most purchased, although their frequencies differ), the variety of products included differs across all four datasets.
The CPTs are obtained from the BNs created with the 6th–30th most purchased products.

References

Breese, J. S., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the fourteenth conference on uncertainty in artificial intelligence, Madison, WI. San Mateo: Morgan Kaufmann.
Brijs, T., Swinnen, G., Vanhoof, K., & Wets, G. (1999). The use of association rules for product assortment decisions: a case study. In Proceedings of the fifth international conference on knowledge discovery and data mining, San Diego, USA, August 15–18 (pp. 254–260). ISBN:1-58113-143-7.
Cavoukian, A. (2004). Tag, you’re it: privacy implications of radio frequency identification (RFID) technology. Toronto: Information and Privacy Commissioner.
Cinicioglu, E. N., Shenoy, P. P., & Kocabasoglu, C. (2007). Use of radio frequency identification for targeted advertising: a collaborative filtering approach using Bayesian networks. In K. Mellouli (Ed.), Lecture notes in artificial intelligence: Vol. 4724. Symbolic and quantitative approaches to reasoning with uncertainty (pp. 889–900). Berlin: Springer.
Cozman, F. G. (2000). Credal networks. Artificial Intelligence, 120(2), 199–233.
Cui, G., Wong, M. L., & Zhang, G. (2010). Bayesian variable selection for binary response models and direct marketing forecasting. Expert Systems with Applications, 37, 7656–7662.
Finkenzeller, K. (1999). RFID handbook: radio-frequency identification and applications. New York: John Wiley.
Friedman, N., Goldszmidt, M., Heckerman, D., & Russell, S. (1997). Challenge: what is the impact of Bayesian networks on learning? In Proceedings of the 15th international joint conference on artificial intelligence (IJCAI-97) (pp. 10–15).
Friedman, N., Nachman, L., & Pe’er, D. (1999). Learning Bayesian network structure from massive datasets: the “sparse candidate” algorithm. In Proc. fifteenth conference on uncertainty in artificial intelligence (UAI’99) (pp. 196–205).
Goldenberg, A., & Moore, A. (2004). Tractable learning of large Bayes net structures from sparse data. In Proceedings of 21st international conference on machine learning.
Gu, Q., Cai, Z., Zhu, L., & Huang, B. (2008). Data mining on imbalanced data sets. In Proc. international conference on advanced computer theory and engineering (pp. 1020–1024).
Heckerman, D. (1997). Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1, 79–119.
Heckerman, D., Chickering, D. M., Meek, C., Rounthwaite, R., & Kadie, C. (2000). Dependency networks for inference, collaborative filtering, and data visualization. Journal of Machine Learning Research, 1, 49–75.
Hicks, P. (1999). RFID and the book trade. Publishing Research Quarterly, 15(2), 21–23.
Jaroszewicz, S., & Simovici, D. A. (2004). Interestingness of frequent itemsets using Bayesian networks as background knowledge. In Proceedings of the 2004 ACM SIGKDD international conference on knowledge discovery and data mining, Seattle, WA, USA (pp. 178–186).
Liu, B., Zhao, K., Benkler, J., & Xiao, W. (2006). Rule interestingness analysis using OLAP operations. In Proc. ACM, KDD (pp. 297–306).
Liu, F., Tian, F., & Zhu, Q. (2007). An improved greedy Bayesian network learning algorithm on limited data. In Marques de Sá et al. (Eds.), Lecture notes in computer science: Vol. 4668. ICANN 2007 (pp. 49–57). Berlin: Springer.
Madsen, A., Lang, M., Kjaerulff, U., & Jensen, F. (2004). The Hugin tool for learning Bayesian networks. In Symbolic and quantitative approaches to reasoning with uncertainty (pp. 594–605). Berlin: Springer.
Mild, A. (2003). An improved collaborative filtering approach for predicting cross-category purchases based on binary market basket data. Journal of Retailing and Consumer Services, 10, 123–133.
Oniśko, A., Druzdzel, M. J., & Wasyluk, H. (2001). Learning Bayesian network parameters: application of noisy-OR gates. International Journal of Approximate Reasoning, 27, 165–182.
Pine, B. J. II (1993). Mass customization. Boston: Harvard Business School Press.
Pine, B. J. II, & Gilmore, J. H. (1999). The experience economy. Boston: Harvard Business School Press.
Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., & Riedl, J. (1994). Grouplens: an open architecture for collaborative filtering of netnews. In Proceedings of the ACM conference on computer supported cooperative work (pp. 175–186).
Resnick, P., & Varian, H. (1997). Recommender systems. Communications of the ACM, 40(3), 56–58.
SC Digest Editorial Staff (2009, January). The five cent RFID tag is here. http://www.scdigest.com/assets/newsviews/09-01-27-2.pdf.
Romanycia, M. H., & Pelletier, F. J. (1985). What is a heuristic? Computational Intelligence, 1, 57–58.
Scuderi, M., & Clifton, K. (2005). Bayesian approaches to learning from data: using NHTS data for the analysis of land use and travel behavior. Bureau of Transportation Statistics, US Department of Transportation, Washington, DC.
Whitaker, J., Mithas, S., & Krishnan, M. S. (2007). A field study of RFID deployment and return expectations. Production and Operations Management, 16(5), 599–612.
Yu, K., Schwaighofer, A., Tresp, V., Xu, X., & Kriegel, H. P. (2004). Probabilistic memory-based collaborative filtering. IEEE Transactions on Knowledge and Data Engineering, 15(1), 56–69.

Acknowledgement

This research was partly supported by Istanbul University research fund project number 6858. We are grateful to three anonymous reviewers of AnOR for their comments and suggestions for improvements.

Author information

Authors and Affiliations

Istanbul University School of Business, 34320, Avcilar, Istanbul, Turkey
Esma Nur Cinicioglu
University of Kansas School of Business, 1300 Sunnyside Ave, Summerfield Hall, Lawrence, KS, 66045-7601, USA
Prakash P. Shenoy

Corresponding author

Correspondence to Esma Nur Cinicioglu.

About this article

Cite this article

Cinicioglu, E.N., Shenoy, P.P. A new heuristic for learning Bayesian networks from limited datasets: a real-time recommendation system application with RFID systems in grocery stores. Ann Oper Res 244, 385–405 (2016). https://doi.org/10.1007/s10479-012-1171-9

Published: 21 June 2012
Issue Date: September 2016
DOI: https://doi.org/10.1007/s10479-012-1171-9
