A new heuristic for learning Bayesian networks from limited datasets: a real-time recommendation system application with RFID systems in grocery stores

Annals of Operations Research

Abstract

Bayesian networks (BNs) are a useful tool for applications that involve dynamic decision-making. However, it is not easy to learn the structure and conditional probability tables of BNs from small datasets. There are many algorithms and heuristics for learning BNs from sparse datasets, but most of these are not concerned with the quality of the learned network in the context of a specific application. In this research, we develop a new heuristic for building BNs from sparse datasets that takes into account the performance of the learned network in a real-time recommendation system. This new heuristic is demonstrated using a market basket dataset and a real-time recommendation model in which all items in the grocery store are RFID tagged and the carts are equipped with an RFID scanner. With this recommendation model, retailers are able to make real-time recommendations to customers based on the products placed in the cart during a shopping event.
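The idea of recommending products from the current cart contents can be illustrated with a minimal sketch. It is not the heuristic proposed in the article: it assumes the simplest possible BN (a naive Bayes structure in which each candidate product is the parent of the items already in the cart), estimates the conditional probabilities from basket counts with Laplace smoothing, and uses only the items scanned so far as evidence. The function names `posterior_buy` and `recommend`, the smoothing parameter `alpha`, and the toy baskets are all hypothetical.

```python
from typing import Iterable, List, Set


def posterior_buy(candidate: str, cart: Iterable[str],
                  baskets: List[Set[str]], alpha: float = 1.0) -> float:
    """Estimate P(candidate is bought | items in cart) with a naive Bayes model.

    Assumption (not from the article): cart items are conditionally
    independent given whether the candidate is bought.
    """
    with_c = [b for b in baskets if candidate in b]
    without_c = [b for b in baskets if candidate not in b]

    # Laplace-smoothed prior probability that the candidate is bought.
    p_buy = (len(with_c) + alpha) / (len(baskets) + 2 * alpha)
    score_buy, score_skip = p_buy, 1.0 - p_buy

    # Multiply in the smoothed likelihood of each observed cart item.
    for item in cart:
        score_buy *= (sum(item in b for b in with_c) + alpha) / (len(with_c) + 2 * alpha)
        score_skip *= (sum(item in b for b in without_c) + alpha) / (len(without_c) + 2 * alpha)

    # Normalize the two joint scores into a posterior probability.
    return score_buy / (score_buy + score_skip)


def recommend(cart: Iterable[str], baskets: List[Set[str]], top_k: int = 3) -> List[str]:
    """Rank products not yet in the cart by their estimated purchase probability."""
    cart = set(cart)
    candidates = set().union(*baskets) - cart
    # Sort by descending posterior; break ties alphabetically for determinism.
    ranked = sorted(candidates, key=lambda c: (-posterior_buy(c, cart, baskets), c))
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical toy data standing in for a market basket dataset.
    toy_baskets = [
        {"bread", "butter"}, {"bread", "butter", "jam"}, {"bread", "milk"},
        {"milk", "cereal"}, {"milk", "cereal", "bread"}, {"butter", "jam"},
    ]
    print(recommend({"cereal"}, toy_baskets, top_k=2))
```

In a deployment of the kind the abstract describes, `recommend` would be re-evaluated each time the cart's RFID scanner registers a new item. The smoothing keeps the estimates defined when a product pair never co-occurs, which is common in sparse basket data.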

Notes

  1. The Netflix prize competition seeks to substantially improve the accuracy of predictions about how much someone is going to love a movie based on the ratings of the movies they have already seen.

  2. Although most of the products are identified by a unique barcode, some article numbers in the dataset represent a group of products rather than an individual product item (Brijs et al. 1999).

  3. This number is the sum of each of these products’ count for the baskets they are in.

  4. 8897/2562 = 3.47, rounded to two decimal places.

  5. In this dataset, 3.47 indicates the point where a big drop in the sales figures of the most purchased products occurs.

  6. Throughout the paper, the phrase “final BN” refers to the BN that is created in the last step of the application of the heuristic, after the variables to be included in the BN are determined. This final BN constitutes the basis for the performance evaluation of the proposed heuristic.

  7. The average probability of correct prediction for the marginal model is 78.2288% in this case.

  8. BNs created without using the heuristic.

  9. While some products are common to each of the lists of the 30 most purchased products (mostly the top five, although their frequencies differ), the variety of products included is different in each of the four datasets.

  10. The CPTs are obtained from the BNs created with the 6th–30th most purchased products.

References

  • Breese, J. S., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the fourteenth conference on uncertainty in artificial intelligence, Madison, WI. San Mateo: Morgan Kaufmann.

  • Brijs, T., Swinnen, G., Vanhoof, K., & Wets, G. (1999). The use of association rules for product assortment decisions: a case study. In Proceedings of the fifth international conference on knowledge discovery and data mining, San Diego, USA, August 15–18 (pp. 254–260). ISBN: 1-58113-143-7.

  • Cavoukian, A. (2004). Tag, you’re it: privacy implications of radio frequency identification (RFID) technology. Toronto: Information and Privacy Commissioner.

  • Cinicioglu, E. N., Shenoy, P. P., & Kocabasoglu, C. (2007). Use of radio frequency identification for targeted advertising: a collaborative filtering approach using Bayesian networks. In K. Mellouli (Ed.), Lecture notes in artificial intelligence: Vol. 4724. Symbolic and quantitative approaches to reasoning with uncertainty (pp. 889–900). Berlin: Springer.

  • Cozman, F. G. (2000). Credal networks. Artificial Intelligence, 120(2), 199–233.

  • Cui, G., Wong, M. L., & Zhang, G. (2010). Bayesian variable selection for binary response models and direct marketing forecasting. Expert Systems with Applications, 37, 7656–7662.

  • Finkenzeller, K. (1999). RFID handbook radio-frequency identification and applications. New York: John Wiley.

  • Friedman, N., Goldszmidt, M., Heckerman, D., & Russell, S. (1997). Challenge: what is the impact of Bayesian networks on learning? In Proceedings of the 15th international joint conference on artificial intelligence (IJCAI-97) (pp. 10–15).

  • Friedman, N., Nachman, L., & Pe’er, D. (1999). Learning Bayesian network structure from massive datasets: the “sparse candidate” algorithm. In Proc. fifteenth conference on uncertainty in artificial intelligence (UAI’ 99) (pp. 196–205).

  • Goldenberg, A., & Moore, A. (2004). Tractable learning of large Bayes net structures from sparse data. In Proceedings of the 21st international conference on machine learning.

  • Gu, Q., Cai, Z., Zhu, L., & Huang, B. (2008). Data mining on imbalanced data sets. In Proc. international conference on advanced computer theory and engineering (pp. 1020–1024).

  • Heckerman, D. (1997). Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1, 79–119.

  • Heckerman, D., Chickering, D. M., Meek, C., Rounthwaite, R., & Kadie, C. (2000). Dependency networks for inference, collaborative filtering, and data visualization. Journal of Machine Learning Research, 1, 49–75.

  • Hicks, P. (1999). RFID and the book trade. Publishing Research Quarterly, 15(2), 21–23.

  • Jaroszewicz, S., & Simovici, D. A. (2004). Interestingness of frequent itemsets using Bayesian networks as background knowledge. In Proceedings of the 2004 ACM SIGKDD international conference on knowledge discovery and data mining, Seattle, WA, USA (pp. 178–186).

  • Liu, B., Zhao, K., Benkler, J., & Xiao, W. (2006). Rule interestingness analysis using OLAP operations. In Proc. ACM, KDD (pp. 297–306).

  • Liu, F., Tian, F., & Zhu, Q. (2007). An improved greedy Bayesian network learning algorithm on limited data. In Marques de Sá et al. (Eds.), Lecture notes in computer science: Vol. 4668. ICANN 2007 (pp. 49–57). Berlin: Springer.

  • Madsen, A., Lang, M., Kjaerulff, U., & Jensen, F. (2004). The Hugin tool for learning Bayesian networks. In Symbolic and quantitative approaches to reasoning with uncertainty (pp. 594–605). Berlin: Springer.

  • Mild, A. (2003). An improved collaborative filtering approach for predicting cross-category purchases based on binary market basket data. Journal of Retailing and Consumer Services, 10, 123–133.

  • Oniśko, A., Druzdzel, M. J., & Wasyluk, H. (2001). Learning Bayesian network parameters: application of noisy-OR gates. International Journal of Approximate Reasoning, 27, 165–182.

  • Pine, B. J. II (1993). Mass customization. Boston: Harvard Business School Press.

  • Pine, B. J. II, & Gilmore, J. H. (1999). The experience economy. Boston: Harvard Business School Press.

  • Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., & Riedl, J. (1994). GroupLens: an open architecture for collaborative filtering of netnews. In Proceedings of the ACM conference on computer supported cooperative work (pp. 175–186).

  • Resnick, P., & Varian, H. (1997). Recommender systems. Communications of the ACM, 40(3), 56–58.

  • SC Digest Editorial Staff (2009, January). The five cent RFID tag is here. http://www.scdigest.com/assets/newsviews/09-01-27-2.pdf.

  • Romanycia, M. H., & Pelletier, F. J. (1985). What is a heuristic? Computational Intelligence, 1, 57–58.

  • Scuderi, M., & Clifton, K. (2005). Bayesian approaches to learning from data: using NHTS data for the analysis of land use and travel behavior. Bureau of Transportation Statistics, US Department of Transportation, Washington, DC.

  • Whitaker, J., Mithas, S., & Krishnan, M. S. (2007). A field study of RFID deployment and return expectations. Production and Operations Management, 16(5), 599–612.

  • Yu, K., Schwaighofer, A., Tresp, V., Xu, X., & Kriegel, H. P. (2004). Probabilistic memory-based collaborative filtering. IEEE Transactions on Knowledge and Data Engineering, 15(1), 56–69.

Acknowledgement

This research was partly supported by Istanbul University research fund project number 6858. We are grateful to three anonymous reviewers of AnOR for their comments and suggestions for improvement.

Author information

Corresponding author

Correspondence to Esma Nur Cinicioglu.

About this article

Cite this article

Cinicioglu, E.N., Shenoy, P.P. A new heuristic for learning Bayesian networks from limited datasets: a real-time recommendation system application with RFID systems in grocery stores. Ann Oper Res 244, 385–405 (2016). https://doi.org/10.1007/s10479-012-1171-9

  • DOI: https://doi.org/10.1007/s10479-012-1171-9
