Abstract
“Tell me what you eat and I will tell you what you are.” Jean Anthelme Brillat-Savarin was among the first to recognize the relationship between identity and food consumption. Food adoption choices are much less exposed to external judgment and social pressure than other individual behaviors, and they can be observed over long periods. This makes them an interesting basis for, among other applications, studying the integration of immigrants from a food consumption viewpoint. In this work we analyze immigrants’ food consumption from retail shopping data to understand whether and how it converges towards that of natives. As the core contribution of our proposal, we define an individual’s score of adoption of natives’ consumption habits as the probability of being recognized as a native by a machine learning classifier, thus adopting a completely data-driven approach. We measure immigrants’ adoption of natives’ consumption behavior over a long period and identify different trends. A case study on real data from a large nationwide supermarket chain reveals five main groups of immigrants, distinguished by their trends of native consumption adoption.
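The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes a plain logistic regression trained on synthetic two-feature profiles; the classifier, the features, the data, and all names (`train_logistic`, `adoption_score`, `trajectory`) are hypothetical choices made for illustration, not the method or data used in the paper.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def adoption_score(model, x):
    """Probability that the classifier labels profile x as native."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic profiles (made up): [share of spend on group A, share on group B].
rng = random.Random(0)
natives = [[rng.gauss(0.7, 0.05), rng.gauss(0.3, 0.05)] for _ in range(50)]
others = [[rng.gauss(0.3, 0.05), rng.gauss(0.7, 0.05)] for _ in range(50)]
model = train_logistic(natives + others, [1] * 50 + [0] * 50)

# One hypothetical customer observed over three periods, drifting towards natives.
trajectory = [[0.3, 0.7], [0.5, 0.5], [0.7, 0.3]]
scores = [adoption_score(model, x) for x in trajectory]
```

Scoring the same customer's profile in successive periods yields a time series of adoption scores, and it is the shape of such series that a trend analysis would then group.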
Notes
- 1.
Data for Refugees Turkey (D4R): http://d4r.turktelekom.com.tr/.
- 2.
We assume that the expenditure function E also accounts for the quantity.
- 3.
We also consider features derived from others, such as \( AL \) and \( AE \), since they might capture different aspects of customer shopping behavior. Where needed, redundant features can be removed in the preprocessing stage that precedes the training of the machine learning classifier.
- 4.
Note that we implicitly assume that the food consumption habits of natives do not change over time. While this is not true in general, we empirically observed that it holds for the vast majority of customers in our data. Studying how natives’ habits evolve over time is left to future work.
- 5.
The source code of TINCA is available at https://github.com/riccotti/TINCA.
- 6.
- 7.
The 100 product groups are available in the shared repository. The grouping was performed manually to respect the implicit semantic meaning. Each product group models on average 1.9 ± 2.0 categories of items of the UniCoop dataset. The largest product groups are those modeling “bread”, “fish”, and “vegetables”.
- 8.
- 9.
Pearson correlation of 0.75 and Spearman correlation of 0.78, in both cases with p-value < 0.0005.
- 10.
We leave to future work the study of the effect of other functions specific to time series clustering, such as dynamic time warping.
- 11.
For each group we emphasize the countries with the largest relative number of customers in that group, normalized by the total number of customers from that country. Focusing on countries with a larger absolute presence would be less informative, since a few countries with a very large overall presence (e.g., Romania, Switzerland, and Germany) would simply overwhelm the others in all groups.
References
Abramitzky, R., et al.: Cultural assimilation during the age of mass migration. Technical report, National Bureau of Economic Research (2016)
Agrawal, R., et al.: Fast algorithms for mining association rules. In: Proceedings of 20th International Conference Very Large Data Bases, VLDB, vol. 1215, pp. 487–499 (1994)
Akerlof, G.A., et al.: Identity economics. Econ. Voice 7(2), 1–3 (2010)
Alba, R., et al.: Only English by the third generation? Demography 39(3), 467 (2002)
Alesina, A., Tabellini, G., Trebbi, F.: Is Europe an optimal political area? Technical report, National Bureau of Economic Research (2017)
Arai, M., et al.: Renouncing personal names: an empirical examination of surname change and earnings. J. Labor Econ. 27(1), 127–147 (2009)
Atkin, D.: The caloric costs of culture: Evidence from Indian migrants. Am. Econ. Rev. 106(4), 1144–1181 (2016)
Bertoli, S., et al.: Integration of Syrian refugees: insights from D4R, media events and housing market data. In: Salah, A.A., Pentland, A., Lepri, B., Letouzé, E. (eds.) Guide to Mobile Data Analytics in Refugee Scenarios, pp. 179–199. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12554-7_10
Bertrand, M., Kamenica, E.: Coming apart? Cultural distances in the United States over time. Technical report, National Bureau of Economic Research (2018)
Borjas, G.J.: The analytics of the wage effect of immigration. IZA J. Migr. 2(1), 1–25 (2013). https://doi.org/10.1186/2193-9039-2-22
Borjas, G.J.: Unraveling the immigration narrative. N&C (2016)
Brillat-Savarin, J.A.: Physiologie du goût. Charpentier (1841)
Bronnenberg, B.J., et al.: The evolution of brand preferences: evidence from consumer migration. Am. Econ. Rev. 102(6), 2472–2508 (2012)
Bucheli, J.R., Fontenla, M., Waddell, B.J.: Return migration and violence. World Dev. 116, 113–124 (2019)
Chaffey, D., Ellis-Chadwick, F., Mayer, R., Johnston, K.: Internet Marketing: Strategy, Implementation and Practice. Pearson Education, London (2009)
Chamberlain, B.P., et al.: Customer lifetime value prediction using embeddings. In: ACM SIGKDD, pp. 1753–1762 (2017)
Chen, M.-C., Chiu, A.-L., Chang, H.-H.: Mining changes in customer behavior in retail marketing. Expert Syst. Appl. 28(4), 773–781 (2005)
Docquier, F., et al.: Emigration and democracy. The World Bank (2011)
Dustmann, C., et al.: Labor supply shocks, native wages, and the adjustment of local employment. Q. J. Econ. 132(1), 435–483 (2017)
Fryer Jr., R.G., Levitt, S.D.: The causes and consequences of distinctively black names. Q. J. Econ. 119(3), 767–805 (2004)
Guidotti, R., Coscia, M., Pedreschi, D., Pennacchioli, D.: Behavioral entropy and profitability in retail. In: 2015 IEEE DSAA, pp. 1–10. IEEE (2015)
Guidotti, R., Gabrielli, L.: Recognizing residents and tourists with retail data using shopping profiles. In: Guidi, B., Ricci, L., Calafate, C., Gaggi, O., Marquez-Barja, J. (eds.) GOODTECHS 2017. LNICST, vol. 233, pp. 353–363. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-76111-4_35
Guidotti, R., Gabrielli, L., Monreale, A., et al.: Discovering temporal regularities in retail customers’ shopping behavior. EPJ Data Sci. 7(1), 1–26 (2018)
Guidotti, R., Monreale, A., Nanni, M.: Clustering individual transactional data for masses of users. In: KDD, pp. 195–204. ACM (2017)
Guidotti, R., Monreale, A., Ruggieri, S., et al.: A survey of methods for explaining black box models. ACM Comput. Surv. (CSUR) 51(5), 1–42 (2018)
Guidotti, R., Rossetti, G., et al.: Personalized market basket prediction with temporal annotated recurring sequences. IEEE TKDE 31(11), 2151–2163 (2018)
Herdağdelen, A., State, B., Adamic, L., Mason, W.: The social ties of immigrant communities in the United States. In: ACM WEBSCI, pp. 78–84 (2016)
Hyndman, R.J., et al.: Forecasting: Principles and Practice. OTexts, Melbourne (2018)
Kulkarni, V., et al.: Freshman or fresher? Quantifying the geographic variation of language in online social media. In: AAAI ICWSM, pp. 615–618 (2016)
Lamanna, F., Lenormand, M., et al.: Immigrant community integration in world cities. PLoS ONE 13(3), e0191612 (2018)
Logan, T.D., Rhode, P.W.: Moveable feasts: A new approach to endogenizing tastes. manuscript (The Ohio State University) (2010)
Luo, L., et al.: Tracking the evolution of customer purchase behavior segmentation via a fragmentation-coagulation process. In: IJCAI, pp. 2414–2420 (2017)
Magdy, A., Ghanem, T.M., Musleh, M., Mokbel, M.F.: Exploiting geo-tagged tweets to understand localized language diversity. In: GeoRich, pp. 1–6 (2014)
Qian, Z., et al.: Social boundaries and marital assimilation: Interpreting trends in racial and ethnic intermarriage. Am. Sociol. Rev. 72(1), 68–94 (2007)
Ray, K.: The Migrants Table: Meals And Memories In. Temple University Press, Philadelphia (2004)
Sîrbu, A., et al.: Human migration: the big data perspective. Int. J. Data Sci. Anal. 1–20 (2020). https://doi.org/10.1007/s41060-020-00213-5
Spilimbergo, A.: Democracy and foreign education. Am. Econ. Rev. 99(1), 528–543 (2009)
Tan, P.-N., et al.: Introduction to Data Mining. Pearson Education India, Noida (2016)
Wedel, M., Kamakura, W.A.: Market segmentation: Conceptual and Methodological Foundations, vol. 8. Springer, New York (2012)
Bengio, Y., Ducharme, R., Vincent, P., Jauvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3, 1137–1155 (2003)
Acknowledgment
This work is partially supported by the European Community H2020 programme under the funding schemes: H2020-INFRAIA-2019-1: Res. Infr. G.A. 871042 SoBigData++, G.A. 825619 AI4EU, G.A. 761758 Humane AI, and G.A. 780754 Track&Know. We thank UniCoop Tirreno for providing the data, and Roberto Zicaro for preliminary studies on the proposed methodology and analysis.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Guidotti, R. et al. (2021). Measuring Immigrants Adoption of Natives Shopping Consumption with Machine Learning. In: Dong, Y., Ifrim, G., Mladenić, D., Saunders, C., Van Hoecke, S. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12461. Springer, Cham. https://doi.org/10.1007/978-3-030-67670-4_23
DOI: https://doi.org/10.1007/978-3-030-67670-4_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67669-8
Online ISBN: 978-3-030-67670-4
eBook Packages: Computer Science, Computer Science (R0)