Estimating city-level poverty rate based on e-commerce data with machine learning

Wijaya, Dedy Rahman; Paramita, Ni Luh Putu Satyaning Pradnya; Uluwiyah, Ana; Rheza, Muhammad; Zahara, Annisa; Puspita, Dwi Rani

doi:10.1007/s10660-020-09424-1

Published: 18 June 2020

Volume 22, pages 195–221, (2022)

Electronic Commerce Research

Dedy Rahman Wijaya ORCID: orcid.org/0000-0003-0351-7331¹,
Ni Luh Putu Satyaning Pradnya Paramita²,
Ana Uluwiyah³,
Muhammad Rheza⁴,
Annisa Zahara⁴ &
Dwi Rani Puspita⁵

Abstract

There are many big data sources in Indonesia, for example, data from social media, financial transactions, transportation, call detail records, and e-commerce. These types of data have been considered as potential resources to complement periodic surveys and censuses to monitor development indicators such as poverty levels. Data from e-commerce in particular could potentially represent the real expenditure of households, better complying with the formal calculation of the poverty line than other datasets. The contribution of this research is to propose a framework for poverty rate estimation based on e-commerce data using machine learning algorithms. The influence of items and aspects in e-commerce data was investigated in conjunction with poverty rate estimation. The experimental result showed that e-commerce data could potentially be used as a proxy for calculating city-level poverty rates. It was also found that cars and motorbikes are the two most significant items for poverty prediction in Indonesia.

Acknowledgements

This work was supported by Pulse Lab Jakarta (PLJ), which is a joint initiative of the United Nations and the Government of Indonesia.

Author information

Authors and Affiliations

School of Applied Science, Telkom University, Bandung, Indonesia
Dedy Rahman Wijaya
Statistics Department, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia
Ni Luh Putu Satyaning Pradnya Paramita
Education and Training Center, Statistics Indonesia (BPS), Jakarta, Indonesia
Ana Uluwiyah
Pulse Lab Jakarta – United Nations Global Pulse, Jakarta, Indonesia
Muhammad Rheza & Annisa Zahara
Institute for Economic and Social Research, Universitas Indonesia, Jakarta, Indonesia
Dwi Rani Puspita

Corresponding author

Correspondence to Dedy Rahman Wijaya.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wijaya, D.R., Paramita, N.L.P.S.P., Uluwiyah, A. et al. Estimating city-level poverty rate based on e-commerce data with machine learning. Electron Commer Res 22, 195–221 (2022). https://doi.org/10.1007/s10660-020-09424-1

Issue Date: March 2022
