Abstract
Machine learning methods are increasingly applied to predict socioeconomic and environmental attributes in computational social science, where big data are usually presented in tabular format. However, it remains a challenge to develop deep learning models that handle tabular data, fill missing values, improve prediction accuracy, and enhance interpretability. In this study, we apply, for the first time, a tabular deep learning methodology (TabNet) to predict socioeconomic and environmental attributes (population size, number of companies, volume of consumption, poker players’ behaviors, forest cover, etc.). Furthermore, we develop a new network architecture, referred to as improved TabNet (iTabNet), that simultaneously learns local and global features in tabular data to improve prediction accuracy. We also introduce a difference loss that constrains the feature selection process in iTabNet so that the model uses different features at different steps, which enhances interpretability. To deal with missing values, we introduce a fusion strategy based on the data mean and an Auto-Encoder network that efficiently fills in more reasonable values. Experimental results demonstrate that the proposed iTabNet achieves competitive performance in predicting socioeconomic and environmental attributes from tabular data. Moreover, iTabNet with the proposed fusion strategy significantly outperforms other machine learning models when the tabular data have missing values.
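To make the idea of using different features at different steps concrete, the sketch below computes a simple overlap penalty between per-step feature-selection masks. This is an illustration only: the abstract does not give the formula of the difference loss, so the function name `overlap_penalty`, the pairwise dot-product form, and the toy masks are all assumptions rather than the authors' method.

```python
from itertools import combinations
from typing import List

Mask = List[float]  # one feature-selection mask: a weight per feature


def overlap_penalty(step_masks: List[Mask]) -> float:
    """Average dot product over all pairs of step masks.

    Hypothetical stand-in for a difference loss: it is 0 when no two steps
    put weight on the same feature and grows as steps reuse features.
    """
    pairs = list(combinations(step_masks, 2))
    if not pairs:
        return 0.0
    total = sum(sum(a * b for a, b in zip(m1, m2)) for m1, m2 in pairs)
    return total / len(pairs)


# Toy masks for three decision steps over three features (assumed values).
disjoint = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
repeated = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

print(overlap_penalty(disjoint))  # no feature is shared between steps
print(overlap_penalty(repeated))  # steps 1 and 2 select the same feature
```

Added to a training objective with a small weight, a term of this kind would discourage two steps from selecting the same feature. Whether iTabNet uses this form or another one cannot be determined from the abstract.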
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
The authors would like to thank Ms. Gu Jiayu from Nankai University for proofreading the manuscript. The manuscript was presented and discussed at the Computational Social Science Paper Writing Workshop, hosted by Tsinghua University. We appreciate the comments from Professor Fu Xiaoming and Professor Zhang Yong, which helped improve the quality of this paper. Many thanks to the editor and anonymous reviewers.
Funding
This paper is funded by the National Natural Science Foundation of China (Grant No. 62276208), Key Project Foundation of Philosophy and Social Sciences, Ministry of Education, China (Grant No. 22JZD028), and the Fundamental Research Funds for the Central Universities, China (Grant No. 63222031).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Liu, J., Tian, T., Liu, Y. et al. iTabNet: an improved neural network for tabular data and its application to predict socioeconomic and environmental attributes. Neural Comput & Applic 35, 11389–11402 (2023). https://doi.org/10.1007/s00521-023-08304-7
DOI: https://doi.org/10.1007/s00521-023-08304-7