iTabNet: an improved neural network for tabular data and its application to predict socioeconomic and environmental attributes

  • Original Article
  • Published in: Neural Computing and Applications

Abstract

There is a growing application of machine learning methods to predict socioeconomic and environmental attributes in computational social science, where big data are usually presented in tabular format. However, it remains a challenge to develop novel deep learning models that deal with tabular data, fill missing values, improve prediction accuracy, and enhance interpretability. In this study, we apply, for the first time, a tabular deep learning methodology (TabNet) to predict socioeconomic and environmental attributes (population counts, numbers of companies, consumption volumes, poker players’ behaviors, forest cover, etc.). Furthermore, we develop a new network architecture, referred to as improved TabNet (iTabNet), that can simultaneously learn local and global features in tabular data to improve prediction accuracy. We also introduce a difference loss to constrain the feature selection process in iTabNet so that the model uses different features at different decision steps, which enhances interpretability. To deal with missing values, we introduce a fusion strategy based on the data mean and an Auto-Encoder network to fill missing values more reasonably and efficiently. Experimental results demonstrate that the proposed iTabNet achieves competitive performance in predicting socioeconomic and environmental attributes from tabular data, and that iTabNet with the proposed fusion strategy significantly outperforms other machine learning models when the tabular data contain missing values.
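
As a rough, hypothetical illustration of two ideas mentioned in the abstract, the PyTorch sketch below shows (i) a "difference loss" that penalizes overlap between the feature-selection masks of different decision steps, and (ii) a mean/Auto-Encoder fusion for filling missing values. Function names, tensor shapes, the pairwise form of the loss, and the fixed blending weight alpha are assumptions made for illustration, not the authors' published implementation.

import torch
import torch.nn as nn


def difference_loss(masks: torch.Tensor) -> torch.Tensor:
    # masks: (n_steps, batch, n_features), e.g. the sparsemax attention masks
    # produced at each decision step. The loss is the average pairwise overlap,
    # which is zero when different steps select disjoint feature subsets.
    n_steps = masks.shape[0]
    loss = masks.new_zeros(())
    pairs = 0
    for i in range(n_steps):
        for j in range(i + 1, n_steps):
            loss = loss + (masks[i] * masks[j]).sum(dim=-1).mean()
            pairs += 1
    return loss / max(pairs, 1)


class MeanAEFusionImputer(nn.Module):
    # Fill missing entries by blending the column mean with an Auto-Encoder
    # reconstruction (hypothetical formulation of the fusion strategy).
    def __init__(self, n_features: int, hidden: int = 32, alpha: float = 0.5):
        super().__init__()
        self.alpha = alpha  # weight between the mean fill and the AE reconstruction
        self.encoder = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, n_features)

    def forward(self, x: torch.Tensor, observed: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features) with arbitrary values at missing positions;
        # observed: boolean mask, True where a value is actually present.
        col_mean = (x * observed).sum(0) / observed.sum(0).clamp(min=1)
        mean_filled = torch.where(observed, x, col_mean.expand_as(x))
        recon = self.decoder(self.encoder(mean_filled))
        fused = self.alpha * mean_filled + (1 - self.alpha) * recon
        # Keep observed values untouched; only missing cells receive the fused fill.
        return torch.where(observed, x, fused)

In training, such a difference loss would presumably be added to the task loss with a small weight, analogous to TabNet's sparsity regularizer; the exact formulation and weighting used by the authors are not specified here.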

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

The authors would like to thank Ms. Gu Jiayu from Nankai University for proofreading the manuscript. The manuscript was presented and discussed at the Computational Social Science Paper Writing Workshop hosted by Tsinghua University. We appreciate the comments from Professor Fu Xiaoming and Professor Zhang Yong, which helped improve the quality of this paper. Many thanks to the editor and anonymous reviewers.

Funding

This paper is funded by the National Natural Science Foundation of China (Grant No. 62276208), the Key Project Foundation of Philosophy and Social Sciences, Ministry of Education, China (Grant No. 22JZD028), and the Fundamental Research Funds for the Central Universities, China (Grant No. 63222031).

Author information

Corresponding author

Correspondence to Yunxia Liu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, J., Tian, T., Liu, Y. et al. iTabNet: an improved neural network for tabular data and its application to predict socioeconomic and environmental attributes. Neural Comput & Applic 35, 11389–11402 (2023). https://doi.org/10.1007/s00521-023-08304-7
